Basic history is that we take CSV flatfiles into Excel, run a pivot table on them, and use the results to make neat graphs. As these are firewall logs, you can imagine that the data is varied and numerous. Frequently we wind up with over 65k records for a month however, which is Excel's limit - so we have to export many CSV's and manually combine pivot table results. This is very labor intensive, and my thinking is that PERL is the right weapon to tackle this.

The example file in question is approximately 1.7 million records in length (about 140mb), with each row having about 7-8 fields. I'm torn between trying to handle this all in PERL's structures somehow, or importing it into a database and doing something with it there. I'm also unsure of the best way to go about actually counting the values (this is where perhaps a DB search function might be useful, and fast?). Some examples of what we'd look for include, "Top 10 source IP's" or "Number of blocked sessions by date". Each record would include a date/time, IP info typical to a firewall log, attack severity, etc.

Any input here would be most appreciated. The data structure isn't complex, but processing it efficiently and quickly may take some creative thinking. I also have to consider that a database may get rather large, since this is just one file of about 30 I would need to process each month, and hopefully keep around for a little while for historical reasons. Thanks!


In reply to Creative sorting and totalling of large flatfiles (aka pivot tables) by dbg

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.