First off, figure out how fast you could possibly go, given the amount of data you have:
sub checkReadTime   # call this with just your list of files
{
    my $linecount = 0;
    my $starttime = time;
    for my $file ( @_ ) {
        my $fh = &openLogFile($file) or next;
        while ( <$fh> ) {
            $linecount++;
        }
        close $fh;
    }
    my $endtime = time;
    warn sprintf( "read %d lines from %d files in %d sec\n",
                  $linecount, scalar @_, $endtime - $starttime );
}
The difference between the duration reported there and the duration of your current app is the upper bound on how much better you might be able to do.
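If whole seconds are too coarse for your data set, here is a minimal sketch of the same measurement with sub-second timing, plus the subtraction described above. The names time_read_pass and max_savings are made up for illustration, and it uses a plain open() where your app would call its own opener:

```perl
use strict;
use warnings;
use Time::HiRes qw(time);   # core module; time() now returns fractional seconds

# Read every line of every file and do nothing else.
# Returns (line count, elapsed seconds).
# ASSUMPTION: plain-text files readable with open(); swap in your own opener if not.
sub time_read_pass {
    my @files = @_;
    my $lines = 0;
    my $start = time;
    for my $file (@files) {
        my $fh;
        unless ( open( $fh, '<', $file ) ) {
            warn "skipping $file: $!\n";
            next;
        }
        $lines++ while <$fh>;
        close $fh;
    }
    return ( $lines, time - $start );
}

# Upper bound on the time you could save: current run time minus bare read time,
# clamped at zero in case the read pass happened to be the slower of the two.
sub max_savings {
    my ( $app_seconds, $read_seconds ) = @_;
    my $diff = $app_seconds - $read_seconds;
    return $diff > 0 ? $diff : 0;
}
```

So if your app currently takes 120 sec and the bare read pass takes 45 sec, max_savings(120, 45) says no amount of tuning will buy you more than 75 sec.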

Apart from that, if you have a serious problem with how long it's taking, maybe you should be doing more with the standard compiled unix tools that do things like sorting. For instance, you could put your filtering step into a separate script, pipe its output to "sort", and pipe the output from sort into whatever script is doing the rest of the work on the sorted lines.

If your app isn't geared to a pipeline operation like this:

filterLogFiles file.list | sort | your_main_app
then just put the first part of that into an open statement inside your main app:
open( my $logsort, "-|", "filterLogFiles @files | sort" ) or die $!;
while ( <$logsort> ) {
    ...
}
Of course, you can include option args for your filterLogFiles process so that it can skip the lines that you don't need depending on dates or whatever, and have it output the data in a form that will be easiest to sort (and easy for your main app to digest).
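As a rough idea of what such a filter script might look like, here is a sketch that keeps only lines inside a date range. Everything in it is an assumption on my part: the names keep_line and filter_files, the argument order (from-date, to-date, then the files), and above all the log format, which I'm guessing starts each line with a timestamp like "2009-03-14 15:09:26". Adjust the regex to whatever your logs really look like.

```perl
use strict;
use warnings;

# ASSUMPTION: each log line begins with a YYYY-MM-DD date.
# Returns the line if its date is within [$from, $to], otherwise undef.
sub keep_line {
    my ( $line, $from, $to ) = @_;
    my ($date) = $line =~ /^(\d{4}-\d{2}-\d{2})\b/ or return;
    return if $date lt $from or $date gt $to;   # string compare works for this format
    return $line;
}

# Copy the wanted lines from each file to the output handle.
sub filter_files {
    my ( $out, $from, $to, @files ) = @_;
    for my $file (@files) {
        my $fh;
        unless ( open( $fh, '<', $file ) ) {
            warn "skipping $file: $!\n";
            next;
        }
        while ( my $line = <$fh> ) {
            my $kept = keep_line( $line, $from, $to );
            print {$out} $kept if defined $kept;
        }
        close $fh;
    }
}

# When run as a script: filterLogFiles FROM TO file1 file2 ...
unless (caller) {
    my ( $from, $to, @files ) = @ARGV;
    filter_files( \*STDOUT, $from, $to, @files ) if @files;
}
```

With timestamps in that year-month-day form at the start of each line, the default behavior of "sort" already puts the output in chronological order, so no sort options are needed.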

In reply to Re: How to improve speed of reading big files by graff
in thread How to improve speed of reading big files by korlaz
