in reply to Searching large files before browser timeout

We have an application where we needed to parse 100+ meg log files in real time for various strings, and I had the same problem. I found I could get a significant speed boost by offloading the string matching to UNIX grep and piping its output into Perl. It cut the time from several minutes to under 30 seconds.
my $cmd = qq|grep '$string' access.log|;
open(LOG, "$cmd |") or die "Can't run grep: $!";
while (<LOG>) {
    # process each matching line here
}
close(LOG);
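One caveat: interpolating $string straight into a shell command is fragile if the string can contain quotes or shell metacharacters. A minimal sketch of a safer variant, assuming Perl 5.8+ (the list-form pipe open bypasses the shell entirely, and -F tells grep to treat the pattern as a fixed string rather than a regexp):

open(my $log, "-|", "grep", "-F", $string, "access.log")
    or die "Can't run grep: $!";
while (my $line = <$log>) {
    # process each matching line here
    print $line;
}
close($log);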

Re: Re: Searching large files before browser timeout
by aijin (Monk) on Jun 14, 2001 at 00:49 UTC
    This works great, thank you! I just benchmarked searching through a smallish file, using Perl pattern matching and grep.

    Benchmark: timing 10000000 iterations of Grep, Perl...
    Grep: 19 wallclock secs (16.38 usr + 0.03 sys = 16.41 CPU)
    Perl: 101 wallclock secs (80.91 usr + 8.11 sys = 89.02 CPU)

    What a difference!
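
    For anyone curious, a comparison along these lines can be set up with the Benchmark module. This is only a sketch, not the original benchmark code: the file name and search string are hypothetical, and a far smaller iteration count is used here, since each Grep iteration spawns a process:

        use Benchmark qw(timethese);

        my $string = '127.0.0.1';     # hypothetical fixed search string
        my $file   = 'access.log';    # hypothetical log file

        timethese(1000, {
            Grep => sub {
                # shell out to UNIX grep and collect matching lines
                open(my $fh, "grep -F '$string' $file |") or die $!;
                my @hits = <$fh>;
                close($fh);
            },
            Perl => sub {
                # pure-Perl match; \Q...\E treats $string as a literal
                open(my $fh, '<', $file) or die $!;
                my @hits = grep { /\Q$string\E/ } <$fh>;
                close($fh);
            },
        });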

Re: Re: Searching large files before browser timeout
by sierrathedog04 (Hermit) on Jun 13, 2001 at 20:09 UTC
    Having UNIX grep rather than Perl grep do the searching would usually slow your program down. Perl's grep is usually faster than UNIX grep, but slower than UNIX egrep. Of course, YMMV.
      Hmm, well, in my case it was blindingly faster. Keep in mind I wasn't searching for anything more complicated than a fixed string (e.g. '127.0.0.1'), not a regexp or anything of that nature.
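      For fixed strings like that there are cheap wins on both sides of the pipe; a sketch, assuming a grep that supports -F (the equivalent of fgrep, which skips the regexp engine) and plain Perl index() for a regexp-free substring test:

          # UNIX side: -F treats the pattern as a literal string
          my $cmd = qq|grep -F '127.0.0.1' access.log|;

          # Perl side: index() is a plain substring search
          open(my $fh, '<', 'access.log') or die "Can't open log: $!";
          while (my $line = <$fh>) {
              print $line if index($line, '127.0.0.1') >= 0;
          }
          close($fh);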