in reply to Break up weblogs

Re^2: Break up weblogs
by pzbagel (Chaplain) on Aug 09, 2004 at 17:21 UTC

    Not only are you proposing the same basic solution the OP had (which means processing the log 40+ times), but you are also proposing that they use the system's grep command, which you are almost assured will run slower than Perl's regex engine.

    If you doubt me, try processing 10 gigs of text file logs using this method...
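
    For comparison, here is a minimal sketch of the single-pass alternative: read each line once and route it to a bucket, rather than re-scanning the whole log once per pattern. It is written in Python purely for illustration; the same structure carries over to a Perl while loop. The function name split_log, the hostnames, and the log layout are all hypothetical, since the thread does not show the OP's actual log format.

```python
import re
from collections import defaultdict


def split_log(lines, patterns):
    """Route each line to the first bucket whose pattern matches, in one pass."""
    # Compile every pattern once, up front, instead of once per line.
    compiled = [(name, re.compile(rx)) for name, rx in patterns.items()]
    buckets = defaultdict(list)
    for line in lines:
        for name, rx in compiled:
            if rx.search(line):
                buckets[name].append(line)
                break  # first match wins; remove to allow several buckets per line
    return dict(buckets)


# Hypothetical sample data: these hostnames and this layout are made up.
sample = [
    'a.example.com 10.0.0.1 "GET / HTTP/1.0" 200',
    'b.example.com 10.0.0.2 "GET /x HTTP/1.0" 404',
    'a.example.com 10.0.0.3 "GET /y HTTP/1.0" 200',
]
patterns = {
    "site_a": r"^a\.example\.com ",
    "site_b": r"^b\.example\.com ",
}
result = split_log(sample, patterns)
```

    Each line is still tested against up to 40+ patterns, but the file itself is read only once, which is what matters when the input is many gigabytes. For real logs you would write to per-bucket output files instead of collecting lists in memory.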

      the system's grep command, which you are almost assured will run slower than Perl's regex engine.

      I'm going to have to put on my Pants of Dubious Benchmarking here and suggest that grep can beat Perl regexes, even in cases like this where Perl might use Boyer-Moore.

        And I'm just going to add that grep uses a DFA engine, while Perl's regex engine is a backtracking NFA. DFAs don't backtrack. It's likely that grep will win most matches (pun intended) with nontrivial patterns. (The drawback, of course, is that a DFA does not natively support many of the advanced matching features NFAs can offer, such as backreferences.)
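
        To make the backreference point concrete, here is a small sketch of a pattern that needs one. It uses Python's re module only as a stand-in, on the assumption that it behaves like Perl here (it is also a backtracking matcher); the helper name find_doubled_words is hypothetical. The \1 must match whatever text the first group captured, which is not something a fixed finite automaton can express.

```python
import re

# (\w+) captures a word; \1 then requires that exact same text again.
# The \b anchors stop the match from starting or ending mid-word.
doubled = re.compile(r"\b(\w+) \1\b")


def find_doubled_words(text):
    """Return each word that appears twice in a row, separated by one space."""
    return [m.group(1) for m in doubled.finditer(text)]


hits = find_doubled_words("this is is a test of the the matcher")
```

        Here hits comes out as ['is', 'the']. Note that the "is" inside "this" does not trigger a match, because the leading \b rejects a start in the middle of a word.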

        Makeshifts last the longest.