Re: Break up weblogs

Re^2: Break up weblogs by pzbagel (Chaplain) on Aug 09, 2004 at 17:21 UTC
Not only are you proposing the same basic solution the OP had (which means processing the log 40+ times), but you are also proposing they use the system's grep command, which you are almost assured will run slower than Perl's regex engine. If you doubt me, try processing 10 gigs of text file logs using this method...
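For what it's worth, here is a minimal sketch of the single-pass alternative, written in Python only to keep it short; the same structure carries over to Perl. Everything in it is an assumption made for illustration: the function name `split_log`, the idea that each line is routed by a literal key such as a host name, and the in-memory buckets (a real run over gigabytes would write to one open filehandle per key instead).

```python
import re
from collections import defaultdict


def split_log(lines, keys):
    """Route each line to the bucket of the leftmost key it contains, in one pass."""
    # One alternation over all keys, so every line is scanned once
    # rather than once per key. Longer keys go first so that a key
    # which is a prefix of another does not shadow it.
    ordered = sorted(keys, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in ordered))

    buckets = defaultdict(list)
    for line in lines:
        m = pattern.search(line)
        if m:
            # m.group(0) is the key that matched; use it as the bucket name.
            buckets[m.group(0)].append(line)
    return dict(buckets)


if __name__ == "__main__":
    # Hypothetical log lines, invented for this example.
    sample = [
        "a.example.com GET /index.html",
        "b.example.com GET /about.html",
        "a.example.com GET /img.png",
        "other.org GET /",
    ]
    for key, matched in split_log(sample, ["a.example.com", "b.example.com"]).items():
        print(key, len(matched))
```

The point is only the shape of the loop: the input is read once, however many keys there are, instead of once per key.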
Re^3: Break up weblogs by chromatic (Archbishop) on Aug 11, 2004 at 20:12 UTC
"the system's grep command, which you are almost assured will run slower than Perl's regex engine."
I'm going to have to put on my Pants of Dubious Benchmarking here and suggest that grep can beat Perl regexes, even in cases like this where Perl might use Boyer-Moore.
Re^4: Break up weblogs by Aristotle (Chancellor) on Aug 11, 2004 at 20:26 UTC
And I'm just going to add that `grep` uses a DFA engine, while Perl's regex engine is a backtracking NFA. DFAs don't backtrack. It's likely that `grep` will win most matches (pun intended) with nontrivial patterns. (The drawback, of course, is that a DFA does not natively support many of the advanced matching features NFAs can offer, such as backreferences.)
Makeshifts last the longest.
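Both halves of that trade-off can be seen in a small sketch. It uses Python's `re` module, which is a backtracking engine like Perl's, purely as a stand-in; the function names are made up for the example, and the timings it prints depend on the machine, so none are quoted here.

```python
import re
import time

# A backreference: \1 must repeat whatever group 1 captured.
# The engine has to remember the captured text, which a pure DFA cannot do.
REPEATED_WORD = re.compile(r"\b(\w+) \1\b")


def find_repeated_word(text):
    """Return the first doubled word in text, or None if there is none."""
    m = REPEATED_WORD.search(text)
    return m.group(1) if m else None


def time_failing_match(n):
    """Time a match that fails only after heavy backtracking.

    The nested quantifier in (a+)+$ lets a backtracking engine split the
    run of n 'a's in exponentially many ways before it gives up on the
    trailing 'b'. Returns (matched, elapsed_seconds).
    """
    subject = "a" * n + "b"
    start = time.perf_counter()
    matched = re.match(r"(a+)+$", subject) is not None
    return matched, time.perf_counter() - start


if __name__ == "__main__":
    print(find_repeated_word("the the quick brown fox"))
    # Keep n small: the running time grows roughly exponentially with n.
    for n in (12, 16, 20):
        print(n, time_failing_match(n))
```

A DFA-based matcher handles a pattern like the second one in time linear in the input, which is the advantage described above, but it has no direct way to express the first.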