Re: Fast file searching

Your regex is way too loose and time-consuming.

First, why are you using the /g option?

Second, if the timestamp is always at the start of the line and in a consistent format, a regex that encodes that information, and doesn't force the regex engine to check things that are unnecessary, will probably speed things up. Something like:

m/^... $MONTH $DAY $HOUR:$MINUTE:..:.. 20../

May help.
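
A minimal sketch of the anchored-pattern idea, assuming lines begin with a `localtime`-style stamp such as "Tue Apr 12 13:43:07 2005" (the poster's actual log layout isn't shown). The values given to `$MONTH`, `$DAY`, `$HOUR`, and `$MINUTE` and the sample lines are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical search values; the original post does not show how these are set.
my ($MONTH, $DAY, $HOUR, $MINUTE) = ('Apr', '12', '13', '43');

# Compile the pattern once, outside the loop. The leading ^ lets the
# engine reject a non-matching line without scanning the rest of it.
my $stamp = qr/^... \Q$MONTH\E \Q$DAY\E \Q$HOUR\E:\Q$MINUTE\E:.. 20../;

my @lines = (
    'Tue Apr 12 13:43:07 2005 random random',
    'Tue Apr 12 13:44:10 2005 random',
    'noise Tue Apr 12 13:43:07 2005',
);

# No /g here: one anchored match per line is all that is needed.
my @hits = grep { $_ =~ $stamp } @lines;
print "$_\n" for @hits;
```

Only the first sample line is selected: the second has a different minute, and the third carries the stamp somewhere other than the start of the line.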

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

Lingua non convalesco, consenesco et abolesco.

Rule 1 has a caveat! -- Who broke the cabal?

Re^2: Fast file searching by Fletch (Bishop) on Apr 12, 2005 at 13:43 UTC
Or if your timestamps are "fixed" enough you may be able to use some combination of `unpack`, `substr`, and `split` instead of a regex.
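
A rough sketch of what that could look like, assuming a 24-character `localtime`-style stamp followed by a single space and then the message. The sample line and variable names are hypothetical.

```perl
use strict;
use warnings;

# Assumed layout: 24-character stamp, one space, then the message.
my $line = 'Tue Apr 12 13:43:07 2005 random random';

# substr: two fixed offsets, no pattern matching involved.
my $stamp = substr($line, 0, 24);
my $rest  = substr($line, 25);

# unpack: "A24" takes 24 characters, "x" skips the separator byte,
# and "A*" takes whatever is left.
my ($stamp2, $rest2) = unpack('A24 x A*', $line);

# split ' ' breaks the stamp into whitespace-separated fields.
my ($wday, $mon, $mday, $time, $year) = split ' ', $stamp;

print "$stamp | $rest\n";
print "$mon $mday $time\n";
```

Both the `substr` pair and the `unpack` template yield the same two pieces here; which one is quicker on a given perl is something to measure rather than assume.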
Re^3: Fast file searching by BrowserUk (Patriarch) on Apr 12, 2005 at 13:55 UTC
Recent enhancements to the pack/unpack format, together with the fact that the format has to be interpreted on every call whereas a regex is, at some level, 'compiled' the first time it is used, mean that pack/unpack are often slower these days. If you use split, you are using the regex engine anyway, and if you need multiple calls to substr, the regex engine will nearly always win.
Re^4: Fast file searching by Fletch (Bishop) on Apr 12, 2005 at 15:26 UTC
THEY SLOWED DOWN UNPACK? *boggle* Aherm. At any rate, as always the answer is benchmark, benchmark, benchmark. Using the script below to run tests over a ~40M test file, the `substr` version is fastest at ~4 seconds on a dual 1.42G G4 (the others, in script order, take ~7.25s, ~9.2s, and ~12.7s respectively; all wall clock times, perl v5.8.1-RC3, ruby 1.8.2).

#!/bin/zsh

echo -n "Making test data . . ."
perl -le '$t = time - 5 * 86400 ; for( 1..1_000_000 ) { print scalar localtime $t, " random " x (int(rand(3))+1); $t += int( rand( 120 ) + 120 ) }' > testlog
echo " done"

for i in 1 2 3 4 ; do time perl -lne 'print "<b>", substr($_,0,24), "</b> ", substr($_,25)' testlog > /dev/null ; done
for i in 1 2 3 4 ; do time perl -lne '/^(.{24}) (.*)$/; print "<b>", $1, "</b> ", $2' testlog > /dev/null ; done
for i in 1 2 3 4 ; do time perl -lne '($d,$r)=unpack("A24A*", $_);print "<b>", $d, "</b>", $r' testlog > /dev/null ; done
for i in 1 2 3 4 ; do time ruby -lne 'print "<b>", $_[0,24], "</b> ", $_[25,$_.length]' testlog > /dev/null ; done

rm testlog

exit 0
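
For an in-process comparison that leaves out file I/O and interpreter start-up, the core `Benchmark` module is another option. This is only a sketch under the same layout assumption as the test log (24-character stamp, one space, then the message); the sample line and the labels in `%subs` are made up for illustration, and the numbers it prints will differ between perl versions and machines.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample line in the same layout as the generated test log.
my $line = 'Tue Apr 12 13:43:07 2005 random random';

# Three ways of splitting the line into (stamp, message).
my %subs = (
    substr => sub { my @f = (substr($line, 0, 24), substr($line, 25)); return @f },
    regex  => sub { my @f = $line =~ /^(.{24}) (.*)$/; return @f },
    unpack => sub { my @f = unpack('A24 x A*', $line); return @f },
);

# A negative count means "run each candidate for at least that many
# CPU seconds"; cmpthese then prints a rate comparison table.
cmpthese(-1, \%subs);
```

Timing a single in-memory string isolates the extraction step itself, so the ranking may not match the wall clock figures from the shell loop, which also include reading the file and printing.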
Re^5: Fast file searching by thor (Priest) on Apr 13, 2005 at 00:05 UTC