Recent enhancements to the pack/unpack template format, plus the fact that the template has to be interpreted on every call whereas a regex is, at some level, 'compiled' the first time it is used, mean that pack/unpack are often slower these days.
If you use split, you are using the regex engine anyway, and if you would otherwise need multiple calls to substr, the regex engine will nearly always win.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco.
Rule 1 has a caveat! -- Who broke the cabal?
THEY SLOWED DOWN UNPACK? *boggle* Aherm. At any rate, as always the answer is benchmark benchmark benchmark.
Using the script below to run tests over a ~40M test file, the substr version is fastest at ~4 seconds on a dual 1.42 GHz G4. The others, in script order (regex, unpack, and the Ruby version), take ~7.25s, ~9.2s, and ~12.7s respectively. All are wall clock times; perl v5.8.1-RC3, ruby 1.8.2.
#!/bin/zsh
echo -n "Making test data . . ."
perl -le '$t = time - 5 * 86400 ; for( 1..1_000_000 ) { print scalar localtime $t, " random " x (int(rand(3))+1); $t += int( rand( 120 ) + 120 ) }' > testlog
echo " done"
for i in 1 2 3 4 ; do time perl -lne 'print "<b>", substr($_,0,24), "</b> ", substr($_,25)' testlog > /dev/null ; done
for i in 1 2 3 4 ; do time perl -lne '/^(.{24}) (.*)$/; print "<b>", $1, "</b> ", $2' testlog > /dev/null ; done
for i in 1 2 3 4 ; do time perl -lne '($d,$r)=unpack("A24A*", $_);print "<b>", $d, "</b>", $r' testlog > /dev/null ; done
for i in 1 2 3 4 ; do time ruby -lne 'print "<b>", $_[0,24], "</b> ", $_[25,$_.length]' testlog > /dev/null ; done
rm testlog
exit 0
I like the Benchmark module for such comparisons. Also, wallclock times can be a misleading metric. There's nothing preventing your thread of execution from being underprioritized for a really long time, thus inflating the wallclock time and leading you to draw false conclusions. More commonly, you'll get inconsistent wallclock times between runs, but consistent CPU times. I'm all about consistent results, and thus will use CPU time every time.
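For what it's worth, here is a minimal sketch of how the same three Perl variants might be compared with Benchmark, which reports rates based on CPU time rather than wall clock. The sample line and the sub names (by_substr, by_regex, by_unpack) are made up for illustration. The unpack template is also changed from the "A24A*" used in the script above to 'a24 x a*' so that all three subs return exactly the same string; that is my assumption, not the original test.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample line: a 24-character timestamp, one space, then the rest.
my $line = 'Thu Sep  9 12:00:00 2004 random  random ';

sub by_substr {
    my ($s) = @_;
    return '<b>' . substr($s, 0, 24) . '</b> ' . substr($s, 25);
}

sub by_regex {
    my ($s) = @_;
    my ($d, $r) = $s =~ /^(.{24}) (.*)$/;
    return "<b>$d</b> $r";
}

sub by_unpack {
    my ($s) = @_;
    # "a24" takes the timestamp, "x" skips the separating space,
    # "a*" takes the remainder without stripping trailing spaces.
    my ($d, $r) = unpack('a24 x a*', $s);
    return "<b>$d</b> $r";
}

# A negative count means "run each sub for at least that many CPU seconds".
cmpthese(-1, {
    'substr' => sub { by_substr($line) },
    'regex'  => sub { by_regex($line) },
    'unpack' => sub { by_unpack($line) },
});
```

This times only the string handling on a single in-memory line, so it leaves out the file I/O and process startup that the shell loop includes. The numbers will differ between perl versions and machines, so run it on your own setup before drawing conclusions.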
thor
Feel the white light, the light within
Be your own disciple, fan the sparks of will
For all of us waiting, your kingdom will come