Re: Re: Reading entire file into scalar: speed differences?

Since I'm assuming all the other posters were running under a Unix of some sort, and thought it might well make a difference, here are some numbers for Win32 (Windows XP, Perl 5.6.1, ActiveState build 630):

D:\Documents and Settings\James\Desktop>perl timereads.pl
Benchmark: running join, slurp, sys, while, each for at least 5 CPU seconds...
      join:  6 wallclock secs ( 3.92 usr +  1.33 sys =  5.25 CPU) @ 2823.36/s (n=14817)
     slurp:  5 wallclock secs ( 2.39 usr +  2.88 sys =  5.28 CPU) @ 5813.53/s (n=30678)
       sys:  5 wallclock secs ( 0.01 usr +  5.06 sys =  5.07 CPU) @ 11.25/s (n=57)
     while:  5 wallclock secs ( 4.03 usr +  1.28 sys =  5.31 CPU) @ 2834.21/s (n=15044)
        Rate    sys   join  while  slurp
sys   11.2/s     --  -100%  -100%  -100%
join  2823/s 24998%     --    -0%   -51%
while 2834/s 25095%     0%     --   -51%
slurp 5814/s 51579%   106%   105%     --

(I used D:\WINXP\system32\oembios.bin (12.5 MB, with a significantly lower proportion of newlines) and "at least 5 CPU seconds" of time, so these numbers have slightly different bases than belg4mit's.)
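
For anyone reading without the earlier posts to hand: timereads.pl itself isn't shown in this post, so the following is only a sketch of what the four benchmarked approaches presumably look like. The sub names (read_slurp, read_join, read_while, read_sys) are made up for illustration, and the binmode calls are an addition that matters for binary files on Windows.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY);

# Hypothetical reconstructions of the four approaches; the real
# timereads.pl may differ in detail.

# "slurp": undefine the input record separator, then read once.
sub read_slurp {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "open $path: $!";
    binmode $fh;
    local $/;                      # undef for this scope only
    my $data = <$fh>;
    close $fh;
    return $data;
}

# "join": read all lines in list context and join them.
sub read_join {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "open $path: $!";
    binmode $fh;
    my $data = join '', <$fh>;
    close $fh;
    return $data;
}

# "while": append one line at a time.
sub read_while {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "open $path: $!";
    binmode $fh;
    my $data = '';
    while (defined(my $line = <$fh>)) { $data .= $line }
    close $fh;
    return $data;
}

# "sys": a single sysread, sized by the -s file test.
sub read_sys {
    my ($path) = @_;
    sysopen(my $fh, $path, O_RDONLY) or die "sysopen $path: $!";
    binmode $fh;
    my $size = -s $fh;
    my $data = '';
    sysread($fh, $data, $size) if $size;
    close $fh;
    return $data;
}
```

Only read_sys needs to know the file size up front; the other three simply read until EOF.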

Note how terribly sys does in my comparisons. These are, in essence, completely different results, which confuses me a lot.

This is rapidly getting offtopic, but does anybody have a clue why?

TACCTGTTTGAGTGTAACAATCATTCGCTCGGTGTATCCATCTTTG ACACAATGAATCTTTGACTCGAACAATCGTTCGGTCGCTCCGACGC


Re: Re: Re: Reading entire file into scalar: speed differences? by kjherron (Pilgrim) on Jan 25, 2002 at 11:50 UTC
The "sys" method is the only one that has to actually check the size of the file. The other methods don't have this overhead; they just read until they get EOF. Perhaps checking the file size is relatively slow on Windows?

Another problem with the "sys" method is that it only works on plain files; you can't use it to slurp from a device or pipe, because it won't be able to get an accurate file size.

Here's an interesting variation. It uses sysread, but avoids having to fetch the file's size by doing several fixed-size (but large) sysreads in a loop:

use Fcntl;
use Benchmark qw(cmpthese);

cmpthese(1000, {
    # Buffered read with the record separator undefined
    slurp => sub {
        open(SITE, "/usr/share/dict/words");
        my $xml;
        local($/);
        undef $/;
        $xml = <SITE>;
        close(SITE);
    },
    # One sysread, sized by -s
    sys => sub {
        sysopen(SITE, "/usr/share/dict/words", O_RDONLY);
        sysread SITE, my $xml, -s SITE;
        close(SITE);
    },
    # Looping sysread in 128k, 256k and 512k chunks; no file size needed
    sysby128 => sub {
        sysopen(SITE, "/usr/share/dict/words", O_RDONLY);
        my $xml = '';
        while (sysread(SITE, $xml, 1024 * 128, length($xml))) { };
        close(SITE);
    },
    sysby256 => sub {
        sysopen(SITE, "/usr/share/dict/words", O_RDONLY);
        my $xml = '';
        while (sysread(SITE, $xml, 1024 * 256, length($xml))) { };
        close(SITE);
    },
    sysby512 => sub {
        sysopen(SITE, "/usr/share/dict/words", O_RDONLY);
        my $xml = '';
        while (sysread(SITE, $xml, 1024 * 512, length($xml))) { };
        close(SITE);
    },
});

I've set up three versions, reading different amounts of data per sysread. My "words" file is around 409k, so the sysby512 trial will actually read the whole file at once (though it will call sysread a second time to discover it's at EOF). Here's the benchmark on an unloaded system:

> uname -a
Linux linux.local 2.4.16 #4 Mon Dec 10 08:26:03 PST 2001 i586 unknown
> perl index.pl
Benchmark: timing 1000 iterations of slurp, sys, sysby128, sysby256, sysby512...
     slurp:  9 wallclock secs ( 3.56 usr +  4.57 sys =  8.13 CPU) @ 123.00/s (n=1000)
       sys:  7 wallclock secs ( 0.17 usr +  5.66 sys =  5.83 CPU) @ 171.53/s (n=1000)
  sysby128:  7 wallclock secs ( 0.27 usr +  5.87 sys =  6.14 CPU) @ 162.87/s (n=1000)
  sysby256:  7 wallclock secs ( 0.18 usr +  5.69 sys =  5.87 CPU) @ 170.36/s (n=1000)
  sysby512:  7 wallclock secs ( 0.16 usr +  5.51 sys =  5.67 CPU) @ 176.37/s (n=1000)
          Rate    slurp sysby128 sysby256      sys sysby512
slurp    123/s       --     -24%     -28%     -28%     -30%
sysby128 163/s      32%       --      -4%      -5%      -8%
sysby256 170/s      39%       5%       --      -1%      -3%
sys      172/s      39%       5%       1%       --      -3%
sysby512 176/s      43%       8%       4%       3%       --

And here's another run, with XMMS (a GUI-based MP3 player) running to load the system a bit:

> perl index.pl
Benchmark: timing 1000 iterations of slurp, sys, sysby128, sysby256, sysby512...
     slurp: 12 wallclock secs ( 4.29 usr +  5.43 sys =  9.72 CPU) @ 102.88/s (n=1000)
       sys:  8 wallclock secs ( 0.10 usr +  6.88 sys =  6.98 CPU) @ 143.27/s (n=1000)
  sysby128:  9 wallclock secs ( 0.21 usr +  6.98 sys =  7.19 CPU) @ 139.08/s (n=1000)
  sysby256:  8 wallclock secs ( 0.25 usr +  6.74 sys =  6.99 CPU) @ 143.06/s (n=1000)
  sysby512:  9 wallclock secs ( 0.15 usr +  6.70 sys =  6.85 CPU) @ 145.99/s (n=1000)
          Rate    slurp sysby128 sysby256      sys sysby512
slurp    103/s       --     -26%     -28%     -28%     -30%
sysby128 139/s      35%       --      -3%      -3%      -5%
sysby256 143/s      39%       3%       --      -0%      -2%
sys      143/s      39%       3%       0%       --      -2%
sysby512 146/s      42%       5%       2%       2%       --

As you can see, all of the looping sysread methods perform quite respectably compared to the single-sysread method. The sysby512 method actually does better, probably because it avoids having to fetch the file size. If getting the file size is slow on Windows, the performance improvement should be even greater.
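
The point about devices and pipes can be sketched as a small reusable routine. This is a minimal illustration rather than code from the post: slurp_chunked, its 128k default and the child-process command are all assumptions made for the example. It also checks sysread's return value, which the benchmark loops skip for brevity.

```perl
use strict;
use warnings;

# Read everything from an already-open filehandle with fixed-size
# sysread calls. It never asks for the file size, so it also works
# on pipes and devices. (Hypothetical helper, not from the thread.)
sub slurp_chunked {
    my ($fh, $chunk) = @_;
    $chunk ||= 128 * 1024;          # assumed default chunk size
    my $data = '';
    while (1) {
        # Append at the current end of $data.
        my $got = sysread($fh, $data, $chunk, length $data);
        die "sysread failed: $!" unless defined $got;   # undef means error
        last if $got == 0;                              # 0 means EOF
    }
    return $data;
}

# Usage: slurp a child process's output through a pipe, where -s
# could not report a meaningful size.
open(my $pipe, '-|', $^X, '-e', 'print "x" x 300_000')
    or die "cannot start child: $!";
binmode $pipe;
my $out = slurp_chunked($pipe);
close $pipe;
print length($out), " bytes read from pipe\n";
```

The list form of a piped open used in the usage part is not available on older Windows perls, so treat that part as Unix-flavoured.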
Re: Re: Re: Re: Reading entire file into scalar: speed differences? by theorbtwo (Prior) on Jan 26, 2002 at 07:59 UTC
You seem to be right: getting the file size is an expensive operation on Windows. (BTW, this is with a different file, shimgvw.dll, which is about the same size as your words file. Also, there's a little bit of cheating here: I found the best block size for sysreadby by testing with every power of two from 1 to 512k. Also, the numbers with open/read vs. sysopen/sysread were about the same, so only the sysfoo numbers are shown here.) My final numbers:

Benchmark: running join, slurp, sysread_wholefile, sysreadby32k, while, each for at least 5 CPU seconds...
              join:  6 wallclock secs ( 2.98 usr +  2.27 sys =  5.26 CPU) @ 5320.53/s (n=27970)
             slurp:  5 wallclock secs ( 2.55 usr +  2.76 sys =  5.32 CPU) @ 6339.04/s (n=33711)
 sysread_wholefile:  6 wallclock secs ( 0.39 usr +  4.89 sys =  5.28 CPU) @ 360.93/s (n=1905)
      sysreadby32k:  5 wallclock secs ( 2.35 usr +  2.97 sys =  5.33 CPU) @ 6879.48/s (n=36647)
             while:  6 wallclock secs ( 3.07 usr +  2.19 sys =  5.26 CPU) @ 5319.51/s (n=27970)
                    Rate sysread_wholefile while  join slurp sysreadby32k
sysread_wholefile  361/s                --  -93%  -93%  -94%         -95%
while             5320/s             1374%    --   -0%  -16%         -23%
join              5321/s             1374%    0%    --  -16%         -23%
slurp             6339/s             1656%   19%   19%    --          -8%
sysreadby32k      6879/s             1806%   29%   29%    9%           --

TACCTGTTTGAGTGTAACAATCATTCGCTCGGTGTATCCATCTTTG ACACAATGAATCTTTGACTCGAACAATCGTTCGGTCGCTCCGACGC
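
The block-size search mentioned above (every power of two from 1 to 512k) could be set up roughly as follows. This is a sketch under assumptions, not the script that produced the numbers: make_trials, its parameters and the trial names are invented for illustration.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY);
use Benchmark qw(cmpthese);

# Build one looping-sysread trial per power-of-two block size from
# 2**$min_pow to 2**$max_pow bytes. (Hypothetical helper.)
sub make_trials {
    my ($path, $min_pow, $max_pow) = @_;
    my %trials;
    for my $pow ($min_pow .. $max_pow) {
        my $block = 2 ** $pow;      # fresh lexical, so each closure keeps its own size
        $trials{"sysreadby$block"} = sub {
            sysopen(my $fh, $path, O_RDONLY) or die "sysopen $path: $!";
            binmode $fh;
            my $data = '';
            while (sysread($fh, $data, $block, length $data)) { }
            close $fh;
            return length $data;
        };
    }
    return \%trials;
}

# Usage: perl blocksizes.pl somefile
# 2**0 .. 2**19 covers 1 byte to 512k; -5 means "at least 5 CPU seconds" per trial.
if (@ARGV) {
    cmpthese(-5, make_trials($ARGV[0], 0, 19));
}
```

Picking the winner from such a sweep and then reporting only that block size is the "cheating" referred to above: the best size on one file and machine may not carry over to another.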