Re: Your second thought.
It's a fair point, but the stats show comparative differences, which means that only the first pass over the file is penalised: after that pass the file is in the cache for every subsequent one.
In the case of the figures shown, the case affected was File::ReadBackwards (by virtue of Benchmark running the test cases in alphabetically sorted order by name). As File::ReadBackwards managed to process the file at least 700+ times in the allotted 3 seconds of CPU, regardless of the file size, the effect of the penalty for pulling the file into the cache on the first pass is minimal. However, to preclude the possibility of any effect, I added the following line at the top of the for loop
( undef ) = do{ local $/; open my $fh, '< :raw', $file or die $!; <$fh> };
so as to preload the cache. The results of the re-run were nearly identical, certainly within the bounds of normal variance.
P:\test>354830
Comparing data/500k.dat
                        Rate Tie::File readfwd File::ReadBackwards  rawio
Tie::File             5.15/s        --    -95%                -99%  -100%
readfwd               94.4/s     1734%      --                -88%   -99%
File::ReadBackwards    783/s    15101%    729%                  --   -95%
rawio                15058/s   292394%  15852%               1824%     --

Comparing data/1000k.dat
                        Rate Tie::File readfwd File::ReadBackwards  rawio
Tie::File             2.50/s        --    -94%               -100%  -100%
readfwd               43.7/s     1650%      --                -95%  -100%
File::ReadBackwards    871/s    34777%   1893%                  --   -94%
rawio                14917/s   597126%  34023%               1612%     --

Comparing data/2MB.dat
                        Rate Tie::File readfwd File::ReadBackwards  rawio
Tie::File             1.24/s        --    -95%               -100%  -100%
readfwd               23.6/s     1797%      --                -98%  -100%
File::ReadBackwards   1051/s    84542%   4363%                  --   -93%
rawio                14894/s  1198858%  63119%               1316%     --
There's no real magic about why the differences are so great. The Tie::File and readfwd cases have to read the entire file to get at the last line. Additionally, Tie::File is doing a huge amount of work under the covers, buffering the whole file through a limited buffer space and a hash. That extra work is incredibly useful when you use the module for the purposes for which it was designed, but this is the wrong purpose for it.
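The original readfwd code isn't shown here, but a minimal sketch of that approach (my reconstruction, not the benchmarked code) makes the cost obvious: every line of the file is read and thrown away just to keep the most recent one.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical "readfwd" approach: read every line of the file, keeping
# only the most recent one. The whole file must be traversed to reach
# the last line, which is why this scales with file size.
sub last_line_forward {
    my ($file) = @_;
    open my $fh, '<', $file or die "open '$file': $!";
    my $last;
    $last = $_ while <$fh>;
    close $fh;
    return $last;    # includes the trailing newline, if any
}
```

Anything shaped like this pays an I/O cost proportional to the file size, which matches the way readfwd's rate drops as the test files grow.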
File::ReadBackwards skips to the end of the file and (unsurprisingly :) reads backwards, in a similar fashion to the rawio case, but it carries the overhead of tie. It is also properly coded to handle the IO in a cross-platform manner and to handle any length of line, rather than relying on a hardcoded maximum line length and assuming that "\n" will do the 'right thing', as my crude rawio case does.
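For anyone who hasn't used it, getting the last line via File::ReadBackwards looks like this (its documented object interface; the module also offers a tied-filehandle interface, which is where the tie overhead comes in):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# File::ReadBackwards returns lines starting from the end of the file,
# so the first readline() gives you the file's last line.
use File::ReadBackwards;

sub last_line_backwards {
    my ($file) = @_;
    my $bw = File::ReadBackwards->new($file)
        or die "can't read '$file': $!";
    return $bw->readline;    # last line of the file, separator intact
}
```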
For production work where performance wasn't the ultimate criterion, I would use File::ReadBackwards in preference to trying to fix up the rawio case.
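To show what would need fixing up, here is a crude sketch along the lines of the rawio case (an assumption of its shape, not the benchmarked code): seek near the end and scan back for the last newline. It relies on a hardcoded maximum line length and on "\n" matching the file's line endings, which is exactly what File::ReadBackwards saves you from.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw( SEEK_END );

# Crude raw-IO sketch: read at most $maxlen bytes from the end of the
# file and take everything after the last newline. Lines longer than
# $maxlen, or non-"\n" line endings, will break it.
sub last_line_rawio {
    my ($file, $maxlen) = @_;
    $maxlen ||= 4096;    # hardcoded maximum line length
    open my $fh, '<:raw', $file or die "open '$file': $!";
    my $size = -s $fh;
    my $want = $size < $maxlen ? $size : $maxlen;
    seek $fh, -$want, SEEK_END or die "seek: $!";
    read $fh, my $tail, $want;
    close $fh;
    $tail =~ s/\n\z//;                  # drop the trailing newline
    my $pos = rindex $tail, "\n";       # find the previous one
    return $pos < 0 ? $tail : substr $tail, $pos + 1;
}
```

Note this returns the line without its newline, and that the single seek/read pair is why it barely cares how big the file is.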
In reply to Re: Re: Reading from the end of a file.
by BrowserUk
in thread Reading from the end of a file.
by BrowserUk