Re: Your second thought.

It's a fair point, but the stats show comparative differences, and only the first pass over the file is penalised; the file will then be in the cache for all subsequent passes.

In the figures shown, the case affected was File::ReadBackwards (by virtue of Benchmark running the test cases in alpha-sorted order by name). As File::ReadBackwards managed to process the file 700+ times in the allotted 3 seconds of CPU, regardless of the file size, the effect of the penalty for putting the file into the cache on the first pass is minimal. However, to preclude the possibility of any effect, I added the following line at the top of the for loop

( undef ) = do{ local $/; open my $fh, '< :raw', $file or die $!; <$fh> };

so as to preload the cache. The results of the re-run were nearly identical--certainly within the bounds of normal variance.
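As a minimal sketch of that preload step, the one-liner can be wrapped in a small helper. The name preload_cache is hypothetical and not part of the original benchmark code; the sketch also assumes that one full read is enough to leave the file in the OS file cache.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original benchmark): read the whole file
# once and throw the data away, so later timed passes hit a warm cache.
sub preload_cache {
    my ($file) = @_;
    local $/;    # undefined record separator: one read returns the whole file
    open my $fh, '<:raw', $file or die "Cannot open $file: $!";
    my $data = <$fh>;
    close $fh;
    # Return the byte count purely as a sanity check for the caller.
    return defined $data ? length $data : 0;
}

# Warm the cache for any files named on the command line.
preload_cache($_) for @ARGV;
```

Calling it once per file at the top of the loop, before the timed code runs, keeps the disk read out of whichever test case happens to be timed first.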

P:\test>354830
Comparing data/500k.dat
                       Rate Tie::File readfwd File::ReadBackwards rawio
Tie::File            5.15/s        --    -95%                -99% -100%
readfwd              94.4/s     1734%      --                -88%  -99%
File::ReadBackwards   783/s    15101%    729%                  --  -95%
rawio               15058/s   292394%  15852%               1824%    --
Comparing data/1000k.dat
                       Rate Tie::File readfwd File::ReadBackwards rawio
Tie::File            2.50/s        --    -94%               -100% -100%
readfwd              43.7/s     1650%      --                -95% -100%
File::ReadBackwards   871/s    34777%   1893%                  --  -94%
rawio               14917/s   597126%  34023%               1612%    --
Comparing data/2MB.dat
                       Rate Tie::File readfwd File::ReadBackwards rawio
Tie::File            1.24/s        --    -95%               -100% -100%
readfwd              23.6/s     1797%      --                -98% -100%
File::ReadBackwards  1051/s    84542%   4363%                  --  -93%
rawio               14894/s  1198858%  63119%               1316%    --

There's no real magic about why the differences are so great. Tie::File and the readfwd cases are having to read the entire file to get the last line. Additionally, Tie::File is doing a huge amount of work under the covers with buffering the whole file through a limited buffer space and a hash. This extra work is incredibly useful when you are using it for the purposes for which it was designed, but this is the wrong purpose.
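To make the cost of the forward-reading approach concrete, here is a rough sketch of what a readfwd-style test does. It is an assumed stand-in, not the code that was actually benchmarked, and the name last_line_forward is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the "readfwd" case: walk the file from the start
# and remember only the most recent line. Every byte of the file is read, so
# the cost grows with the file size.
sub last_line_forward {
    my ($file) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my $last;
    while ( my $line = <$fh> ) {
        $last = $line;
    }
    close $fh;
    return $last;    # undef if the file is empty
}
```

The work done is proportional to the length of the file, which matches the way the readfwd rate roughly halves each time the file size doubles in the figures above.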

File::ReadBackwards skips to the end of the file and (unsurprisingly:) reads backwards in a similar fashion to the rawio case, but it carries the overhead of tie. It is also properly coded to handle the IO in a cross-platform manner and to handle any length of line, rather than relying on a hardcoded maximum line length and assuming that "\n" will do the 'right thing', as my crude rawio case does.
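For illustration, a crude seek-to-the-end reader along the lines described might look like the sketch below. This is not the rawio code that was benchmarked; the sub name and the $MAX_LINE value are assumptions, and it shares the weaknesses mentioned above (a fixed maximum line length and a bare "\n" as the line ending).

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

# Assumed upper bound on line length. A longer last line comes back truncated.
my $MAX_LINE = 1024;

# Hypothetical sketch: read only the tail of the file and pick out the text
# after the final line break. Returns the last line without its newline.
sub last_line_rawio {
    my ($file) = @_;
    open my $fh, '<:raw', $file or die "Cannot open $file: $!";
    my $size = -s $fh;
    return undef unless $size;    # empty file: no last line

    my $want = $size < $MAX_LINE ? $size : $MAX_LINE;
    defined sysseek( $fh, $size - $want, SEEK_SET )
        or die "Cannot seek in $file: $!";

    my $buf = '';
    my $got = sysread( $fh, $buf, $want );
    die "Cannot read $file: $!" unless defined $got;
    close $fh;

    $buf =~ s/\n\z//;                 # drop one trailing newline, if present
    my $pos = rindex $buf, "\n";      # -1 when the buffer holds a single line
    return substr $buf, $pos + 1;     # with :raw, a "\r" from CRLF files is kept
}
```

Because only the last $MAX_LINE bytes are read, the cost is independent of the file size, which is consistent with the near-constant rawio rate across the three files.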

For production work where performance wasn't the ultimate criterion, I would use File::ReadBackwards in preference to trying to fix up the rawio case.


Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail

In reply to Re: Re: Reading from the end of a file. by BrowserUk
in thread Reading from the end of a file. by BrowserUk
