Now explain your benchmark. :-)

When I answered before, I knew full well that any of the three could win, depending on OS, installed versions, hardware, files, etc. The reason cat wins here is latency. When doing I/O, every so often you wind up waiting for your request to be serviced. With the pipe, cat does that waiting while Perl goes on its merry way.

This has to be weighed against the fact that it takes more work to launch cat than it does to open a filehandle. Plus, operating systems take some pains to do for every process what cat is doing here for one (reading ahead so the data is ready when you ask for it). So the tradeoff is highly system-specific.

The third option, slowest for you by a country mile, can win on very large files. Why? Well, it turns out that Perl is faster at reading STDIN than at reading arbitrary filehandles, and the third option arranges for Perl to be using STDIN. This has to be weighed against the fact that launching Perl takes a lot more work than launching cat.

Therefore, at the right time and place, any of the three can win on raw speed.

But you should definitely go with the second. No doubt about it.

Why, you ask?

Well, it is the most portable answer, and with the second you can check for failures and $! is populated correctly. That key information is lost with the other two. Besides which, if you really ran out of performance, using the second and then naively parallelizing by running a fixed number of copies on different files would get you the best overall throughput.
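To make the error-checking point concrete, here is a minimal sketch. It assumes "the second" is a plain open on the file and "the first" is a pipe from cat (the original code is not quoted in this post), and the sub names are made up for illustration.

```perl
use strict;
use warnings;

# Assumed "second option": a plain filehandle.
# Returns (line_count, undef) on success or (undef, error_text) on failure.
sub count_lines_native {
    my ($path) = @_;
    open(my $fh, '<', $path)
        or return (undef, "$!");    # $! carries the real OS reason for the failure
    my $count = 0;
    $count++ while <$fh>;
    close($fh);
    return ($count, undef);
}

# Assumed "first option": read through a pipe from cat.
# open only tells us whether cat could be started; a missing or unreadable
# file shows up later, as cat's exit status in $? after close.
sub count_lines_via_cat {
    my ($path) = @_;
    open(my $fh, '-|', 'cat', $path)
        or return (undef, "could not start cat: $!");
    my $count = 0;
    $count++ while <$fh>;
    close($fh);
    my $exit = $? >> 8;
    return ($count, $exit ? "cat exited with status $exit" : undef);
}

my ($lines, $err) = count_lines_native('/no/such/file');
print defined $lines ? "lines: $lines\n" : "open failed: $err\n";
```

With the native open you learn why it failed at the moment it fails. With the pipe, the best you get is an exit status after the fact, plus whatever cat chose to print on STDERR.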

There is exactly one circumstance where I have recommended, or would recommend, something different. If you are on a system where Perl does not have large file support but cat does (this is now a compile-time option for Perl, but some systems may still fit that description), then the first option will allow Perl to work on files over 2 GB.
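If you are not sure which camp your Perl falls in, something along these lines should tell you. It is only a sketch: it relies on the uselargefiles entry in the core Config module, and the sub name is just for illustration.

```perl
use strict;
use warnings;
use Config;

# True if this perl was built with large file support,
# judging by the build-time 'uselargefiles' setting.
sub has_large_file_support {
    return ( defined( $Config{uselargefiles} )
          && $Config{uselargefiles} eq 'define' ) ? 1 : 0;
}

print has_large_file_support()
    ? "this perl was built with large file support\n"
    : "no large file support; the cat pipe may be your workaround\n";
```

The same information is available from the command line with perl -V:uselargefiles.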

So the summary is that any of the three can win on raw performance, but for portability and error checking you really want to use the native method. (Which is the prioritization that I hinted at above. But you should not need to know all of this; that prioritization is usually right in the end.)

Any questions?


In reply to Re (tilly) 2: cat vs. file handle speed? by tilly
in thread cat vs. file handle speed? by dorpus
