saintmike has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to find an example where using PerlIO really makes a performance difference. To my surprise, when using PerlIO::via::MD5, I found that it is significantly slower than traditional IO with Digest::MD5:
perlio:  1 wallclock secs ( 1.04 usr + 0.07 sys = 1.11 CPU) @  62.16/s (n=69)
regular: 1 wallclock secs ( 0.92 usr + 0.20 sys = 1.12 CPU) @ 194.64/s (n=218)
Given that PerlIO::via::MD5 uses Digest::MD5 under the hood, this is really surprising. Is this just a bad example? Also, if you have examples of how PerlIO significantly improved the performance of a task, or made it easier to program, I'd like to hear about them.

Here's the code for the benchmark, the file examined was 500K in size:

use PerlIO::via::MD5;
use Digest::MD5 qw(md5_hex);
use Benchmark qw(:all);

my $file = "somebigfile.dat";

timethese(-1, {
    perlio  => \&perlio,
    regular => \&regular,
});

sub perlio {
    open(my $in, "<:via(MD5)", $file)
        or die "Can't open file for digesting: $!\n";
    my $digest = <$in>;
    close $in;
    return $digest;
}

sub regular {
    local($/) = undef;
    open FILE, $file or die "Can't open file for digesting: $!\n";
    my $data = <FILE>;
    close FILE;
    return md5_hex($data);
}

Replies are listed 'Best First'.
Re: PerlIO slower than traditional IO?
by Bob9000 (Scribe) on Aug 14, 2005 at 21:07 UTC
    This isn't a problem with PerlIO so much as a serious design flaw in PerlIO::via::MD5. It reads the file one line at a time, forcing PerlIO to call it many times before it completes. Consequently, setting $/ to undef before reading from the handle will make the speeds almost identical, and implementing PerlIO::via::MD5 as a call to &Digest::MD5::addfile makes it faster than your "regular" version.
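    The addfile point can be sketched with core modules only, without the :via layer at all; the temporary file and its contents below are purely illustrative:

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use File::Temp qw(tempfile);

# Create a throwaway file to digest.
my ($tmp, $file) = tempfile();
print $tmp "some sample data\n" x 1000;
close $tmp;

# Slurp approach: read the whole file into memory, then digest it.
my $slurped = do {
    local $/;                                  # slurp mode, one big read
    open my $in, '<', $file or die "open: $!";
    md5_hex(<$in>);
};

# addfile approach: Digest::MD5 reads the handle in large chunks
# itself, so the file is never held in memory all at once.
open my $in, '<:raw', $file or die "open: $!";
my $chunked = Digest::MD5->new->addfile($in)->hexdigest;
close $in;

print $slurped eq $chunked ? "digests match\n" : "digests differ\n";
```

    Both paths produce the same digest; the difference is only in how many reads PerlIO has to service and how much memory is held at once.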
Re: PerlIO slower than traditional IO?
by tlm (Prior) on Aug 14, 2005 at 19:06 UTC

    Given that PerlIO::via::MD5 uses Digest::MD5 under the hood, this is really surprising.

    My expectations would have been exactly the opposite.

    Also, if you have examples on how perlIO significantly improved the performance of a task or made your life easier programming it, I'd like to hear about it.

    My rule of thumb is that anything that makes the programming easier makes the performance worse, and this rule is right far more often than not. It's a no-brainer: there's a performance price to pay for the added convenience; otherwise no one would program in C or assembler.

    the lowliest monk

Re: PerlIO slower than traditional IO?
by pg (Canon) on Aug 15, 2005 at 01:10 UTC

    First of all, your test cases must be simple and focused, so that you compare the two IO methods directly without adding anything on top. Otherwise the comparison is not fair.

    I have tested the following two cases on Windows XP:

    use strict;
    use warnings;

    my $t0 = time();
    for (1..10) {
        open(F, "<:utf8", "noname2.txt");
        while (<F>) { ; }
        close(F);
    }
    print time() - $t0;

    And

    use strict;
    use warnings;

    my $t0 = time();
    for (1..10) {
        open(F, "<", "noname2.txt");
        while (<F>) { ; }
        close(F);
    }
    print time() - $t0;

    Both read a 34 MB UTF-8 file. The traditional one took 8-9 seconds, and the :utf8 layer one took 11 seconds. So the layered one is slower, as expected.

    Secondly, should you expect the layered PerlIO to be faster? No. And other monks have already provided good reasons above.

    Going beyond Perl, take Java IO as a sort of opposite example. Java is famous for its "layers": layered design everywhere. Recent Java versions added support for a rawer IO method, and IO speed improved significantly. Was that expected? Yes, it was.

      Thanks for clarifying that -- guess I was just looking for a good reason to use PerlIO. And performance was the most obvious one.

      So, assuming that the layered design is slower (and also, it seems, more complex to implement), what would be a good reason to choose PerlIO over the traditional approach? Admittedly, being able to write

      open(my $in,"<:via(MD5)", $file)
      is pretty cool, but I'm not sure I would trade performance or ease of implementation for it.

      Are you saying that modules like PerlIO::via::MD5 or PerlIO::gzip are just academic exercises?

        I see two layers ;-)

        First layer: PerlIO itself. It was mainly designed to make Perl's IO more portable, which is a fine idea, since portability was one of Perl's key selling points. Nowadays, however, lots of languages are portable, so it is questionable whether this is still a strong selling point. But being non-portable is definitely no good for a language like Perl. So the rule of the game is now: being portable is no longer unique, but being non-portable is not an option.

        Second layer: the layers written for specific purposes, such as PerlIO::gzip. If you need one, fine, go use it. But if they didn't exist, who would care?

        Anyway, performance was not a design goal for PerlIO.

        Hmm... layers of abstraction.

Re: PerlIO slower than traditional IO?
by spiritway (Vicar) on Aug 15, 2005 at 01:32 UTC

    I've never had a need to benchmark Perl vs. something else, though I imagine there could be something faster.

    However, I think you have to take into account the time needed for development. This is not an insignificant factor, and often outweighs the costs of running more slowly. In the olden days, memory and cycles were quite expensive, so the time spent by programmers to squeeze out a few more bytes or milliseconds was well spent. Now machines are faster, memory is cheap, and the expensive part has become the programmers.

    Often it becomes a question of how much it would cost to develop something a bit faster, vs. how much it costs to run a slower program. In many cases, it's much cheaper to have the program take a little more time, than to try to implement it in a more difficult language.

      All true, but coming back to my question: Why would I use something that's both slower and more complicated (read: expensive) to implement?

        Simple. It isn't more complicated. ;) You gain an easier way to read the MD5 of a file. Being able to decode the file in the open statement is quite nice, because now you can pass that file handle to anything expecting a file handle, without those functions needing to know that a layer is transforming the data. This can be significant, or it can be meaningless. It really depends on your needs: if you want to gzip or MD5 a file and then open it and use the handle as normal, this is very useful. If your code is going to look like your benchmark, then it's 50/50 ;) That's what makes Perl fun: choices.


        ___________
        Eric Hodges
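        The "pass the handle to anything" point can be illustrated with a layer that ships with Perl itself. In this sketch (file name and helper are made up), the same generic reader works unchanged whether or not the handle carries an :encoding layer:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# A generic consumer: knows nothing about layers, just reads and counts.
sub count_chars {
    my ($fh) = @_;
    my $n = 0;
    while (my $line = <$fh>) {
        $n += length $line;
    }
    return $n;
}

# Write a UTF-8 encoded file containing a multi-byte character.
my ($out, $file) = tempfile();
binmode $out, ':encoding(UTF-8)';
print $out "caf\x{e9}\n";          # "café" -- the é is one character, two bytes
close $out;

# Same consumer, different layers on the handle it is given.
open my $chars, '<:encoding(UTF-8)', $file or die "open: $!";
open my $bytes, '<:raw',             $file or die "open: $!";

print count_chars($chars), "\n";   # 5 characters: c a f é \n
print count_chars($bytes), "\n";   # 6 bytes: the é occupies two
```

        count_chars never has to know which handle it got; the layer decides what "the data" looks like.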
        I don't agree that it's more complicated. Try to implement each of the filtering layers yourself: probably each filter would need a totally different approach, whereas the PerlIO layers are all very much the same in use.

        CountZero

        "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

        Good question. So, why do you use Perl, as it's both slower and more complicated to implement than C?