in reply to Why can code be so slow?

Other monks have already addressed the conventional code-profiling part of this, so I'm going to chip in on an often overlooked part of performance tweaking for CGI scripts: even with ideal algorithms, non-persistent CGI is slow.

I'm prefacing this with benchmarks I did a while ago of various CGI processing packages, using Apache2 on a Linux box with an Athlon XP2100+ processor and Perl 5.8.8, measured with http_load doing 10 parallel fetches for 30 seconds. The test script was a very simple one that decoded one CGI parameter and just printed its value back to the web browser. The 'null' scripts cheated and just printed a value without bothering to actually read the parameter.
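For the curious, the http_load invocation for such a run would look roughly like this (the URL and the urls.txt file name are assumed for illustration; the -parallel/-seconds flags are http_load's documented interface):

```shell
# 10 parallel clients hammering the CGI URL for 30 seconds;
# http_load reports fetches/sec when the run completes
echo "http://localhost/cgi-bin/test.cgi?a=hello" > urls.txt
http_load -parallel 10 -seconds 30 urls.txt
```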

CGI.pm (3.05) via standard CGI - 16 fetches per second
CGI::Simple (0.075) via standard CGI - 20 fetches per second
CGI::Deurl (1.08) via standard CGI - 36 fetches per second
CGI::Thin (0.52) via standard CGI - 38 fetches per second
CGI::Lite (2.02) via standard CGI - 52 fetches per second
CGI::Minimal (1.16, :preload) via standard CGI - 52 fetches per second
CGI::Minimal (1.16) via standard CGI - 66 fetches per second
cgi-lib.pl (2.18) via standard CGI - 71 fetches per second
null Perl script via standard CGI - 103 fetches per second
null C program via standard CGI - 174 fetches per second
CGI::Simple (0.075) via mod_perl - 381 fetches per second
CGI.pm (3.05) via mod_perl - 386 fetches per second
CGI::Minimal (1.16) via mod_perl - 417 fetches per second
null Perl script via mod_perl - 500 fetches per second

A 'null' Perl script that includes no external packages (roughly the same kind of script as yours) managed 103 fetches/second. Using CGI.pm dropped the speed to only 16 fetches/second, mostly due to the overhead of compiling its large code size on every request.

CGI.pm, by itself, is around 237K bytes of code - and it pulls in Carp (8 Kbytes in perl 5.8.8). Carp then pulls in Exporter (15 Kbytes), Exporter pulls in Exporter::Heavy (6 Kbytes) and Exporter::Heavy pulls in strict (3 Kbytes). If you do a 'use warnings;' that pulls another 16 Kbytes. If you do 'use CGI::Carp;' that will tack on another 16 Kbytes.
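A quick way to see how much source a given 'use' line actually drags in is to sum the sizes of everything that lands in %INC. A minimal sketch (using core Carp here rather than CGI.pm, since the exact module set and sizes vary by Perl installation):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Carp;    # pulls in Exporter and friends, as described above

# %INC maps each loaded module to the file it was loaded from;
# summing the file sizes approximates the source weight of this script
my @files = grep { !ref && -e } sort values %INC;
my $total = 0;
$total += -s $_ for @files;
printf "%d files loaded, %d bytes of module source\n",
    scalar @files, $total;
```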

So before your script does anything, you very likely will have loaded an additional 300 Kbytes of code just for having done

use strict;
use warnings;
use CGI;
use CGI::Carp;

So, as a standard (non-mod_perl) CGI, you would have limited the maximum possible speed of your script to only about 24 fetches per second (adjusting my numbers for the fact that your system is about 50% faster than mine, judging from the 'null script' speeds). If your own code uses more modules than I've listed, it will be even slower. You mentioned using a 'template library' - Template Toolkit pulls in hundreds of Kbytes with just 'use Template;'; that alone would cut your speed in half again. It can pull in as much as a megabyte of code depending on the features you use - which would drop your speed to under 5 fetches per second.

In general, vanilla CGI (a non-persistent environment) is very slow for scripts of any significant complexity, simply because it takes too much time to compile them and their supporting libraries on every request.
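For what it's worth, the kind of hand-rolled decoding a lightweight script can get away with is small. A sketch (GET-style query strings only - no multipart or POST handling - so this is not a general replacement for the modules benchmarked above; the function name is mine):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# decode_params: minimal query-string decoder - splits pairs on & or ;
# and undoes '+' and %XX escapes; handles GET-style strings only
sub decode_params {
    my ($qs) = @_;
    my %param;
    for my $pair ( split /[&;]/, $qs // '' ) {
        my ( $k, $v ) = split /=/, $pair, 2;
        for ( $k, $v ) {
            next unless defined;
            tr/+/ /;
            s/%([0-9A-Fa-f]{2})/chr hex $1/ge;
        }
        $param{$k} = $v if defined $k;
    }
    return %param;
}

my %param = decode_params('a=hello+world&b=x%21');
print "a=$param{a}, b=$param{b}\n";    # a=hello world, b=x!
```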

When performance is on the line, if you can, I would strongly recommend using a persistent execution environment (mod_perl or FastCGI for example).

Replies are listed 'Best First'.
Re^2: Why can code be so slow?
by chromatic (Archbishop) on May 01, 2007 at 18:40 UTC
    CGI.pm, by itself, is around 237K bytes of code - and it pulls in Carp (8 Kbytes in perl 5.8.8). Carp then pulls in Exporter (15 Kbytes), Exporter pulls in Exporter::Heavy (6 Kbytes) and Exporter::Heavy pulls in strict (3 Kbytes). If you do a 'use warnings;' that pulls another 16 Kbytes. If you do 'use CGI::Carp;' that will tack on another 16 Kbytes.

    I shouldn't have to say, yet again, that CGI.pm uses a self-loading scheme to avoid compiling everything, so the way you load it makes a tremendous difference.

    I will say that in this type of microbenchmark, the contents of @INC and the location of modules in @INC can have a tremendous difference. Perl startup speed can depend greatly on the number of stat calls.
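    That point can be made visible directly: Perl probes each @INC entry in order until the module is found, so every directory ahead of the right one costs wasted filesystem lookups in every non-persistent invocation. A small illustration (using core Carp as the stand-in module):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Perl probes each @INC entry in order; every directory tried before
# the one holding the module is a wasted lookup per process start
my $module = 'Carp.pm';    # stand-in; the thread's case was CGI.pm
for my $i ( 0 .. $#INC ) {
    next if ref $INC[$i];    # skip hooks/code refs
    if ( -e "$INC[$i]/$module" ) {
        print "hit : [$i] $INC[$i]/$module\n";
        last;
    }
    print "miss: [$i] $INC[$i]\n";
}
```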

      True - and irrelevant, except to the extent of being able to say "measured performance would be even worse than it actually is if it didn't do that." The test script for CGI.pm consisted of this:
      #!/usr/bin/perl
      use CGI;
      my $cgi = CGI->new;
      my $value = $cgi->param('a');
      print "Content-Type: text/plain\015\012\015\012a=$value\n";
      Note that I do *NOTHING* not necessary for the script to execute. If you can suggest a faster way to use CGI.pm in a script than that, I would be fascinated to know what it is.
        True - and irrelevant...

        It's extremely relevant, if Perl doesn't actually compile all of that code. If your point was that Perl has to find and read all of those blocks from disk, that's fine. I didn't get that impression from your notes, however.

        If you can suggest a faster way than to use CGI.pm in a script than that, I would be fascinated to know what it is.

        Make sure CGI.pm is in the first directory in @INC.

        To do this benchmark properly, make sure all of the modules you want to load are in the same directory, and preferably the first directory in @INC. If you really want to compare the weight of one module over another, you have to remove all other variations, and disk IO can be a very large variation, especially if compilation and execution time is minimal as in this case.
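        Along those lines, a rough way to isolate startup-plus-compile cost while damping the disk-IO variation mentioned above is to do a throwaway warm-up run per case before timing. A sketch (absolute numbers will of course vary by machine; Carp stands in for whatever module is under test):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Time one perl startup with the given flags, after a warm-up run
# so that OS disk-cache effects don't dominate the measurement.
sub startup_secs {
    my @flags = @_;
    system( $^X, @flags, '-e', '1' ) == 0 or die "perl exited nonzero";
    my $t0 = time;
    system( $^X, @flags, '-e', '1' ) == 0 or die "perl exited nonzero";
    return time - $t0;
}

printf "bare perl : %.1f ms\n", 1000 * startup_secs();
printf "with Carp : %.1f ms\n", 1000 * startup_secs('-MCarp');
```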

        This makes me wonder whether I'd be better off keeping my own routines to get my parameters etc. instead of using CPAN modules. Do these modules take their entire size as overhead, or only the subroutines that actually get executed? I've also noticed a few weird things with a package I've made consuming more CPU (and memory) than it should; I only use XML::Twig and XML::Simple in that package (for backwards compatibility, for now).

        Visual::XML::BEGIN is the one I really wonder about: only 3 calls, yet 17.5% of the time spent executing them - while it's merely a package of 3 routines made as a proxy, for backwards compatibility, exposing XML::Simple-style calls on top of XML::Twig...

        %Time ExclSec CumulS #Calls sec/call Csec/c  Name
         42.0   0.120  0.188     16   0.0075 0.0118  Visual::XML::Podcast::BEGIN
         17.5   0.050  0.278     10   0.0050 0.0278  main::BEGIN
         17.5   0.050  0.060      3   0.0166 0.0199  Visual::XML::BEGIN
         3.50   0.010  0.010     47   0.0002 0.0002  strict::bits
         3.50   0.010  0.010      1   0.0100 0.0100  utf8::AUTOLOAD
         3.50   0.010  0.010     29   0.0003 0.0003  XML::Twig::BEGIN
         3.50   0.010  0.010      3   0.0033 0.0033  LWP::Simple::BEGIN
         3.50   0.010  0.010      1   0.0100 0.0100  Visual::XML::Podcast::itunes::BEGIN
         3.50   0.010  0.010      7   0.0014 0.0014  IO::File::BEGIN
         0.00   0.000 -0.000     31   0.0000      -  strict::import
         0.00   0.000 -0.000      1   0.0000      -  vars::BEGIN
         0.00   0.000 -0.000      1   0.0000      -  warnings::BEGIN
         0.00   0.000 -0.000      3   0.0000      -  warnings::register::import
         0.00   0.000 -0.000      6   0.0000      -  warnings::register::mkMask
         0.00   0.000 -0.000     29   0.0000      -  vars::import
        
      I will say that in this type of microbenchmark, ...

      Instead of denigrating other people's efforts to produce objective measurements, why not (help) produce a better benchmark?


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
Re^2: Why can code be so slow?
by BrowserUk (Patriarch) on May 01, 2007 at 14:02 UTC

    Damn! I wish I could upvote this node 100 times.

    At last. Some real numbers for cgi stuff. Devoid of emotion or prejudice.

    The only addition I would like to see added to your benchmarks is to repeat your 4 mod_perl tests using FastCGI.


Re^2: Why can code be so slow?
by Anonymous Monk on May 05, 2007 at 14:53 UTC
    You show that CGI is slow, and using CGI.pm is slow. That's nothing new.

    However, you draw unjustified conclusions from that. You immediately point at the code size, but then get off track very quickly. You say strict.pm is 3K of code. Well, it's about 3K characters, almost all of it POD. It's the same situation for the other modules. You're being, at best, disingenuous there.

    You are really making a proxy conclusion, mapping code size directly onto processing time (more code means more to parse and run), even though anyone here can write a very small routine that takes up all of your CPU for the rest of its existence. You've made a prediction, but you did nothing to verify it (such as making null classes or fiddling with %INC). You don't do what you should do next: profile. All you know from your analysis is the run time. There is nothing in it to point at a cause.
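    (The profile table quoted upthread is in Devel::DProf / dprofpp format; reproducing such a per-subroutine breakdown is a two-command affair. Devel::DProf was the stock profiler of that era; Devel::NYTProf has since superseded it. The script name is a placeholder.)

```shell
perl -d:DProf yourscript.pl    # run under the profiler; writes tmon.out
dprofpp tmon.out               # print the per-subroutine %Time breakdown
```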

    Also, you get too caught up in the numbers. Those are only valid for your particular machine and setup. Not only are the numbers valid only in your case, but so are their relationships. I tried the same thing and got much better performance from a C script, as well as better relative numbers between a null Perl script and one that uses CGI.pm. A different situation gives different numbers, which is why benchmarks only really matter when you run them on the target platform.

    Finally, you arrive at what most people found out in the 90s: CGI and CGI.pm are slow. So what? They're often fast enough for the task, even when the scripts do real work.