legLess has asked for the wisdom of the Perl Monks concerning the following question:

Monks ~
I've been teaching myself CGI and DBI programming with Perl, and I have questions about speed. I know some of these are very general, and that often the only answer is to benchmark using live code and data on the production server (or a facsimile thereof), but perhaps some of them are a priori obvious.
  1. Database access: I'd guess that connecting to and querying a database is far slower than any Perl-only operation. Correct?
  2. Modules: It would seem to me that every module loaded will slow the program a little due to disk reads: is this true? If so, is the speed hit worth the extra cleanliness you can get from user-created modules?
  3. Modules & Apache: Or does Apache cache modules to increase speed? If so, it still wouldn't cache user-created modules, right?
  4. OO: Object-oriented code, other things being equal, is slower than non-OO code, correct? (Realizing that sometimes OO can allow you to produce a more elegant and faster solution).
  5. SSI: Is there an appreciable speed difference between (a) using static HTML with SSI to 'exec' scripts and (b) generating the page 100% in Perl? I'd guess that one Perl script is faster than an HTML page with >$X 'includes' or 'execs'. But what's $X? Is "it depends" the only real answer?
TIA

Re: Perl CGI and SSI speed
by toma (Vicar) on Jun 17, 2001 at 02:02 UTC
    Speed can refer to:
    1. The number of web pages per second that you can display without causing the server to slow down excessively.
    2. The amount of time that the user waits for content to appear.
    To fix speed type #1 you should use mod_perl where needed. Usually, based on an analysis of your traffic flow, you will need performance improvements for only some of your CGIs.

    To fix speed type #2 there are many tricks, many of which do not seem to be widely used. Here are a few:

    • Optimize the size of image files, for example by reducing the number of colors in a gif file or by increasing the compression (lowering the quality setting) of a jpeg.
    • Include the size tags for the image files so that the browser can do the layout before loading the image.
    • Set $|=1 so that your script sends output as soon as it is ready.
    • Your CGI should call print two or three times. This is because print is *very* slow. People tend to have many more print statements in a CGI, and they are often the cause of performance problems. The way to avoid calling print too often is to buffer up a bunch of output into a string and print that. I have seen print cause more performance problems than perl objects.
    • Your first CGI print statement should generate enough content in the browser to get the reader started. The reader won't really care if the rest of the load takes a few seconds longer.
    • Don't use layouts with tables nested to many levels inside of tables, at least in the first print statement. This takes a long time for the browser to render. Most browsers won't render anything at all for a table until the close of the outermost table.
    • Don't use a lot of little graphic buttons. Try to keep graphics down to three images, up to seven is reasonable. Any more should be somewhere in your web page where waiting for them won't annoy the reader.
    • Disk reads should not be a problem for loading modules. Any file that is often-used will get cached by any decent operating system. Loading large perl modules can be slow for other reasons, but this is fixed by using mod_perl.
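The advice above about print and $| can be sketched roughly as follows. This is only an illustration, not toma's actual code: build_page is a hypothetical helper name, and the row data is made up. The page is assembled in one string and sent with a single print call.

```perl
use strict;
use warnings;

# Hypothetical helper: assembles the whole response in one string
# instead of calling print once per fragment.
sub build_page {
    my (@rows) = @_;
    my $out = "Content-type: text/html\n\n";
    $out .= "<html><body><ul>\n";
    # Real data should be HTML-escaped first; these rows are literals.
    $out .= "<li>$_</li>\n" for @rows;
    $out .= "</ul></body></html>\n";
    return $out;
}

$| = 1;    # flush STDOUT after every print, so output leaves as soon as it is ready

# One print call for the whole page.
print build_page(qw(alpha beta gamma));
```

For a long page, the same idea extends to two or three prints: one early chunk that gives the reader something to look at, then the rest.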

    To make your code friendly to porting to mod_perl, you should:

    • use strict;
    • Don't declare any variables outside of subroutines in the modules that you write.
    • Leave time in the schedule for lots of testing and fixing things in your mod_perl program. Mod_perl is worthwhile as a performance advantage, but it requires more work to do correctly. It is much less forgiving of sloppy coding practices than CGI.
    • Use a computer that can hold lots of RAM. You will be able to make speed/memory tradeoffs in mod_perl, until you run out of slots for more RAM. In my last project I was able to justify lots of RAM because it was the cheapest way to increase performance.
    • Make sure your web server is Apache running on Linux or some sort of Unix.
    There are many, many more tricks for making a fast dynamic web site. It is an interesting topic and I hope that you build speedy ones!
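
The point above about variables declared outside subroutines can be illustrated with a small sketch. It assumes nothing about mod_perl itself: the package names Risky and Safe are hypothetical, and two plain function calls stand in for two requests served by the same long-lived interpreter.

```perl
use strict;
use warnings;

{
    package Risky;
    my @seen;    # file-scoped: survives for as long as the interpreter does

    sub handle {
        my ($name) = @_;
        push @seen, $name;       # data from earlier "requests" accumulates
        return scalar @seen;
    }
}

{
    package Safe;

    sub handle {
        my ($name) = @_;
        my @seen = ($name);      # fresh on every call, nothing carried over
        return scalar @seen;
    }
}

# Two calls stand in for two requests handled by one persistent process.
print Risky::handle('first'), ' ', Risky::handle('second'), "\n";    # prints "1 2"
print Safe::handle('first'),  ' ', Safe::handle('second'),  "\n";    # prints "1 1"
```

Under plain CGI each request gets a new process, so the Risky version would appear to work; in a persistent environment the leftover state shows up.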

    It should work perfectly the first time! - toma

Re: Perl CGI and SSI speed
by dimmesdale (Friar) on Jun 17, 2001 at 00:22 UTC
    Well, there's no one answer for each situation... it all varies (as with most things in life).

    1) I'm not sure what you mean. Like a DBI->connect() call versus an open()? Certainly a connect() call is faster than a sleep(1000) call, the latter of which is perl-only, so it just depends.

    2) Well, somewhat true. There are different loading methods that can be used to load on demand, so to speak. However, using a module usually means that it's been tested and that it anticipates more conditions than you would, so it's usually the better choice (I do say usually because, as I said, it all varies; though if necessary, you can edit a module down to size to fit your purposes).

    3) I don't follow here... if you're using mod_perl? Then you can have Apache cache the modules when the server starts up, and it won't require extra time across different script invocations. And user-created modules are no different from standard modules, at least to Apache.

    4) Well, again, guess what? It depends. Mostly, it depends on how the module was designed. For instance, CGI.pm, according to "CGI Programming with Perl", is slower if non-OO methods are used, so OO methods are better. Bottom line: Benchmark.

    5) YES! Static HTML pages are *much* faster (well, generally). This is because running a CGI script requires creating a separate process on the server's machine... this takes time. Using mod_perl, FastCGI, etc., can take away some of this time, but in general static HTML is faster (quite a bit, usually).

    Bottom line to a long post: It all depends, so BENCHMARK THE CODE. Situations vary, something true in one case has exceptions in another, etc. Hard-and-fast rules are hard to come by when it comes to efficiency.

Re: Perl CGI and SSI speed
by Masem (Monsignor) on Jun 17, 2001 at 01:06 UTC
    Most of 1-4 can be answered by using some persistent environment to run your CGI scripts in (read: mod_perl for Apache!). With that, you read each module in only once (per server) and you need only a persistent DBI connection. If you can run mod_perl, by all means do so, but do realize that if you have existing CGI scripts, you'll probably need to whittle at them a bit to make them mod_perl friendly.

    As for CGI and SSI, I would use SSI on mostly static pages, particularly those that are not results of any previous CGI (eg, for a quote of the day, or the like). Anything else should be done with CGI and a template solution; in most cases, the template module can be cached if you use mod_perl, so you gain a further advantage.


    Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain
Re: Perl CGI and SSI speed
by Aighearach (Initiate) on Jun 17, 2001 at 01:25 UTC

    >2. Modules

    This speed hit is only taken if you don't preload your modules. All used modules should be loaded in your apache startup. Then, they will be compiled only once. The memory (mostly) doesn't even need to be copied, it uses shared memory with copy-on-write, so it is very very fast. There is not a speed difference in loading a pre-loaded Perl module in mod_perl versus loading a C library. However, this can use a lot of RAM. I've had sites where there were two apaches, one for mod_perl, another for everything else, the mod_perl version used 6 megs per process, the other used 1 meg. It was very fast for everything that way.

    >3. Modules & Apache

    Yes, it caches them. It evals them. Don't use mod_perl until you've read the docs long enough to understand what this does to the namespace. Really.
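
The preloading described above usually lives in a startup file. The sketch below is only an assumption about how such a file might look: the name startup.pl and the httpd.conf line `PerlRequire /path/to/startup.pl` follow the usual mod_perl 1.x convention, the path is a placeholder, and the module list must be replaced with whatever your scripts actually use. Core modules stand in here so the file loads on any Perl installation.

```perl
# startup.pl (hypothetical name), loaded once when the server starts
use strict;
use warnings;

# Load modules with an empty import list so they are compiled once
# in the parent process and shared with the child processes.
use Data::Dumper   ();
use File::Basename ();

# use DBI ();       # uncomment if installed and used by your scripts
# use My::App ();   # hypothetical user-written module

1;    # a required file must return a true value
```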

Re: Perl CGI and SSI speed
by Anonymous Monk on Jun 17, 2001 at 15:02 UTC
    1. Yes, Database reads are always a bottleneck

    2. Perl actually compiles into memory the code to be run, so the extra disk reads will only affect the startup speed of the script. After it has started, the modules will only take a bit more memory, but not noticeably.

    3. I am unsure, but I do not believe Apache does anything with standard modules such as caching, but mod_perl is embedded, so that does allow a speed increase.

    4. Strictly speaking, OO code is slightly slower. The method call syntax ($object->function()) takes a bit more time than the equivalent function($object). The difference is very small, but does exist.

    5. I know very little about SSI, but exec'ing any Perl code means the interpreter has to load into memory, read, parse, and compile the script, and then execute it. The process creation overhead of exec-like calls should make SSI slower, but I may be mistaken.
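
The method-call versus function-call difference mentioned in point 4 is easy to measure for yourself. This is a minimal sketch using the core Benchmark module; the Counter class and its bump method are made up for the comparison, and the numbers will differ from machine to machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

{
    package Counter;    # hypothetical class, only here to have something to call

    sub new  { my ($class) = @_; return bless { n => 0 }, $class }
    sub bump { my ($self)  = @_; return ++$self->{n} }
}

my $obj = Counter->new;

# A negative count means "run each variant for at least that many CPU seconds".
cmpthese(-1, {
    method   => sub { $obj->bump },            # dispatched through the object's class
    function => sub { Counter::bump($obj) },   # same code called as a plain function
});
```

cmpthese prints a table of rates and relative percentages, which shows how small (or not) the gap is for your Perl build.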
Re: Perl CGI and SSI speed
by steveAZ98 (Monk) on Jun 18, 2001 at 07:05 UTC
    All good questions:

    1. Database accesses do slow down a script, and DBI has a lot of overhead. You can increase your performance with a database-specific driver (i.e. one written using the database API), but you'll lose portability.

    2. In my opinion the slight overhead that modules incur is well worth the maintainability of the code. In terms of machine performance it might not be the best solution, but in terms of manpower it is definitely worth it to use modules.

    3. Apache and mod_perl cache Perl modules whether or not they are user created; a module is just a module. I believe that scripts are even cached if they are run under Apache::Registry.

    4. OO code is usually a bit slower than procedural. I tend to make small scripts procedural and larger projects OO, although it sometimes just makes sense for the small scripts also.

    5. Straight HTML is faster. Any other combinations you should benchmark yourself (since there are just way too many ways to combine perl and SSI to give a straight answer). If you run apache use ab, it's a great tool to help you figure out what is happening.

    Other notes: use the Benchmark module and Devel::DProf to see what really affects the performance of your scripts, and ab (Apache benchmark) for your web comparisons.

    HTH
Re: Perl CGI and SSI speed
by feloniousMonk (Pilgrim) on Jun 18, 2001 at 19:13 UTC
    --
    Just my 2 cents:
    1. DB access - Sure, the DB stuff is usually slower.
    What I try to do is minimize DB work and let Perl handle
    as much as possible to gain the best performance

    2. Modules - As mentioned before regarding Apache and
    loaded modules, depends. But as a rule of thumb look
    at what the module does for you - usually if you're not
    comfortable/experienced enough to write your own module
    it's worth using the existing mod even with performance
    hit. (The assumption here is that if you do not know
    Perl or the problem area well enough to
    write a module for your problem, then it's safe
    to say the module author may have handled something
    you would have missed.)

    I'm not trolling here, just saying that if you don't
    know it well enough yet trust the module developer and
    learn from module code. I've learned a lot
    this way, and not just in Perl either.

    3. No useful comment.

    4. Typically, based on what I've heard, seen
    written, and benchmarked myself, OO code TYPICALLY
    is slower BUT there are cases where OO is faster
    on a module-per-module basis. Depends how it was
    written, really.

    Plus, is it faster to write code in OO or Non-OO?
    I'm a C/Perl guy so for me it's non-OO typically but
    depends on problem faced and your mindset (How do you
    logically break down the problem?)

    Since I was mostly a C guy, it's hard to unlearn
    the procedural methods and see things as their
    *usually* natural Objects and methods

    5. SSI - Is it faster to buy a pre-packaged lunch or
    to buy a roll and put cold cuts, condiment, etc. on it?

    Static HTML is pre-packaged, canned, and dynamic is your "custom" sandwich.

    But sometimes it's better to wait the extra time
    so you have your sandwich
    "Just Right" (i.e., worth the extra process time)

    Hope my rant helps, I know I feel better already.
    -felonious