dd-b has asked for the wisdom of the Perl Monks concerning the following question:

Still working on performance of my picpage module, as discussed here.

Since profiling seemed to say most of the time was taken up loading modules, it looks like mod_perl would be a big help, right? And Apache::Registry should buy me that performance improvement, since it caches scripts once they're compiled? (Yes, I know I'll probably get a different apache process on each connect, and I'll have to get the script loaded into all of them before I'll see a consistent performance improvement.)

Little improvement. With almost no changes the script functioned under Apache::Registry, but with only about a 10% performance improvement.

Then I added a perl startup script to force preloading of several of the big modules I used in the script (straight out of Writing Apache Modules with Perl and C, with the names of the modules I wanted). Still no major performance improvement.
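A startup script of the kind that book describes is typically just a file of `use` statements pulled into the Apache parent via a `PerlRequire` directive, so the children fork with those modules already compiled. A minimal sketch -- the module names here are placeholders, not necessarily the ones actually preloaded:

```perl
# startup.pl -- loaded once into the Apache parent via
#   PerlRequire /path/to/startup.pl
# Children fork with these modules already compiled, so the
# per-request compile cost is avoided.
use strict;

# Placeholder modules; substitute whatever the script actually uses.
use POSIX ();           # () suppresses importing symbols into startup.pl
use Data::Dumper ();

1;  # a PerlRequire'd file must return a true value
```

The empty import lists matter: the modules get compiled, but nothing is exported into the startup script's namespace.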

Then I removed a couple of modules in favor of better alternatives in Apache::Util (thereby making the script specific to mod_perl, no longer able to run as straight CGI). It functions correctly, but still no particular performance improvement.

So I'm now running a BIG batch of requests against the script, to make sure it's been called in each httpd process (I'm doing 2x the number of processes I'm running; of course this doesn't guarantee hitting every single one), and then benchmarking after that. Waiting for that to run as I type this... and it's done, and the performance is about where I started. My very initial CGI version was taking about 3.5 seconds per image according to ab, and this one is taking 2.7. The initial analysis showed module loading taking 2.5 seconds, YAML reading taking 0.5, and nothing else as much as 0.01. This isn't enough of an improvement.

Clearly the next step will be profiling within the script as it runs, to see where that time goes. But it's really weird that eliminating the module load time hasn't made any difference. I'm sure I'm running through mod_perl rather than CGI -- the script isn't in a ScriptAlias directory and doesn't have a .cgi extension, so on my server it wouldn't execute at all if the config to run it through mod_perl weren't right. And using Apache::Util works, which it wouldn't if I weren't running through mod_perl.

Any other suggestions?


Re: CGI to Apache::Registry, or to mod_perl
by perrin (Chancellor) on Jan 17, 2004 at 00:27 UTC
    It's hard to tell what's going on without seeing code, but if loading modules was taking any time at all it would be completely gone when you run under mod_perl. You're right that Apache::Util shouldn't work under CGI, but can you just check the value of $ENV{MOD_PERL} anyway?

    The other possibility is that the time is being spent on something else. Maybe your script just spends lots of time writing files or something. There's nothing mod_perl can do about that.

      $ENV{MOD_PERL} gives me "mod_perl/1.26", so I really really am running under mod_perl. I completely sympathize with your impulse to ask for this additional confirmation; it's behaving strangely and user error should be considered.

      It has occurred to me that I haven't benchmarked Apache serving a similar-sized file. Now, the ab output breaks out transfer time, and the transfer time is a totally trivial part of the total. But it doesn't break out other overhead, like URI mapping and all that. Best I get myself a datapoint on that!

      Okay, did that; just serving pages, Apache is using about .02 seconds. So that's not a significant part of the 2.6 seconds it's taking for my picpage to run.

      My initial message on the topic, pointed to by the initial message in this thread, gave some profiling numbers showing that nothing except loading modules and reading the YAML config file took "any" time (and loading modules took 5 times more than loading YAML config).

        I think there's something going on besides just loading modules that you aren't seeing with your homemade profiling. I'd say it's time to get out the big guns and run Devel::DProf, or maybe Devel::SmallProf in this case since you don't really have any subs at all in your script. Alternatively, you could run the script in the debugger, and maybe you'll discover that it's doing some things you didn't realize.
Re: CGI to Apache::Registry, or to mod_perl
by toma (Vicar) on Jan 17, 2004 at 05:33 UTC
    Are you benchmarking Apache on the first run of the mod_perl program? It only gets faster the second time the program is run by a particular httpd daemon. If you have many httpd daemons running, you will have to do many requests before you notice your program suddenly running faster.

    Also, you might try reducing the number of print statements in your code. I think that perl's IO system has changed somewhat since I wrote Re: Speeding up commercial Web applications, but I think that the advice there is still good.

    There are a few other optimizations that can improve print statements, but the improvements are small. Things to experiment with are:

    • Eliminate string interpolation. This is about a 1% improvement in my version of perl.
    • Use commas to separate print arguments instead of concatenating them as strings with the . operator.
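    The second bullet can be illustrated directly: `print` takes a list, so handing it the pieces avoids building an intermediate concatenated string first (the gain is small, as noted above). The values here are made up:

```perl
my ($name, $size) = ('photo.jpg', 12345);

# Style 1: concatenate with '.' -- an intermediate string is built first.
my $row = '<td>' . $name . '</td><td>' . $size . "</td>\n";
print $row;

# Style 2: pass print a list -- no intermediate string is constructed.
print '<td>', $name, '</td><td>', $size, "</td>\n";
```

    Both produce identical output; only the amount of intermediate work differs.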

    Another mod_perl optimization is to move as much computation as possible to a BEGIN {} block, so it is only computed during the first run of your code. Similarly, put cleanup code in an END {} block.
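    Under Apache::Registry the script is compiled once per httpd child, so a BEGIN block runs once per child rather than once per request. A minimal sketch of the idea, with made-up config values:

```perl
my %config;  # file-scoped lexical, persists across requests in this child

BEGIN {
    # Runs at compile time -- under Apache::Registry, once per child.
    # Hypothetical example: precompute something expensive here.
    %config = (thumb_width => 160, thumb_height => 120);
}

# Per-request code just uses the precomputed data:
print "Width: $config{thumb_width}\n";

END {
    # Runs after the response; put cleanup here.
}
```

    The `my %config` must appear before the BEGIN block so the lexical is in scope when the block is compiled.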

    It is easy to get confusing answers when profiling mod_perl. I use Time::HiRes and print the profiling information to STDERR. This is tedious but I suspect that the more deluxe profiling modules were not written with mod_perl in mind!
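    A minimal sketch of that style of hand-rolled profiling (the label and the timed "work" are stand-ins):

```perl
use Time::HiRes qw(gettimeofday tv_interval);

my $t0 = [gettimeofday];

# ... the step being timed, e.g. reading the YAML config ...
select(undef, undef, undef, 0.01);   # stand-in for real work: sleep 10ms

my $elapsed = tv_interval($t0);      # elapsed seconds, as a float
printf STDERR "config read: %.4f s\n", $elapsed;
```

    Writing to STDERR lands the timings in Apache's error log, keeping them out of the response body.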

    The ultimate test is load testing, which I do with a bunch of lwp GET requests running at the same time. With this I discover other Apache parameters to tune, such as the number of requests to service before restarting the http daemon. These server restarts require perl to reload the modules.
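    That kind of load test can be approximated by forking a handful of HTTP clients that hammer the server concurrently. A sketch using the core HTTP::Tiny module as a stand-in for the lwp clients toma describes -- the URL, port, and counts are all made up:

```perl
use strict;
use warnings;
use HTTP::Tiny;

my $url      = 'http://localhost:8099/picpage?dir=test';  # placeholder URL
my $clients  = 3;     # concurrent client processes
my $requests = 5;     # requests per client

my @pids;
for (1 .. $clients) {
    my $pid = fork;
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: fire requests back to back, then exit.
        my $http = HTTP::Tiny->new(timeout => 2);
        $http->get($url) for 1 .. $requests;
        exit 0;
    }
    push @pids, $pid;
}
waitpid $_, 0 for @pids;   # wait for every client to finish
```

    Comparing ab's single-stream numbers against a run like this is what exposes tuning knobs such as MaxRequestsPerChild, since child restarts force perl to recompile everything.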

    It should work perfectly the first time! - toma

      Thanks for all your help and patience. I think I know what set of things has been going on, and now know where the time is going.

      All of you will, I'm sure, be shocked, just shocked, to learn that inadequately controlled benchmarking is the cause of my confusion. I was using files in a couple of different directories to test against, and the YAML configuration file in one of the directories is 5k bytes, and in the other is 35k bytes. I happen to have switched between them in just the right pattern to make CGI and mod_perl versions perform about the same overall.

      Some Time::HiRes based internal profiling is what finally allowed me to nail it down. I will, however, point out that I'd done some of that for my initial message, the previous thread where I first asked about this. (Sorry, I'm getting a bit overloaded on suggestions to do things I've already said I've done. But doing some *more* of it did resolve the problem, so maybe I shouldn't complain about that particular case anyway!)

      There's no longer a mystery about where the time goes. However, now that I really understand what's going on, the total amount of time taken is still pretty unsatisfactory. Which leads to my next thread: Big config file reading, or cross-process caching mechanism.