Re: Why can code be so slow?
by snowhare (Friar) on May 01, 2007 at 13:46 UTC
Other monks have already addressed the conventional code-profiling part of this, so I'm going to chip in on an often overlooked part of performance tweaking for CGI scripts: even with ideal algorithms, non-persistent CGI is slow.
I'll preface this with benchmarks I did a while ago for various CGI processing packages, using Apache2 on a Linux box with an Athlon XP2100+ processor and Perl 5.8.8, benched with http_load doing 10 parallel fetches for 30 seconds. The script was a very simple one that decoded one CGI parameter and printed its value back to the web browser. The 'null' scripts cheated and just printed a value without bothering to actually read the parameter.
CGI.pm (3.05) via standard CGI                  -  16 fetches per second
CGI::Simple (0.075) via standard CGI            -  20 fetches per second
CGI::Deurl (1.08) via standard CGI              -  36 fetches per second
CGI::Thin (0.52) via standard CGI               -  38 fetches per second
CGI::Lite (2.02) via standard CGI               -  52 fetches per second
CGI::Minimal (1.16, :preload) via standard CGI  -  52 fetches per second
CGI::Minimal (1.16) via standard CGI            -  66 fetches per second
cgi-lib.pl (2.18) via standard CGI              -  71 fetches per second
null Perl script via standard CGI               - 103 fetches per second
null C program via standard CGI                 - 174 fetches per second
CGI::Simple (0.075) via mod_perl                - 381 fetches per second
CGI.pm (3.05) via mod_perl                      - 386 fetches per second
CGI::Minimal (1.16) via mod_perl                - 417 fetches per second
null Perl script via mod_perl                   - 500 fetches per second
A 'null' Perl script that includes no external packages (roughly the same kind of script as yours) managed 103 fetches per second. Using CGI.pm dropped the speed to only 16 fetches per second, mostly due to the overhead of compiling its large code base on every request.
CGI.pm, by itself, is around 237K bytes of code - and it pulls in Carp (8 Kbytes in perl 5.8.8). Carp then pulls in Exporter (15 Kbytes), Exporter pulls in Exporter::Heavy (6 Kbytes) and Exporter::Heavy pulls in strict (3 Kbytes). If you do a 'use warnings;' that pulls another 16 Kbytes. If you do 'use CGI::Carp;' that will tack on another 16 Kbytes.
So before your script does anything, you very likely will have loaded an additional 300 Kbytes of code just for having done
use strict;
use warnings;
use CGI;
use CGI::Carp;
So you would have limited the maximum possible speed of your script as a standard (non-mod_perl) CGI to only about 24 fetches per second (adjusting my numbers for the fact that your system is about 50% faster than mine, judging from the 'null script' speeds). If your own code uses more modules than I've listed, it will be slower still. You mentioned using a 'template library' - Template Toolkit pulls in hundreds of Kbytes with just 'use Template;'. That alone would cut your speed in half again. It can pull in as much as a megabyte of code depending on the features you use - which would drop your speed to under 5 fetches per second.
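If you want to see the load for yourself, here is a minimal sketch (an illustration, not one of the benchmark scripts above) that loads CGI.pm and then totals the on-disk size of everything that ended up in %INC:
#!/usr/bin/perl
# Load CGI.pm, then tally the size of every module file it dragged in.
use strict;
use warnings;
use CGI;

my $total = 0;
for my $file (sort values %INC) {
    my $size = -s $file;
    printf "%8d  %s\n", $size, $file;
    $total += $size;
}
printf "total: %d bytes across %d files\n", $total, scalar keys %INC;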
Vanilla CGI (a non-persistent environment) is very slow for scripts of any significant complexity, simply because every request pays the cost of compiling the script and its supporting libraries.
When performance is on the line, if you can, I would strongly recommend using a persistent execution environment (mod_perl or FastCGI for example).
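To give a feel for what a persistent environment buys you, here is a minimal FastCGI sketch using the CPAN FCGI module (the response body is just a placeholder). The script and its modules are compiled once at startup; every request after that reuses the already-compiled code:
#!/usr/bin/perl
use strict;
use warnings;
use FCGI;

# All compilation cost is paid once, before the accept loop starts.
my $request = FCGI::Request();

# Each iteration services one request, with no fork or recompile.
while ($request->Accept() >= 0) {
    print "Content-Type: text/plain\r\n\r\n";
    print "Served by persistent process $$\n";
}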
> CGI.pm, by itself, is around 237K bytes of code - and it pulls in Carp (8 Kbytes in perl 5.8.8). Carp then pulls in Exporter (15 Kbytes), Exporter pulls in Exporter::Heavy (6 Kbytes) and Exporter::Heavy pulls in strict (3 Kbytes). If you do a 'use warnings;' that pulls another 16 Kbytes. If you do 'use CGI::Carp;' that will tack on another 16 Kbytes.
I shouldn't have to say, yet again, that CGI.pm uses a self-loading scheme to avoid compiling everything, so the way you load it makes a tremendous difference.
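To make that concrete, here are the two load styles side by side (a sketch based on CGI.pm's documented pragmas; use one or the other):
# Lazy (the default): most methods are compiled only on first call,
# so a script that only uses param() compiles very little of CGI.pm.
use CGI;

# Eager: -compile forces the requested methods to be compiled at load
# time - useful under mod_perl, wasteful for a one-shot CGI script.
use CGI qw(-compile :all);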
I will say that in this type of microbenchmark, the contents of @INC and the location of modules in @INC can have a tremendous difference. Perl startup speed can depend greatly on the number of stat calls.
True - and irrelevant, except to the extent of being able to say "the measured performance would be even worse than it actually is if it didn't do that."
The test script for CGI.pm consisted of this:
#!/usr/bin/perl
use CGI;
my $cgi = CGI->new;
my $value = $cgi->param('a');
print "Content-Type: text/plain\015\012\015\012a=$value\n";
Note that I do *NOTHING* not necessary for the script to execute.
If you can suggest a faster way to use CGI.pm in a script than that, I would be fascinated to know what it is.
Damn! I wish I could upvote this node 100 times.
At last. Some real numbers for cgi stuff. Devoid of emotion or prejudice.
The only addition I would like to see to your benchmarks is repeating your four mod_perl tests using FastCGI.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
You show that CGI is slow, and using CGI.pm is slow. That's nothing new.
However, you draw unjustified conclusions from it. You immediately point at the code size, but then get off track very quickly. You say strict.pm is 3K of code; well, it's about 3K of characters, almost all of it Pod. The same is true of the other modules. You're being at best ingenuous there.
You are really using code size as a proxy for processing time (more code means more to parse and run), although anyone here can write a very small routine that takes up all of your CPU for the rest of its existence. You've made a prediction, but you didn't do anything to verify it (such as making null classes or fiddling with %INC). You don't do what you should do next: profiling. All you know from your analysis is the total run time; there is nothing in it to point at a cause.
Also, you get too caught up in the numbers. Those are only valid for your particular machine and setup. Not only are the numbers valid only in your case, but so are their relationships. I tried the same thing and got much better performance from a C script, as well as better relative numbers between a null Perl script and one that uses CGI.pm. A different situation gives different numbers, which is why benchmarks only really matter if you run them on the target platform.
Finally, you get to what most people found out in the 90s: CGI and CGI.pm are slow. So what? They're often fast enough for the task, even when the scripts do real work.
Re: Why can code be so slow?
by jbert (Priest) on May 01, 2007 at 09:17 UTC
OK, so your system is capable of running something like 150 perl CGI requests/sec, for a CGI which does very little (the printenv.cgi).
So the difference is in what your code does. You need to work out what it is doing that is slow, then think about whether that is a reasonable amount of time for that work and, if not, whether you can speed it up. (If your code downloads 1 Mbyte from a remote site and you are bandwidth limited, then tweaking your code won't really help.)
Measuring which bits of your code take time is called profiling. There are some good perl modules to help with this, e.g. Devel::Profile and Devel::DProf. These are easier to use on a command line application than a CGI, so one thing you might want to do is to first get to a stage where you can run your CGI as a command line app.
If your CGI doesn't rely on any parameters to demonstrate the slowness, you can try simply: perl my.cgi < /dev/null > output.html. If this reproduces the slowness, then you can use the profiling tools to work out where it is spending its time.
Profiling is a bit of an art, and it is very easy to read the results wrongly, so I'd always recommend using more than one approach or tool, to cross-check your data. In particular, simply using Time::HiRes and adding some logging with hires timing to your app can help show where time is being spent.
If your application isn't CPU bound (and it's quite likely not to be), then profiling may be of less help. In that case, you need to work out what your code is doing. The strace utility can be good for this, but again, it can require some expertise to interpret. You can turn on timestamps in the strace output, which can help you work out where the time is going.
And as a last thought (and perhaps the thing to try first), if your code is connecting to other servers you could be getting hurt by connection times, DNS lookup times, etc. You could try 'time ping some.host' to get an idea of the time it takes a process on your box to resolve 'some.host' and get a network packet to and from the remote host. If ping doesn't work for you, you can always use 'time wget some://url.on.that.host' if you are fetching a remote HTTP page, say.
Re: Why can code be so slow?
by varian (Chaplain) on May 01, 2007 at 09:44 UTC
jbert has provided some excellent pointers. In addition to those remarks, your description of extremely long response times suggests this might also be a case of memory shortage/page swapping on the server.
Some ideas:
- investigate system memory usage (page swapping?)
- benchmark a single, non-concurrent request
- in general, mod_perl may improve performance dramatically
- if needed you could offload part of the application to another server to split up service requests between servers
A problem much like this was solved at my workplace by having the CGI open a network socket to a daemon running on a dedicated application server and offloading the real work there, so the web server could continue processing requests. While the user was waiting (sometimes for up to a minute) we displayed a little "progress bar" that didn't really represent anything other than time elapsed. It seemed to placate our 2500+ strong user base, all of whom had to use the application at least once to accommodate an infrastructure upgrade.
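For anyone curious, the pattern looks roughly like this (a sketch only; the host, port, and one-line protocol are hypothetical stand-ins for whatever your daemon actually speaks):
#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::INET;

# Hand the heavy work to a daemon on a dedicated application server.
my $daemon = IO::Socket::INET->new(
    PeerAddr => 'app-server.example.com',   # hypothetical daemon host
    PeerPort => 9000,                       # hypothetical daemon port
    Proto    => 'tcp',
) or die "can't reach work daemon: $!";

print $daemon "process job=42\n";           # submit the job
my $ticket = <$daemon>;                     # daemon answers with a ticket

# Return immediately, so the web server stays free for other requests.
print "Content-Type: text/html\r\n\r\n";
print "Job queued ($ticket) - show the progress bar while polling.\n";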
Some slowdowns, and even lockups, I've noticed in a few of my older scripts happen at moments when there is a sudden drop in available memory, a lost connection to a certain NAS drive, network timeouts, etc., which spawns more children than the poor parent can support.
I've been building an error library just to keep up with such errors; the best way to keep code running cleanly is to intercept the eventual problems in a clean way. Be prepared coding for any error, and the code will be prepared for you...
Since installing the error library I've noticed no other lockups, because I was prepared - even for doom scenarios, which happened once when the power supply of the NAS drive failed and children were continuously spawned by the dozen... Bugs Bunny would have been jealous that time!
I am for sure going to "profile" this problem with the tools I've read about above, and will keep you posted if I get any results back, because I think this problem hits every CGI programmer at some point...
Re: Why can code be so slow?
by halley (Prior) on May 01, 2007 at 13:05 UTC
Without the code, it's hard to tell. I understand that some situations like licensing won't allow for publishing the code, though size and complexity shouldn't be barriers for those interested in working with friends.
In my experience, whenever it's a matter of some code taking 10x the time I thought it would take, there are only two culprits:
- hardware/resource bottleneck
- misunderstood order of work: e.g., O(n) where you thought O(1)
Hardware bottlenecks are thankfully pretty easy to measure: tools like top or PerfMon can show you if you're paging unexpectedly, or pushing unexpectedly high amounts through RAM or the network.
For your algorithm, review the contents of every loop in your code to see if there's something that's going to be O(n) when you initially thought it was O(1). Inside a loop, that becomes O(n*m) instead of the O(m) you expected of the loop.
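A quick sketch of that trap (an illustration, not from the original code): a linear search inside a loop costs O(n*m), while a hash built once keeps each check at O(1):
use strict;
use warnings;

my @allowed = ('a' .. 'z');
my @items   = map { chr(97 + int rand 26) } 1 .. 10_000;

# O(n*m): the inner grep rescans @allowed for every single item.
my $slow = grep { my $item = $_; grep { $_ eq $item } @allowed } @items;

# O(m): build the lookup table once, then each membership test is O(1).
my %is_allowed = map { $_ => 1 } @allowed;
my $fast = grep { $is_allowed{$_} } @items;

print "slow=$slow fast=$fast\n";   # same answer, very different cost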
After understanding your algorithm's order of work or complexity, it's time to reduce each iteration. If anything is O(n) and called twice, ask yourself why. If you're passing big data by value instead of reference, ask yourself why.
As a last tip, consider doing your benchmarking work outside the extra overhead of the web server and/or database, if possible. It can open your investigation to the use of more tools, more iterations, and more focus on your part. It will be good for your code to be decoupled from those system dependencies, regardless.
-- [ e d @ h a l l e y . c c ]
Re: Why can code be so slow?
by scorpio17 (Canon) on May 01, 2007 at 13:02 UTC
Without mod_perl, every request is going to spawn a new copy of the Perl interpreter (in memory), then load/compile/run your script. You're probably just running low on memory, forcing the machine to do lots of swapping (juggling data between memory and disk). So I'd suggest: a) add as much RAM to the server as you can afford (you can't have too much), and b) try to get mod_perl working. That gives you one instance of the Perl interpreter (in memory) and precompiles your script (the saving may be small for a single hit, but it adds up when you have lots of hits at the same time).
I've been checking the memory usage with top; 2 Gbytes seems to be OK (only 30% in use). It's more CPU bound (100% in use while running ab); I would say this application is for sure more memory-optimized than CPU-optimized.
Re: Why can code be so slow?
by swares (Monk) on May 01, 2007 at 17:00 UTC
You said you were using Apache 1.3?
I had terrible performance problems with Perl scripts under IHS 1.3 (based on Apache 1.3)... switching to IHS 2.x (Apache 2.x) significantly improved performance for my Perl scripts.
I believe IHS is just a repackaged Apache with some IBM specific modules, so you may find it to be an improvement also.
Re: Why can code be so slow?
by Moron (Curate) on May 01, 2007 at 11:37 UTC
I would suggest putting the code on the site - profiling might not be the best first step to take. For example, if there are ways we can suggest to improve the maintainability, then that should come earlier in the critical path, given that performance is a maintenance issue.
Update: or if it's just a matter of not having used mod_perl because you have a pile of non-mod_perl code, see Apache::Registry for a possible solution.
__________________________________________________________________________________
Free your mind!
Re: Why can code be so slow?
by dragonchild (Archbishop) on May 02, 2007 at 02:49 UTC
Given that we have nothing about your code other than the CGI script names, I can't say for certain whether this is your problem. But in my experience, 90% of the bottlenecks in 90% of all Perl webapps backed by a database come from misuse of the database. If you are backed by a database, a few things to look at:
- Are your queries using indices? An index can make the difference between 2 req/s and 200 req/s. (I have personally seen a 99% speed improvement several times; see the sketch after this list.)
- Are you using BLOBs to store files? If you are, get them out of the database and onto a NAS. Databases are horrible for storing binary files, such as pictures or audio, and should be used only as a last resort. Better to store the filenames in the database and the files on disk.
- Are your tables properly normalized? I don't mean have you followed the rules of normalization to the 5th degree. I meant "Are your tables properly normalized for your needs?" I selectively denormalize for performance on a regular basis, but only after I've properly normalized everything.
- Are you executing only those queries that are necessary? This is a Java example, but I once saw a 94% improvement in speed in a specific screen of a Struts app by properly configuring Hibernate not to pull the sub-rows for the master query.
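Here is a sketch of the index check from the first point (the DSN, credentials, table, and column names are hypothetical, and the EXPLAIN columns shown are MySQL's):
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:mysql:myapp', 'user', 'password',
    { RaiseError => 1 });

# Ask MySQL how it would run the query; an access type of 'ALL'
# means a full table scan on every request.
my $plan = $dbh->selectall_arrayref(
    'EXPLAIN SELECT * FROM items WHERE owner_id = ?',
    { Slice => {} }, 42,
);
printf "access type: %s, rows examined: %s\n",
    $plan->[0]{type}, $plan->[0]{rows};

# If it is scanning, an index on the lookup column is usually the fix.
$dbh->do('CREATE INDEX idx_items_owner ON items (owner_id)');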
My criteria for good software:
- Does it work?
- Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
I am using XML files because everything needs to be universally exportable to Flash and certain other applications, and XML seemed the easiest/most universal solution for that.
For the meta information of a file I use inode.xml; for the settings of the current folder I use a default.xml file, and so on. I built the system so that I never read more than 2 XML files at most - except when reading a full directory, where I extract each file's metadata from its XML file. That way I can also cross-export this info quickly and easily to any other application... at least, I thought so ;)
My XML usage is definitely not meant as a database, but rather as an easy export format for other systems working with the data.
In other words, you're hand-coding any searches you might need to do, and iterating through data structures that you re-create every time using an XML parser. And you're wondering why your system is slow? If you profile, I'll bet that the problem you're running into is in one or both of these places:
- Your XML parser is fast, but your usage isn't. 10-to-1 you're using the Tree style of XML::Parser, which is the slowest option. (See the sketch after this list.)
- Once you have this data in memory, you're not working with it correctly. Algorithm choice is the second largest factor in runtime performance (after database (mis)usage).
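To illustrate the first point, compare the two styles (a sketch; the file, element, and attribute names here are hypothetical):
use strict;
use warnings;
use XML::Parser;

# Tree style: builds the entire document as nested Perl structures,
# whether you need all of it or not.
my $tree = XML::Parser->new(Style => 'Tree')->parsefile('inode.xml');

# Handler style: touch only what you care about as the parse streams by.
my $title;
my $parser = XML::Parser->new(
    Handlers => {
        Start => sub {
            my ($expat, $elem, %attr) = @_;
            $title = $attr{title} if $elem eq 'file';
        },
    },
);
$parser->parsefile('inode.xml');
print "title: ", (defined $title ? $title : '(none)'), "\n";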
Also, if you're reading larger XML files (say, for your default.xml), then perl has to allocate RAM for the data structures. Depending on your OS, this could be up to 10% of your runtime.
The proper solution, in case you're wondering, is to use some sort of database and export to XML as needed. So use either something like MySQL (if you want an RDBMS and are comfortable with it), or a DBM solution such as DBM::Deep or BerkeleyDB.
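For instance, a minimal DBM::Deep sketch of that approach (the file, keys, and fields are made up):
use strict;
use warnings;
use DBM::Deep;

# One structured file on disk; no XML parse on every request.
my $db = DBM::Deep->new('metadata.db');

# Writes and reads are keyed lookups, not full-document parses.
$db->{'photos/cat.jpg'} = { title => 'Cat', width => 800, height => 600 };

my $meta = $db->{'photos/cat.jpg'};
print "title: $meta->{title}\n";

# Export to XML only at the boundary where Flash and friends need it.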
In other words, solve the problem at hand first, then extend the solution. Baby steps.
My criteria for good software:
- Does it work?
- Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
Re: Why can code be so slow?
by talexb (Chancellor) on May 02, 2007 at 13:26 UTC
I'm coming late to the party, but I can report a performance improvement I accomplished yesterday under Apache 2.2. Using ModPerl::Registry, I reduced the turnaround time for a simple request from about 2.2 seconds to about 125ms. That's for a CGI::Application system accessing a MySQL database.
I'm pretty happy with that improvement.
Finally, upgrading your software can be quite an investment in your time, but you're more likely to be able to get help when you are using a more up to date version of software. Something to consider.
Alex / talexb / Toronto
"Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds
As soon as I am fully done, out of my feature freeze, and have fixed most bugs, I will probably move to Apache2 and further to mod_perl2; probably with Apache::Registry to start with (any reasons why I shouldn't?).
Still, the performance needs to improve now; it can't just be Apache 1.3 giving this piss-poor performance with Perl. I would be happier getting 16 or more hits/second instead of under 1 request/sec. ;)
If I got 16 req/sec now, I could probably get 10x that with mod_perl; so better to fix it now and then optimize even further with mod_perl, while checking my algorithms for any "hidden delays"...
#!/usr/bin/perl
use warnings;
use strict;
use Time::HiRes;

my $hr_log;

BEGIN {
    open($hr_log, ">", "/tmp/hrlog")
        or die "can't open hi-res log";

    sub hr_logger {
        my ($secs, $usecs) = Time::HiRes::gettimeofday();
        # Zero-pad the microseconds so 1.050000 doesn't print as 1.50000.
        printf $hr_log "%d.%06d: %s\n", $secs, $usecs, join("|", @_);
    }

    hr_logger("BEGIN");
}

# Obviously, you'll have all your modules here
use CGI;

# Put this just before you actually do any work in your code,
# i.e. after all your 'use' lines
hr_logger("Ready to run");

# This bit would be your app
print "Now get on with running the app\n";

# And put this before you exit, for completeness
hr_logger("finished");
exit 0;
which on my system produces the output:
1178122211.615440: BEGIN
1178122211.646999: Ready to run
1178122211.647111: finished
In my case above, the 'use CGI' time dominates, because I'm doing nothing in the app.
If you do the same, but sprinkle a few more calls to hr_logger in your code, then you should be able to work out what is slow. From what you've posted so far, it sounds as though you are parsing a lot of XML.
If you need to do this parse per-request, then mod_perl won't speed you up much (and you'll need to look into optimising your XML use or replacing it with something else). If you only need to do this parse on startup, then mod_perl (or FastCGI) would help you.
It's not really a case of getting this optimised and then getting another speed boost by moving to mod_perl. What mod_perl does is save you *startup costs*. Profiling and optimising your existing code will save your per-request costs. They're pretty independent, really, and you want to know which one is hurting you, since you don't want to waste time optimising the areas that aren't.
Re: Why can code be so slow?
by 2xlp (Sexton) on May 03, 2007 at 02:53 UTC
use mod_perl;
Probably the biggest hit you're getting is the overhead of the CPAN modules. The Template and DB-abstraction/RDBMS systems are all huge; File::Find is similarly big. You'll get a giant speedup from caching those in memory under some sort of persistent environment.
When you get to production, Apache is also pretty slow/bad. There are far more efficient options now.
My best advice would be to install nginx on port 80 to handle static content, and proxy all dynamic content to your mod_perl app on an alternate port.
Run mod_perl2 + Apache2 on port 8000. Use the registry module on mp2, not the mod_perl handler module -- you should have a fairly transparent migration.
Don't expect to get 16x more performance out of mod_perl -- mp will speed you up, but you're still going to suffer from performance bottlenecks at the db level.
Storing data in XML is a bad decision. You should store it in the DB in some sort of normalized format, then export it to XML or whatever you want as you need it. You're probably doing a lot more parsing and using a lot more disk space than you should.