Re: Perl cgi without mod_perl, your experience

Replies are listed 'Best First'.
Re: Perl cgi without mod_perl, your experience by Abigail-II (Bishop) on Jun 22, 2004 at 13:33 UTC
mod_perl is typically 20-40x faster than vanilla CGI. Uhm, that's a bit of useless statement. It may be true in some cases, but it isn't true in other cases. One should consider where the use of mod_perl gets its gains. It gets in gains because it can share some resources, which includes process space and compilation. Another important resource where it can save is sharing of database connections. So, if you have lots of little CGI programs, each doing a quick job, the use of mod_perl can save a lot of resources. If you have CGI programs doing a relatively long jobs (say you have some programs that do custom image manipulation and that process dwarves the time needed for compilation - or you are doing database queries and the time it takes for the queries is far more than starting the Perl program and setting up the database connection), the savings are minimal. So, I'd say the answer to the original question would be more like "NOT ENOUGH INFORMATION - DOES NOT COMPUTE". And my answer to the PHB's first question would be "You haven't heard the whole story - in some cases it will save big time. But if you give me a project code, we can run some tests to see how much it matters. BTW, extra hardware also means more reliability." Abigail	[reply]
Re^2: Perl cgi without mod_perl, your experience by tachyon (Chancellor) on Jun 22, 2004 at 14:18 UTC
While your point it technically valid it is statistically invalid. With the vast majority of interactive websites handled via CGI, mod_perl or something similar is the solution. To be technically correct one would say that you get benfits whenever the startup time (forking an interpreter, connecting to a DB) forms a significant portion of the total runtime. There are relatively few exceptions to this. Downloads and other streams plus long running processing are among those exceptions. It is not a case of some, it is a case of mostly BTW, extra hardware also means more reliability. Rubbish. Extra hardware actually increases the chances of a failure. Think about it..... If the mean time to failure is 700 days and you have 700 servers you will on average have one fall over every day. Extra hardware only provides uptime/reliability protection if you use that hardware to create redundant nodes with automatic failover and to be frank I don't think we are talking that level. If you use efficient code (mod_perl) included you may be able to afford that kind of infrastructure as boxes that would otherwise be working inefficiently can be made to do more work, freeing resources for redundancy. But even the simplest high availability system really needs 4 nodes - a pair out front to create your redundant load balancer and a pair behind to do the work/provide failover. Of course there a lots of other ways to skin that cat depending on how much downtime you can tolerate. Of course caning the hell out of your hardware does not help longevity ;-) cheers tachyon	[reply]
Re: Perl cgi without mod_perl, your experience by Abigail-II (Bishop) on Jun 22, 2004 at 14:49 UTC
Extra hardware actually increases the chances of a failure. Yes, but that's not of mosts peoples interest. It's like saying "I don't do backups, because that could mean that either my hard disk or my tape contains bad spots". While it may increase a failure, it reduces the chance of a critical failure, where a criticial failure means the service you are providing is no longer available (or only available at unacceptable performances). If the mean time to failure is 700 days and you have 700 servers you will on average have one fall over every day. If the mean time between failure is 700 days, and you have one server, you will be down once every 700 days. If you have 700 servers, you will be down every 37036335534589881919519745177905091061529367089546822435775456657617\ 43636878121352291779253462053983059009668861547217195682739117850118\ 35008240379192887792604500837043507056449661590126378834827343300415\ 51155924340365412561936621885141113576008432906355745321587893612547\ 92657179813327520180208828937231810950060232310658708592626955683634\ 89377559706408723518059008437790717245520601634447063767955926579796\ 52663793731051027728096621773894169469654930678654263045798895238772\ 34666615299867665848656245124536507750920588975484100300349256862746\ 40081407312113263209011491753853770009409642000100000000000000000000\ 00000000000000000000000000000000000000000000000000000000000000000000\ 00000000000000000000000000000000000000000000000000000000000000000000\ 00000000000000000000000000000000000000000000000000000000000000000000\ 00000000000000000000000000000000000000000000000000000000000000000000\ 00000000000000000000000000000000000000000000000000000000000000000000\ 00000000000000000000000000000000000000000000000000000000000000000000\ 00000000000000000000000000000000000000000000000000000000000000000000\ 00000000000000000000000000000000000000000000000000000000000000000000\ 00000000000000000000000000000000000000000000000000000000000000000000\ 00000000000000000000000000000000000000000000000000000000000000000000\ 00000000000000000000000000000000000000000000000000000000000000000000\ 00000000000000000000000000000000000000000000000000000000000000000000\ 00000000000000000000000000000000000000000000000000000000000000000000\ 00000000000000000000000000000000000000000000000000000000000000000000\ 00000000000000000000000000000000000000000000000000000000000000000000\ 00000000000000000000000000000000000000000000000000000000000000000000\ 00000000000000000000000000000000000000000000000000000000000000000000\ 00000000000000000000000000000000000000000000000000000000000000000000\ 00000000000000000000000000000000000000000000000000000000000000000000\ 00000000000000000000000000000000000000000000000000000000000000000000\ 00000000000000000000 [download] days. Or if your mean time between failure is 700 days, and it takes a day to recover from a failure, with only two servers you will be down once every 1342 years. Redundant servers work in parallel, and not in a serial configuration. But even the simplest high availability system really needs 4 nodes - a pair out front to create your redundant load balancer and a pair behind to do the work/provide failover. High availability systems don't need load balancers. It's a high availability system - not a load balancing system. All the high availability systems I've worked with, HP's ServiceGuard, Veritas Cluster, SUN Cluster work fine with 2 nodes. Of course there a lots of other ways to skin that cat depending on how much downtime you can tolerate. Oh, yeah, but if you can tolerate downtime, you may be able to tolerate slower service. `;-)`	[reply] [d/l]
Re^2: Perl cgi without mod_perl, your experience by kscaldef (Pilgrim) on Jun 22, 2004 at 17:18 UTC
Re: Perl cgi without mod_perl, your experience by Abigail-II (Bishop) on Jun 22, 2004 at 21:12 UTC
Re^2: Perl cgi without mod_perl, your experience by stvn (Monsignor) on Jun 22, 2004 at 14:18 UTC
If you have CGI programs doing a relatively long jobs (say you have some programs that do custom image manipulation and that process dwarves the time needed for compilation - or you are doing database queries and the time it takes for the queries is far more than starting the Perl program and setting up the database connection), the savings are minimal. While not a silver bullet by any means, the Apache Cleanup handler can be very nice. It is essentially the very last phase the the Apache request cycle, and is actually after the last of the headers have been sent to the client (after the request is over form the users perspective). `$r->register_cleanup(\&my_long_running_sub);` [download] I (ab)use it to generate very large DB-query intensive PDFs on several sites. The hijacked process itself stores its progress in a database, and marks a flag when the PDF is done. All the while the users page has been auto-refreshing at a reasonable interval (and tying up a second apache child :-P ), and once the PDF is done and the flag has been set, they can download it. Sure this can get tricky, since the apache process in a sense becomes "headless" for a while, but with proper exception handling and careful use of `alarm` you can avoid most of the issues that might come up. I would argue too, that this approach is actually more effieient since you will save the cost of module loading and have the benefits of database connection pools and other mod_perl goodies at your disposal. -stvn	[reply] [d/l] [select]
Re^3: Perl cgi without mod_perl, your experience by salvix (Pilgrim) on Jun 22, 2004 at 16:22 UTC
And don't forget to mention that using a reverse proxy will keep your server with a minimal number of mod_perl processes running, i.e., setup an Apache on port 80 (with mod_proxy activated) and redirect every *.pl requests (or all /cgi-bin request, for example) to your mod_perl enabled Apache on another port (81, for instance). You'll be able to handle even more scripts per second because you will keep the mod_perl server busy only to process the request, and the transmission of the output will be a task for your light Apache on port 80, freeing it to handle another script request. HTH.	[reply]
Re^3: Perl cgi without mod_perl, your experience by perrin (Chancellor) on Jun 22, 2004 at 19:31 UTC
That's actually not the best way to do it. Ideally, you would fork so that your processing does not tie up an apache child process at all. This is how we recommend handling long running jobs on the mod_perl list.	[reply]
Re^4: Perl cgi without mod_perl, your experience by stvn (Monsignor) on Jun 22, 2004 at 20:26 UTC
Re^5: Perl cgi without mod_perl, your experience by perrin (Chancellor) on Jun 22, 2004 at 21:30 UTC


good chemistry is complicated, and a little bit messy -LW
	PerlMonks