Heffstar has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,

I'm looking for some wisdom and insight into Apache 1.3 and mod_perl.

My boss and I got into a discussion about proper web server setup and how mod_perl is used, and neither of us is very savvy when it comes to web admin.

Here's the scenario:

Apache 1.3
MinSpareServers 10
MaxSpareServers 20
StartServers 20
MaxRequestsPerChild 10000
MaxClients 500

We're running into a problem where we've got 20 Apache child processes running which keep getting larger and larger as Perl outputs our dynamic pages; all of the code is stored in RAM allocated to each Apache process.

We do a lot of serving of PDFs and the way it's done (I don't have the code in front of me, so forgive me) is something like:

 while (<FILE>) { print $_; }

Would the full contents of the file be held in memory along with the compiled code?
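
For reference, I imagine a more memory-friendly version would read the file in fixed-size chunks rather than line by line, something like the sketch below (the file name and chunk size are just placeholders):

 # Sketch: stream the PDF in fixed-size chunks so only one buffer's
 # worth of data is in memory at a time.  Path and chunk size made up.
 open my $fh, '<', '/path/to/file.pdf' or die "open: $!";
 binmode $fh;      # PDFs are binary data
 binmode STDOUT;
 my $buf;
 while (my $bytes = read($fh, $buf, 64 * 1024)) {
     print $buf;
 }
 close $fh;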

Also, any input on the Apache directives would be appreciated.

Thanks and all the best!

Re: Perl and Apache 1.3
by CountZero (Bishop) on Oct 10, 2009 at 06:09 UTC
    If those PDF files are "static", i.e. they are not being generated "on the spot" for every request, it is advisable to let Apache serve them directly rather than going through a Perl script.

    Also, if possible, switch to Apache 2 and mod_perl 2, which allow you much more flexibility in configuring your application.

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

      In regard to letting Apache serve the files rather than having the script output them, I'm pretty sure the reason we're doing it this way is that we're trying to keep certain files secure and unavailable to other users.

      If there's an easy way to do this and still have security, please let me know. All of our users access our database and files as a generic web user, not as themselves, if that answers a future question...

        It's been a long time since I used Apache 1.3 (I long since switched to Apache 2), but IIRC even Apache 1.3 could use a basic authentication and authorisation system which can distinguish between users and the files they have access to.
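
        A minimal sketch of what that could look like with mod_auth in Apache 1.3 (the directory, realm name and password-file path are just placeholders):

          # Let Apache serve the PDFs itself, but require a login first.
          # Create the password file with: htpasswd -c /etc/httpd/conf/htpasswd someuser
          <Directory /var/www/protected-pdfs>
              AuthType Basic
              AuthName "Restricted PDFs"
              AuthUserFile /etc/httpd/conf/htpasswd
              Require valid-user
          </Directory>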

        Actually, if you allow your users to run a CGI script which serves the file, how is that different from allowing them to access the file directly? Or does the script itself run an authentication/authorisation scheme?

        CountZero

        "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

Re: Perl and Apache 1.3
by Herkum (Parson) on Oct 10, 2009 at 03:19 UTC

    Because there are no hard numbers here, all we can really do is give you some general rules to watch out for.

    In this case, the main trade-off is the cost of having a child which owns a large hunk of memory versus the cost of having to reload that child every once in a while. If the child slows down the system (for example, because it pushes the machine into swapping), then you should reduce MaxRequestsPerChild until that stops being an issue.

Re: Perl and Apache 1.3
by trwww (Priest) on Oct 11, 2009 at 00:25 UTC

    We're running into a problem where we've got 20 Apache child processes running which keep getting larger and larger as Perl outputs our dynamic pages...

    Then you've got a memory leak. This usually happens in Perl when there is a circular reference to a variable, so it doesn't get properly refcounted and freed. It can also happen when there is a global variable that keeps getting data added to it (imagine an array that gets data pushed onto it on every request).
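
    A contrived example of the circular-reference case, and one way to break the cycle with Scalar::Util::weaken:

      use Scalar::Util qw(weaken);

      # Contrived leak: the two structures point at each other, so their
      # reference counts never drop to zero when the request finishes.
      my $parent = { name => 'parent' };
      my $child  = { name => 'child', parent => $parent };
      $parent->{child} = $child;        # cycle: parent <-> child

      # One fix: make one link a weak reference so the pair can be freed
      # once the last "real" reference to it goes out of scope.
      weaken($child->{parent});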

    As a quick fix, lower the MaxRequestsPerChild to a low number. Maybe 100... or even 10. Otherwise, you're going to have to find the memory leak.

    In general, if all of a program's variables are properly scoped then the memory footprint of the program will not continually grow. It will quickly grow to the amount of memory it needs to perform its task, and then level off.

    all of the code is stored in RAM which is allocated to each apache process

    If you load code and data into mod_perl before Apache forks its children, then that memory will be shared. How big are the files that make up the application? I can't imagine this part being too much of an issue.
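
    For what it's worth, the usual mod_perl 1 way to get that sharing is to preload the big modules in a startup file that runs in the parent before the fork. A sketch (the paths and module list are just placeholders):

      # httpd.conf: run the startup file in the parent process, so the
      # compiled code is shared copy-on-write by all the children
      PerlRequire /etc/httpd/conf/startup.pl

      # startup.pl: preload anything large that every child will need
      use strict;
      use lib '/path/to/our/app/lib';   # placeholder
      use DBI ();
      use CGI ();
      CGI->compile(':all');             # precompile CGI.pm's autoloaded methods
      1;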

    Would the size of the file be stored in memory along with the compiled code?

    How big are the .pdf files? If they are particularly large, you may want to figure out a way for the web app to hand off the .pdf generation to a different, short-lived process that runs in its own memory space and returns the memory to the OS when it finishes. After it runs, do an HTTP redirect so Apache serves the file directly.
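
    A rough sketch of that hand-off (the generator script, its options and the cache directory are all made up for illustration):

      use CGI ();

      # Run the PDF generator as a separate short-lived process, so its
      # memory goes back to the OS as soon as it exits.
      my $job_id = time() . ".$$";
      system('/usr/local/bin/make_report_pdf',
             '--out', "/var/www/pdf-cache/$job_id.pdf") == 0
          or die "PDF generation failed: $?";

      # Then redirect so Apache serves the finished file directly.
      print CGI->new->redirect("/pdf-cache/$job_id.pdf");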

    But you definitely have a memory leak if the child httpds are continually growing and growing in memory size.

      Well, I should be more specific: these processes get larger only when they have CPU activity associated with them. That would mean more compiled code is being loaded into the memory associated with the httpd process, right?

      Also, the associated PDFs are not huge (max 10MB) and the scripts that generate our application are usually around half a MB, but we have one that's about 10MB.

      Since we're talking a lot of Apache configuration, I'll pose a couple of related questions:

      MaxRequestsPerChild, when reached, causes the httpd process to terminate, and if the load is high enough another httpd process will start up, correct?

      Does anyone have any idea how long it would normally take to start up a new process? My boss seems to think that it's up to 20 seconds and that starting a new child is a VERY expensive operation.

      This server is a fairly decent machine though: 3GHz Xeon Dual Core, 4GB. It runs both Apache and MySQL for a few hundred users, but from what I've read, people who know configuration can get away with much, MUCH less...

        Basically wrapping this one up, should anyone read this thread down the road...

        I've set the above settings to the following:
        MinSpareServers 5
        MaxSpareServers 10
        StartServers 8
        MaxRequestsPerChild 2500
        MaxClients 100

        The biggest change I made was turning KeepAlive off. What I found was that with KeepAlive turned on, a child would have to reach MaxKeepAliveRequests multiplied by MaxRequestsPerChild (2000 * 10000 = 20,000,000) requests before it was killed off and its memory released. Just a tad high. With MaxRequestsPerChild set at 2500, my process size reaches ~100MB and then the child dies off.

        Also, turning KeepAlive off allows a server process to immediately serve another client rather than sit around waiting for KeepAliveTimeout seconds.
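
        Pulled together, the relevant httpd.conf section now looks roughly like this:

          # Tuned prefork / keep-alive settings
          KeepAlive           Off
          MinSpareServers     5
          MaxSpareServers     10
          StartServers        8
          MaxRequestsPerChild 2500
          MaxClients          100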

        Thanks for the help monks!