michellem has asked for the wisdom of the Perl Monks concerning the following question:

I've got a fairly complex CGI script - a search engine that searches both HTML files in a directory and tables in a database. In any event, it turns out it takes up a lot of server memory when it runs, and I'm looking for ways to make it less of a hog. Are there easy ways to "nice" CGI scripts, or do I really need to go through the code and find better ways to do what I need done?

Re: "niceing" a CGI script
by AgentM (Curate) on Mar 13, 2001 at 06:14 UTC
    In fact, if you want to limit the amount of resources your script eats, you'd want to use nice, but considering that you probably want to SPEED UP the script, you have a few options:
    • migrate the entire search mechanism to the database
    • unbuffer the script's output through the server, so the client gets something while waiting for the next results (gives the illusion of speed; see the sketch after this list)
    • use ReiserFS for that FS speed boost
    • use smart cacheing, perhaps the easiest option to implement
    • use the unbuffering I mentioned above in conjunction with Bone::Easy to insult the user away from waiting for further results
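
    For the unbuffering item, here's a minimal sketch. find_matches() is a hypothetical stand-in for your real search routine; the point is just that $| = 1 flushes each print immediately, so the client sees hits as they're found (the web server itself may still buffer, depending on its configuration):

        #!/usr/bin/perl -w
        use strict;
        use CGI;

        # Disable output buffering so each chunk of HTML reaches the
        # client as soon as it is printed.
        $| = 1;

        # Hypothetical stand-in for the real search routine.
        sub find_matches { return ('first hit', 'second hit') }

        my $q = CGI->new;
        print $q->header('text/html');
        print "<html><body><h1>Search results</h1>\n";

        # Print each hit as it is found instead of collecting
        # everything before printing.
        for my $hit (find_matches()) {
            print "<p>$hit</p>\n";
        }
        print "</body></html>\n";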

    Good luck!

    AgentM Systems nor Nasca Enterprises nor Bone::Easy nor Macperl is responsible for the comments made by AgentM. Remember, you can build any logical system with NOR.
Re: "niceing" a CGI script
by footpad (Abbot) on Mar 13, 2001 at 10:47 UTC

    Update: Clarified and fixed typos.

    Without more of an idea of what your script actually does, there's little to go on, save the ol' crystal ball method. Based on that, here's one idea out of left field.

    AgentM alluded to this, but I thought I'd spell it out a bit more clearly. You might also take a look at the work your script is doing and, if possible, off-load as much of it as you can to other processes, especially if you're preparing the index before searching it.

    For example, some search engines manually go through each file when invoked. This is not a great idea unless you only have a few documents; it bogs down the system and wastes cycles repeating work that doesn't need to be repeated. If your material is highly dynamic (it changes often), that may be different (depending on the way you've implemented it), so I'll keep it simple and assume that it's mostly static...

    In your case, you should be able to come up with an indexing script that creates a universal index of your HTML documents and your database tables. You can rebuild the index manually by running the script from the command prompt, or you can have it rebuilt automatically through a cron job (which is generally advisable, as it means you don't have to remember to run the indexer after adding content).
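
    To make that concrete, here's a rough sketch of such an indexer. The paths, the three-letter word cutoff, and the use of Storable are all assumptions for illustration:

        #!/usr/bin/perl -w
        # build_index.pl - run from cron (or by hand), never per request.
        use strict;
        use File::Find;
        use Storable qw(nstore);

        my %index;    # word => { filename => match count }

        find(sub {
            return unless /\.html?$/i;
            my $path = $File::Find::name;
            open my $fh, '<', $_ or return;
            local $_;                 # don't clobber File::Find's $_
            while (<$fh>) {
                s/<[^>]*>//g;         # crude tag stripping
                $index{lc $1}{$path}++ while /(\w{3,})/g;
            }
        }, '/var/www/docs');

        # Rows from the database tables would be fetched and merged
        # into %index here in the same way.

        nstore(\%index, '/var/www/data/meta-index.db');

    A crontab entry like "0 3 * * * /usr/local/bin/build_index.pl" would then rebuild it nightly.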

    Your CGI users then run a second script that searches this "meta-index" and prepares the hit list, if any, from the data stored there.
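
    The CGI side then does nothing per request but a hash lookup against the prebuilt index. Again, the paths and the parameter name are hypothetical:

        #!/usr/bin/perl -w
        use strict;
        use CGI;
        use Storable qw(retrieve);

        my $q     = CGI->new;
        my $term  = lc($q->param('q') || '');
        my $index = retrieve('/var/www/data/meta-index.db');

        print $q->header('text/html'), "<html><body>\n";
        if (my $hits = $index->{$term}) {
            # Best matches (highest count) first.
            for my $file (sort { $hits->{$b} <=> $hits->{$a} } keys %$hits) {
                printf "<p>%s (%d matches)</p>\n",
                    $q->escapeHTML($file), $hits->{$file};
            }
        }
        else {
            printf "<p>No matches for '%s'.</p>\n", $q->escapeHTML($term);
        }
        print "</body></html>\n";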

    In general, it's always advisable to review your code for elements that can be streamlined, though that's not always possible. Even if you're working under a deadline, there's generally time to go back (after delivery) and tweak things later. In my experience, users generally want changes of some form, and those give you the opportunity to clean things up a bit, if you're careful.

    --f

Re: "niceing" a CGI script
by dash2 (Hermit) on Mar 13, 2001 at 07:30 UTC
    One day, you'll be able just to
    use less "memory";

    ;-)

    Until then, the Benchmark module will help you work out what is slowing things down.
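
    For instance, a quick comparison of slurping a document at once versus reading it line by line ('bigfile.html' is a stand-in for one of the files being searched):

        use strict;
        use Benchmark qw(cmpthese);

        cmpthese(-3, {    # run each sub for about 3 CPU seconds
            slurp => sub {
                open my $fh, '<', 'bigfile.html' or die $!;
                local $/;             # read the whole file at once
                my $text = <$fh>;
            },
            line_by_line => sub {
                open my $fh, '<', 'bigfile.html' or die $!;
                while (<$fh>) { }     # read a line at a time
            },
        });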

    dave hj~

Re: "niceing" a CGI script
by tomhukins (Curate) on Mar 13, 2001 at 16:23 UTC

    If your OS supports it, you could use setpriority from BSD::Resource.
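
    Something like this near the top of the script, assuming BSD::Resource is installed (the nice value of 10 is just an example):

        use BSD::Resource qw(setpriority PRIO_PROCESS);

        # Lower this process's scheduling priority: a positive value
        # is "nicer", i.e. it yields the CPU to other work. The 0
        # means "the current process".
        setpriority(PRIO_PROCESS, 0, 10)
            or warn "setpriority failed: $!";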

Re: "niceing" a CGI script
by Jonathan (Curate) on Mar 13, 2001 at 16:45 UTC
    If you haven't already, you may want to review your code for any performance/greed gotchas. The Camel has a useful section on efficiency. In my experience, 'nice'ing jobs just prolongs the hit, especially if there are database connections, where prolonged locks can badly affect your concurrency.