JimDuyer has asked for the wisdom of the Perl Monks concerning the following question:

Hello fellow Perl worshipers. I have followed the helpful advice of many here and created a script that works well. The good news is that I have 6000 unique users per month now, and the script has been working fine for three months. The bad news is that the number of users is doubling every month or so. I already use a timer system that only allows one use every 30 seconds from any one IP, plus flock, strict, and a trimmed-down script, but I'm getting concerned for the future.

My question is this: would it help to split my script into one main script that handles the form input and checks it for errors/abuse, and two or three other new scripts that handle other parts of the task, using require otherscript.pl for example? Does splitting the tasks out (there are three main ones) help keep it working while it grows?

I don't use any SQL or many modules; it basically takes in text input, arranges it, and outputs GIF images that equate to the text. (It's an English-Mayan Glyph translator online at http://www.event12.com) However, I have some upgrades coming and will be using GD and GD::Text in the new version instead of serving up images, and I will change from an alphabet of 26 characters to an 84-character syllabary, so GD will probably add some load to the mix. Help please? Suggestions are most welcome and usually implemented.

Replies are listed 'Best First'.
Re: Good News/Bad ?
by moritz (Cardinal) on May 14, 2011 at 18:09 UTC

    Splitting your script only helps if the startup time of the script hurts, and splitting it means that you need to load fewer modules in each script.

    Here are some general tips for speeding up CGI scripts:

    • Use a profiler to identify the slow parts; that way you don't waste your time speeding up things that are already reasonably fast (which you will do without profiling)
    • Cache whatever you can and avoid recomputation. Cache::Cache is easy to use; for larger scale, memcached might be interesting
    • Save startup time by using persistent processes (FastCGI/mod_perl/others)
    • Generate some pages statically (that's a form of caching too)
    • If you use a relational database, check if all necessary indexes are in place
    • Place some ads on your pages and use the income to pay for better hardware
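
The caching tip above can be sketched with core modules only. This is a minimal illustration, not the original script: the cache directory, the `cached_render` helper, and the stand-in renderer are all hypothetical, and the input text is assumed to be a byte string. Under vanilla CGI an in-memory hash would be lost after every request, so the sketch stores results on disk, keyed by a digest of the input.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use File::Path qw(make_path);
use File::Spec;

# Hypothetical cache location; any directory writable by the web user would do.
my $CACHE_DIR = File::Spec->catdir(File::Spec->tmpdir, 'glyph_cache');

# Return the cached bytes for $text, or call $render->($text) once and store the result.
sub cached_render {
    my ($text, $render) = @_;
    make_path($CACHE_DIR) unless -d $CACHE_DIR;
    my $file = File::Spec->catfile($CACHE_DIR, md5_hex($text) . '.bin');

    if (-e $file) {
        open my $in, '<:raw', $file or die "Cannot read $file: $!";
        local $/;                      # slurp the whole file
        my $bytes = <$in>;
        close $in;
        return $bytes;
    }

    my $bytes = $render->($text);

    # Write to a temporary name, then rename, so readers never see a partial file.
    my $tmp = "$file.$$";
    open my $out, '>:raw', $tmp or die "Cannot write $tmp: $!";
    print {$out} $bytes;
    close $out or die "Cannot close $tmp: $!";
    rename $tmp, $file or die "Cannot rename $tmp: $!";
    return $bytes;
}

# Stand-in for the expensive step (image generation in the real script).
my $image = cached_render('hello', sub { "IMAGE-BYTES-FOR:$_[0]" });
print length($image), " bytes\n";
```

A repeated request for the same text then costs one file read instead of a full render. Old cache files are never expired in this sketch; Cache::Cache handles expiry for you.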
Re: Good News/Bad ?
by davido (Cardinal) on May 14, 2011 at 17:29 UTC

    This is a CGI script?

    mod_perl is good for high volume sites because it doesn't require the Perl interpreter to restart with each new request. However, there is some difference in how you write scripts that target mod_perl. In particular, variables should be lexically scoped. But that's not the only difference. You've got a little time before things get critical; get a good book on mod_perl and wade through it. You'll find that if you programmed your script thoughtfully the conversion may not be too difficult.
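
One way to picture the scoping point above: in a persistent process the same interpreter serves many requests, so package globals keep their values from one request to the next. The sketch below only simulates that by calling a handler twice in one process; `handle_leaky`, `handle_clean`, and `@seen` are hypothetical names, not mod_perl API.

```perl
use strict;
use warnings;

our @seen;    # package global: survives between requests in a persistent process

# Leaky version: each request appends to data left over from earlier requests.
sub handle_leaky {
    my ($text) = @_;
    push @seen, split //, $text;
    return scalar @seen;
}

# Clean version: a lexical is created fresh on every call.
sub handle_clean {
    my ($text) = @_;
    my @chars = split //, $text;
    return scalar @chars;
}

# Simulate two identical requests hitting the same long-lived interpreter.
print handle_leaky('abc'), "\n";    # 3
print handle_leaky('abc'), "\n";    # 6, stale data from the first request
print handle_clean('abc'), "\n";    # 3
print handle_clean('abc'), "\n";    # 3
```

Under plain CGI the leaky version looks fine because every request starts a new process, which is why such bugs tend to surface only after the move to a persistent environment.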

    If there is database access, there is a module in Apache2 that can handle your database requests in a persistent process too. I'm not sure how good Perl's support is for this feature, but it's available.

    Be sure to read the Apache documentation with respect to performance tuning. Likewise for database tuning if you're using database access.

    And then profile your script. See if there are any bottlenecks to address.

    Breaking it up into smaller chunks probably won't make a huge impact unless you're dealing with many thousands of lines of code. I would seek other bottlenecks first.


    Dave

      Thanks for those tips. I will get a book on mod_perl and see what I need to do. There is no database, so that part is good. I use a hash table that I worked up instead, as there are only 92 or so tr's needed. I don't know how much time I have, as I almost have the syllabary done. I'll be using GD to write the text, using a TTF font of my own making. That will cut out the image loading, but really it runs quickly as it is. I just had to use GD because I will be using the GD::Text::Arc module as well. I finally got it to work with my font, after pulling out some hair... Thanks for the tip on not breaking it up. Can you tell me how best to profile the script to check for bottlenecks? By printing to the screen where it is from time to time? Sorry, still a new kid at Perl.
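
As a rough answer to the profiling question, printing timestamps by hand can be tidied into a small helper built on the core Time::HiRes module. This is only a sketch: the `timed` helper and the three stages are hypothetical stand-ins for the real script's steps. A dedicated profiler such as Devel::NYTProf from CPAN (run as `perl -d:NYTProf script.pl`, then `nytprofhtml`) gives per-line detail without touching the code, assuming it can be installed on the host.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

my %elapsed;    # label => accumulated wall-clock seconds

# Run $code, add its wall-clock time to $elapsed{$label}, and pass its result through.
sub timed {
    my ($label, $code) = @_;
    my $start  = time();
    my @result = $code->();
    $elapsed{$label} += time() - $start;
    return wantarray ? @result : $result[0];
}

# Hypothetical stages standing in for parse / lookup / render.
my $input  = timed(parse  => sub { lc 'Hello World' });
my @glyphs = timed(lookup => sub { split //, $input });
my $image  = timed(render => sub { join '-', @glyphs });

# Report to STDERR so the timings end up in the server error log, not the page.
printf STDERR "%-8s %.6f s\n", $_, $elapsed{$_} for sort keys %elapsed;
```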
        You'll find the books on mod_perl are out of date and it's hard to get started with the current version.

        However, everything I've seen recently says that mod_perl is deprecated in favor of FastCGI, with PSGI the new heir apparent.

        So go straight to PSGI.
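
For a sense of what going to PSGI involves: a PSGI application is just a code reference that takes an environment hash and returns a three-element array of status, headers, and body. The sketch below is a minimal example under assumptions; the `translate` function and the `text` query parameter are hypothetical, and the query parsing is deliberately crude.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real translator.
sub translate {
    my ($text) = @_;
    return uc $text;
}

my $app = sub {
    my ($env) = @_;

    # Crude extraction of the "text" parameter from the query string.
    my ($text) = ($env->{QUERY_STRING} // '') =~ /(?:^|&)text=([^&]*)/;
    $text = '' unless defined $text;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr hex $1/ge;

    my $body = translate($text) . "\n";
    return [
        200,
        [ 'Content-Type' => 'text/plain', 'Content-Length' => length $body ],
        [ $body ],
    ];
};

# A .psgi file must evaluate to the application code reference.
$app;
```

Saved as app.psgi, this can be served with `plackup app.psgi` if Plack is installed. The same file can then run under plain CGI, FastCGI, or a standalone server by changing only the handler.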

Re: Good News/Bad ?
by chromatic (Archbishop) on May 14, 2011 at 17:25 UTC

    What's your execution model? Vanilla CGI? mod_perl? FastCGI?

    What kind of hosting do you have?

    How has your resource usage scaled with the number of users?

      Thanks for replying. I use Hawk Host: shared hosting, Linux, plain vanilla CGI Perl. The script is 500 lines. I have never used or tried to use FastCGI or mod_perl before; I will get a book on them and see how they work, I guess.
Re: Good News/Bad ?
by Anonymous Monk on May 15, 2011 at 01:06 UTC
    I'd say you can go a long time yet without becoming too worried. 6,000 users per minute might be a little different...