Speeding up commercial Web applications

by PotPieMan (Hermit)
on May 08, 2003 at 01:56 UTC

PotPieMan has asked for the wisdom of the Perl Monks concerning the following question:

I have been assigned the unfortunate task of improving the performance of a commercial Perl application. We have the source to this application, and are free to make modifications to it. However, we do not want to make very many major modifications, in hopes of being able to upgrade to new versions of the product in the future.

To give you a general idea of the issue, the application takes approximately 10 seconds to generate each page. Between 3 and 5 seconds is an "acceptable" load time, so there's a lot of work to do. I've made optimizations where possible, trying to improve the application's use of the database, but I've only shaved off a few milliseconds here and there. Again, I've limited myself because I want to be able to upgrade later.

The overriding problem is that the application feels like it was written for Perl 4. By that I mean that there are lots of global variables, heavy use of local, and definitely no use strict. That makes running the application in mod_perl fairly difficult, if not impossible. Apache::Registry definitely seems to be out. Apache::PerlRun looks like my best bet. I've run across the thread Good place to learn Apache::PerlRun, but given that the application feels like Perl 4 code, I am not very hopeful.

Finally, FastCGI and SpeedyCGI are probably out of the question because our systems group won't support either.

Do you think I have any options? I would prefer not to have to recommend a new hardware purchase.

Thanks,
-Daniel

Replies are listed 'Best First'.
Re: Speeding up commercial Web applications
by tachyon (Chancellor) on May 08, 2003 at 04:00 UTC

    6-7 seconds of processing is an enormous amount of time unless your machine is dog slow. First have a look with top to see if you are running out of RAM and dipping into swap space; that will really kill you for speed. If so, add more RAM (it's cheaper than hacking time). Perl loves RAM, and so do DBs for that matter. You really can't throw too much RAM at either. The more RAM a DB has available, the more data and indexes it will cache. Most of my servers have at least 2GB.

    You must be doing a lot of post-processing on the DB output to take 6-7 seconds. Don't. Sorting, grouping, joining, and selecting are all tasks that should be done at the DB level (fast, in C), not the application level (slower, in Perl). This of course may entail major recoding. If you are only displaying X results, then add a LIMIT clause so you only get that many rows - quicker to fetch and less to post-process, for a minor change.
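    Something like this shows the idea (an untested sketch; $dbh is assumed to be a connected DBI handle, and the table and column names are made up):

    # Fetch only the 20 rows we will actually display, already sorted
    # by the database, instead of pulling the whole table into Perl.
    my $sth = $dbh->prepare(q{
        SELECT id, title, created
        FROM   articles
        ORDER  BY created DESC
        LIMIT  20
    });
    $sth->execute;
    while ( my ($id, $title, $created) = $sth->fetchrow_array ) {
        # render one row...
    }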

    You must have a lot of loops to be consuming this much time. Rewrite them either to exit as soon as they can, i.e.:

    # slow
    print "Found" if grep { /$find/ } @list;

    # always faster and uses less memory
    for (@list) {
        do { print "Found"; last } if /$find/;
    }

    or preferably do it at DB level as noted.

    On the subject of loops, you can trade memory for speed by building hash lookup tables rather than iterating over lists.
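    For instance (untested sketch), where the test is for exact matches rather than a pattern, a lookup table built once beats grepping the list on every check:

    # Build the lookup table once...
    my %in_list = map { $_ => 1 } @list;

    # ...then each membership test is a constant-time hash lookup
    # rather than a linear scan of @list.
    print "Found" if $in_list{$find};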

    If you are doing sorts that can't be done on the DB with an ORDER BY clause, you may need a Schwartzian Transform, depending on the application.
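    For reference, the classic Schwartzian Transform looks like this (sketch; expensive_key() stands in for whatever costly derivation drives the sort):

    # Compute the sort key once per element, not once per comparison.
    my @sorted =
        map  { $_->[1] }
        sort { $a->[0] <=> $b->[0] }
        map  { [ expensive_key($_), $_ ] } @records;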

    If you are calling external processes or including dynamic content, you will benefit from caching results that are effectively static, updating them only as required, and serving those.
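    A crude file cache along those lines might look like this (untested sketch; build_fragment(), the path, and the ten-minute lifetime are all placeholders):

    use strict;

    my $cache_file = '/tmp/fragment.cache';
    my $max_age    = 10 * 60;    # seconds

    my $fragment;
    if ( -e $cache_file && ( time - (stat $cache_file)[9] ) < $max_age ) {
        # Cached copy is fresh enough - reuse it.
        open my $fh, '<', $cache_file or die "open: $!";
        local $/;
        $fragment = <$fh>;
    }
    else {
        # Rebuild and refresh the cache.
        $fragment = build_fragment();    # hypothetical expensive step
        open my $fh, '>', $cache_file or die "open: $!";
        print $fh $fragment;
    }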

    PerlRun is an option, but all it really does is save a bit of time on compile and load, which is apparently not your issue given the time breakdown. It will help, and it is the simplest option if the code will run OK under it.
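    For reference, the httpd.conf stanza for running unmodified CGI scripts under Apache::PerlRun is roughly this (the alias and script path are placeholders for your own setup):

    Alias /cgi-perl/ /usr/local/apache/cgi-bin/
    PerlModule Apache::PerlRun
    <Location /cgi-perl>
        SetHandler perl-script
        PerlHandler Apache::PerlRun
        Options +ExecCGI
        PerlSendHeader On
    </Location>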

    cheers

    tachyon

    s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

      Though I agree about the RAM thing, using DBMS functions to process the data is not always faster. I've found SELECT DISTINCT to be, on occasion, many times slower than feeding the output of a simple SELECT into a hash.

      This may have been an unusual case. I was retrieving distinct values of a single column from a table with about 125,000 rows from Empress, which isn't that much of a mainstream system. Still, I seem to remember that using the Perl hash was more than 20x faster than letting Empress do the SELECT DISTINCT thing.
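      The hash approach, roughly (untested; $dbh and the table/column names are stand-ins for the real ones):

      # Plain SELECT, letting Perl weed out the duplicates.
      my %seen;
      my $sth = $dbh->prepare('SELECT category FROM items');
      $sth->execute;
      while ( my ($val) = $sth->fetchrow_array ) {
          $seen{$val} = 1;
      }
      my @distinct = keys %seen;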

      --
      bowling trophy thieves, die!

Re: Speeding up commercial Web applications
by Abigail-II (Bishop) on May 08, 2003 at 02:10 UTC
    Without knowing anything about the application, it's very hard to say what to do. If of the 10 seconds it takes, it spends 9.5 in the database, don't bother speeding up the code - speed up your database. If it spends a lot of time doing I/O, get faster disks, a better controller, or more memory. If you are sure it's the code, find the bottleneck(s), and rewrite that, perhaps in C.

    Or buy a better product.

    Abigail

      I apologize for not describing the profile. Out of the ~10 seconds, the breakdown is something like the following:
      • 1 second on startup - database connection, etc.
      • 1-2 seconds on database work
      • 6-7 seconds on "processing"

      It's the "processing" step that causes so much trouble, since we can't make major modifications to the code.

      And we have definitely thought about buying a better product, or writing one in house. Ugh.

        You can probably get it to work under PerlRun. That would completely eliminate the startup time. It won't help with the processing time though. For that, you really have to profile things and change code. The only generic piece of advice I can offer is to run perl 5.6.1 compiled with no threads, since it seems to have better performance than later perl versions.

        Incidentally, FastCGI and SpeedyCGI have the same limitations as mod_perl when running poorly written code. PerlRun does a better job than either of them at trying to make things work.

        Well, then you can save at most 1 second by changing the program to be both mod_perl (or something similar) aware, and by using some database connection pooling mechanism. You might be able to save a little on the database work by using different indices or a different layout. But that also saves you at most 2 seconds.

        The majority of the possible savings will be from changing the code. But you say you can't make major modifications. There isn't a magical wand you can wave that instantly makes all programs faster. So, I'd say, your task is impossible.

        Abigail

Re: Speeding up commercial Web applications
by BrowserUk (Patriarch) on May 08, 2003 at 02:29 UTC

    Profile. See Devel::DProf. Concentrate your efforts at the bottlenecks. If you find slow bits that you can't see how to improve, post them.
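    If you haven't run it before, a Devel::DProf session is roughly this (the script name is a placeholder):

    perl -d:DProf yourscript.pl    # writes tmon.out in the current directory
    dprofpp tmon.out               # reports subroutines sorted by time spent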

    If you're reading files, templates, and the like from disk, create a RAM drive and load them from there.

    NB: This is highly speculative! If there are truly lots of globals in use, you might get a little from making them lexicals by declaring them at the top of each file using our, assuming you're using a version of perl that supports this.

    Beyond that, you're the one who can see the code. What's it doing that takes so long? PROFILE :)


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
      you might get a little from making them lexicals by declaring them at the top of each file using our

      It is my understanding from this doc that our creates globals. The perl561delta.pod document mentions that they "... can be best understood as a lexically scoped symbolic alias to a global variable" (which I must confess did not assist my understanding one iota :-) ). But 'our' variables do create symbol table entries with the associated typeglob memory footprint overheads.

        Hence the speculation.

        It was pointed out to me that, in some circumstances, using lexicals rather than globals is quicker. One explanation I saw, but cannot now find, is that it is quicker to find a lexical than a global. I took this to be something to do with the fact that the compiler has to know where a lexical is at compile time, and effectively hard-codes the 'location' into the optree, whereas globals are 'found' each time, at runtime. The technical detail may be wrong, but this simplified imagery "works for me" ... until a better explanation is available.
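        If you want to measure it on your own perl, a quick Benchmark comparison might look like this (untested sketch):

        use strict;
        use Benchmark qw(cmpthese);

        our $global  = 0;
        my  $lexical = 0;

        # Run each sub for at least 3 CPU seconds and compare rates.
        cmpthese( -3, {
            global  => sub { $global++  for 1 .. 1000 },
            lexical => sub { $lexical++ for 1 .. 1000 },
        });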

        How far this case extends I have never tried to ascertain, as I have found very few cases where I use globals, and in the few cases I do, they are never critical to performance; but the OP seemed to be looking for 'simple' measures that might help. Adding use strict at the top of a module will rapidly spit out the names of the globals. Adding one line, our ($a, $b, $this, $that);, with an appropriate comment seems a simple enough change to at least make it worth trying, given the lack of other information and the restrictions imposed by the OP's question.


        Examine what is said, not who speaks.
        "Efficiency is intelligent laziness." -David Dunham
        "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
      I have profiled, more with timestamping than anything else. Sorry for not stating this in my original writeup.

      I guess this is more a question of what to do if you have profiled, proven that the application is at fault, and probably won't be able to run it in mod_perl. Anything but throw more powerful processors at it or buy/write a new application?

      Thanks for your input, BrowserUk and Abigail-II.

        Sometimes, but only sometimes, small changes to the code can reap large rewards. As an example, if the application truly is Perl 4 code, then I believe (but could be wrong) that Perl 4 didn't support hashes. Assuming for the moment that is true -- I'll be quickly corrected if it isn't:) -- and (for example) the application does any sort of lookups into arrays of data using grep, then changing the array(s) being searched linearly to hash(es) could have a dramatic effect without too much effort.

        I'm not really sure I understand the reluctance to modify. You said this is because you hope to upgrade to the next version sometime in the future. If you make changes now, how does that stop you upgrading? The only reason refactoring the code (ie. not changing what the code does, but only the way it does it) would impact your upgrade is if you discovered that the later version wasn't as efficient as your modified version. In which case you would have very strong grounds for requesting that your changes be fed back into the latest version before you took delivery. The supplier might even thank you for it and reduce your bill (some chance:). Their customers almost certainly would thank you.

        At the end of the day, if you change nothing, nothing will change. If you can't change the code, then you already know the other options: more memory, a faster processor, faster hard disks, etc.

        I think you already knew this though, so it begs the question, what were you hoping for?

        Oh! You have already done the Monk's ritual haven't you?


        Examine what is said, not who speaks.
        "Efficiency is intelligent laziness." -David Dunham
        "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller

        Well, your major concern seems to be avoiding large changes to ensure that you can hope to upgrade - have you tried talking to the vendor, to see if either they are already working on speed improvements, or would consider rolling your changes into any revision of the product?

        Hugo
Re: Speeding up commercial Web applications
by arturo (Vicar) on May 08, 2003 at 02:45 UTC

    I doubt you'd see a huge speedup by hacking the code to eliminate globals and the uses of local where my captures what was intended. I also doubt that there would be such a thing as a smooth upgrade from this product to one that adheres more to the Perl 5 (or, by the time it comes out, Perl 6) way of doing things that didn't simply preserve or transform the underlying data; the code involved in an upgrade is likely to be all new, assuming it provides a performance boost of the order you're seeking. As to the mod_perl approaches, they won't likely help unless starting the Perl interpreter is the problem (if it takes 1 second to start the interpreter, 8 seconds to gather and organize the data, and another second to generate a page now, then under PerlRun it might go down to 0.2 seconds of startup, 8 seconds to collate, and a second to generate the page [1]).

    The best thing you could do is profile your app to see where it's spending the most time, to get an idea of where you might start optimizing. There is no point spending several programmer-weeks optimizing a little routine that accounts for 0.02% of the time the script runs (and be sure to point out that time you spend on this is time not spent on other stuff -- that must be taken into account when weighing any of the options here). Search for "profiling" on CPAN; you might start by checking out Devel::DProf, although I'm sure there are profiling gurus around here who can help you make a good choice.

    The best advice, and it's hard to say much in a vacuum, is to look for places where the code is manifestly inefficient, although this is likely to be hard to track down. If you can, get the system to cache frequently accessed or hard-to-calculate data / generated pages. If it's doing the same beast of a computation for each request, and the results of that computation change relatively slowly and are valid for a wide range of users, then a cache will really speed it up. Maybe your solution could be as simple as putting the web application behind a caching proxy, once you figure out how best to balance your desire for up-to-date, personalized data against performance concerns.

    As to inefficient algorithms, a really obvious case would be a frequently called subroutine that searches an array and looks like

    # ok, yeah, this is perl5ey
    sub in_array {
        my $foo = shift;
        return grep $_ eq $foo, @global_array;
    }
    but I think it's unlikely you'll find too many places where (a) such a thing exists and (b) you could fix it without really modifying the code.

    In the end, the thing to do may be to upgrade the software and spend your time figuring out how to migrate the data.

    [1] The actual numbers are bogus, but the point is, I hope, clear enough.
Re: Speeding up commercial Web applications
by toma (Vicar) on May 08, 2003 at 07:42 UTC
    Many of the CGI applications I have worked on spend a *lot* of time on print statements. On my machine, this code:
    my $out = '';
    for (0..1000) {
        $out .= "$_" . " bottles of beer on the wall\n";
    }
    print $out;
    is about six times faster than:
    for (0..1000) {
        print "$_", " bottles of beer on the wall\n";
    }
    Changing all the code to look this way is probably too much trouble. So here's an idea that is only three times faster than print:
    package Print;
    require Exporter;
    @ISA       = qw(Exporter);
    @EXPORT_OK = qw(Print);

    our $sout = '';

    # join '' rather than interpolating "@_", so arguments are not
    # separated by spaces - matching what print itself would produce
    sub Print { $sout .= join '', @_; }

    END { print $sout; }

    1;
    Then change your code to call Print instead of print. This is a small change that can be applied in a batch fashion to all of your slow code to speed it up.
    use Print qw(Print);
    for (0..1000) {
        Print "$_", " bottles of beer on the wall\n";
    }
    This hasn't been through a lot of testing; I just wrote it.

    It is good practice to create some output for the user to look at early in your program, then buffer up the rest of the output that is slow to generate.
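    In a CGI that can be as simple as this (sketch; $slow_output stands for whatever the rest of the program builds up):

    $| = 1;    # unbuffer STDOUT so the first print reaches the browser immediately
    print "Content-type: text/html\n\n";
    print "<html><body><p>Working on your report...</p>\n";

    # ... do the slow work here, accumulating the rest of the page ...
    print $slow_output, "</body></html>\n";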

    It should work perfectly the first time! - toma

Re: Speeding up commercial Web applications
by Jaap (Curate) on May 08, 2003 at 10:37 UTC
    Reading between the lines, it looks like you want to hear: "This is impossible. Go to your boss and tell him it can't be done."

    But back on topic: perhaps you should ignore, for now, the possibility that you may want to upgrade later.
    Just try to speed it up as much as possible and keep track of your modifications. At the end, you can choose the couple of optimisations that cover 80% of the speed increase and apply the same techniques to possible upgrades.
Re: Speeding up commercial Web applications
by Molt (Chaplain) on May 08, 2003 at 12:02 UTC

    I'd personally say go for it and make sweeping changes to get the speed out. If, as you say, the application feels like it was written for Perl 4, then hopefully any major future upgrade will involve a general rewrite, and so won't need your changes, since it will be fast enough as is.

Re: Speeding up commercial Web applications
by mattr (Curate) on May 08, 2003 at 10:56 UTC
    If most of the time is spent on processing and we have no idea what kind of processing you are doing, it is not easy to give you pointers. Perhaps you could put some code or pseudocode online. If not, perhaps, as someone else mentioned, you could take advantage of the database being compiled rather than interpreted.

    If you can arrange for your "processing" to be done at the DB level, you may be able to reduce the amount of Perl-based processing you do. For example, you might be able to use a temporary table to do some of the work. Or, as mentioned above, grouping will be faster in the DB.

    Can you really not provide any finer resolution on the amount of time taken by the sub-steps of the "processing" stage? You should be profiling everything; otherwise, how are you going to find where the bottleneck is in the code?

Re: Speeding up commercial Web applications
by AssFace (Pilgrim) on May 08, 2003 at 16:40 UTC
    Also make note of how much of your code really needs to be dynamic. I don't know what your app does - but if the data that it is yanking from the DB isn't changing that much over time, then you can create static pages that will obviously be far faster and put less of a load on the server.

    This is worthless in some environments where every query is different, or where the data is changing every 2 seconds.

    But if by chance your app is something where the data is slow to change over time, you can have scripts create static pages every morning when the load on the server is low - then let the users navigate the static pages.
    Good if your users come in to look at data briefly and rarely (once a day and the data hasn't changed since yesterday).
    Lousy if users are using the app all day long, and every page that they load involves more data being changed and requiring more updates.
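    A cron-driven generator for that can be quite small (untested sketch; the DSN, query, and output path are all placeholders):

    #!/usr/bin/perl
    use strict;
    use DBI;

    # Hypothetical nightly job: dump a slow report into a static HTML page
    # that the web server can then serve with no Perl involved at all.
    my $dbh  = DBI->connect( 'dbi:mysql:reports', 'user', 'pass' )
        or die DBI->errstr;
    my $rows = $dbh->selectall_arrayref('SELECT name, total FROM daily_totals');

    open my $fh, '>', '/var/www/html/report.html' or die "open: $!";
    print $fh "<html><body><table>\n";
    print $fh "<tr><td>$_->[0]</td><td>$_->[1]</td></tr>\n" for @$rows;
    print $fh "</table></body></html>\n";
    close $fh;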


    Of all the web apps that I've worked on and/or written, there are some where, looking back on it, we could or should have done that - and then there are many others that can't ever do that because it just won't fit the environment.
    I'm curious as to what your app does.

    -------------------------------------------------------------------
    There are some odd things afoot now, in the Villa Straylight.
      I did not disclose the problem domain because I did not want to harm the company we purchased the product from.

      Thanks for the suggestion on serving static pages. It's an angle I hadn't really considered before. We could probably come to a compromise on the frequency of page updates, like once every 30 minutes.

      Update: Clarified wording about the problem domain.

Re: Speeding up commercial Web applications
by dragonchild (Archbishop) on May 08, 2003 at 15:11 UTC
    Write a better version, then steal their business. Oh, and demand a refund cause their code stinks.

    I agree with Jaap - tell your boss that you can write something better in 3 months, which would be the amount of time you'd take to speed it up and deal with the first upgrade. Heck, you could upgrade it, then sell your upgrades back to them!

    ------
    We are the carpenters and bricklayers of the Information Age.

    Don't go borrowing trouble. For programmers, this means "Worry only about what you need to implement."

    Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

Re: Speeding up commercial Web applications
by matsmats (Monk) on May 09, 2003 at 12:27 UTC
    You state that you have profiled already, but I'm surprised no one has mentioned SmallProf. It's really handy for locating those regexes and such that should be outside loops but instead drag everything down by being inside them.
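    For reference, it runs much like DProf (the script name is a placeholder):

    perl -d:SmallProf yourscript.pl    # writes per-line counts and times to smallprof.out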

    Mats
