traxlog has asked for the wisdom of the Perl Monks concerning the following question:

Esteemed ones,
I have written a collection of cgi programs - 10 in total - all being run on one webserver. 7 for the backend (agenda, cms, administration, contract creation etc.) and 3 index.pl's for the websites that the backend drives.
All programs look to a lock mechanism as described in the camel book, (with my addition of timing out the process to catch the stragglers) and all share the same berkeley DB's.
# OPEN THE LOCK FILE my $LCK = "$lock_path/dblock.lockfile"; sysopen(DBLOCK, $LCK, O_RDONLY | O_CREAT) or die divert("Couldn't +open $LCK (lock_database_for_writing): $!"); # ATTEMPT TO LOCK THE LOCK FILE eval { local $SIG{ALRM} = sub { die "TIMEDOUT" }; alarm(30); flock(DBLOCK, LOCK_EX) or die divert("Couldn't LOCK_EX $LC +K (lock_database_for_writing): $!"); alarm(0); }; if ($@) { if( $@ =~ /TIMEDOUT/ ) { divert("Connection too busy. Error: $!"); } }

note: divert() is my output for the browser
Now, the problem is, that every now and again and for no apparent reason it does take more than the allowed 30 seconds to get a lock on the lockfile.
A runaway process I suspected? Executing the same script with the same input from my 4 browsers runs smoothly. Time after time.
Too many people trying to connect at once? No-one else trying within minutes
Particular browser?: No, several different types have the same delay. Even the same user can experience fast and slow execution times but more generally bunched with snowball effect: 2.5 secs, 7 secs, 17 secs
Perl is busy on the webserver? No, I'm the only one using perl.
Ask the ISP? No help, no idea what it could be. "Yeah but in the logs"... "Sure we'll look into it for you"...

Every run of any program writes basic info to my own log files: date, time, browser, processID and the time it took to run the whole script
( time->attempt lock->do stuff->release lock->get time).
Examining the logs only reveals that a random user can take much longer to execute the script than normal. Average script execution is 0.5 seconds - "strange" users exec time between 2.5 and 30+ seconds. What does become apparent is how many processes are being run in between executions.

I'm assuming that if you have a slow internet connection that the webserver will buffer the output of the script until the client has received the content.

My question is: What can cause one browser/user/connection to take longer to execute a script than another?

Any ideas?

Help greatly appreciated.


An excerpt from one occasion:
24-05-2005 10:33:05 -> xxx.xxx.xxx.xxx Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; FunWebProducts; SV +1; .NET CLR 1.1.4322) p=&bandID=&search_phrase= (26197) exec 16.3432 WARNING 24-05-2005 10:33:56 -> xxx.xxx.xxx.xxx, p=artist_list (26299) exec 7.8 +034 WARNING 24-05-2005 10:34:42 -> xxx.xxx.xxx.xxx, p=artist_list&style=7 (26384) +exec 8.0844 WARNING 24-05-2005 10:35:30 -> xxx.xxx.xxx.xxx, p=artist&bandID=109&style=7&l= + (26460) exec 8.2451 WARNING 24-05-2005 10:36:17 -> xxx.xxx.xxx.xxx, p=artist&bandID=109&imageID=1& +searchID=&content= (26524) exec 5.0702 WARNING 24-05-2005 10:36:36 -> xxx.xxx.xxx.xxx, p=artist&bandID=109&style=7&l= +&content=repertoire&searchID= (26547) exec 4.4575 WARNING 24-05-2005 10:37:18 -> xxx.xxx.xxx.xxx, p=artist&bandID=109&style=7&l= +&content=lineup&searchID= (26612) exec 3.4035 WARNING 24-05-2005 10:37:40 -> xxx.xxx.xxx.xxx, p=artist&bandID=109&imageID=1& +searchID=&content=lineup (26640) exec 3.3868 WARNING 24-05-2005 10:37:54 -> xxx.xxx.xxx.xxx, p=book&bandID=109&searchID= (2 +6657) exec 0.0929

Replies are listed 'Best First'.
Re: what can delay the execution of a cgi script?
by LanceDeeply (Chaplain) on May 24, 2005 at 12:49 UTC
    it's hard to guess what the problem may be without a more detailed breakdown of the timings. could you expand your logs to include the time at each phase?
    • attempt lock
    • do stuff
    • release lock
    ... and you may want to make those additional logs conditional on some environment flag.
Re: what can delay the execution of a cgi script?
by PodMaster (Abbot) on May 24, 2005 at 12:49 UTC
    What OS/Webserver/Perl?
    DB_File or BerkeleyDB?
    What is "do stuff" in time->attempt lock->do stuff->release lock->get time? Are you locking the database when you're just reading?
    What is takes the longest (attempt lock, do stuff, what, narrow it down)?
    I'm assuming that if you have a slow internet connection that the webserver will buffer the output of the script until the client has received the content.
    It probably doesn't.
    My question is: What can cause one browser/user/connection to take longer to execute a script than another?
    Either something related to hardware or software (disks are slow or you're doing too many things, infinite loop, ... or something).

    MJD says "you can't just make shit up and expect the computer to know what you mean, retardo!"
    I run a Win32 PPM repository for perl 5.6.x and 5.8.x -- I take requests (README).
    ** The third rule of perl club is a statement of fact: pod is sexy.

      What OS/Webserver/Perl?
      AFAIK unix, perl 5.6.1. I'd have to investigate further to find out
      Each execution reads and writes to the 13 databases (DB_File) so I've set it up to use a lockfile to prevent ANY access to the DB's unless the lock is granted.
      "Do stuff" is anything from adding events to the agenda, address book, adding content to the sites. etc
      If there were a set of calls to one of the programs that constantly take longer then it would be easy to trace. Unfortunately what one person receives in .5 seconds, another receives in 17 seconds, and then only sometimes. I can't replicate the exact conditions except for pure cgi input over my adsl connection.

      "It probably doesn't." ... buffer the output is interesting. Is there anyway of doing this? Should I prepare my HTML content ready for output, release lock for further executions and then print the HTML (since there would be no further DB access)?
      Thanks for your help.
Re: what can delay the execution of a cgi script?
by traxlog (Scribe) on Jun 06, 2005 at 15:06 UTC
    After putting the HiRes timer on every section in one of the programs I discovered that it comes down to printing a HERE document of around 56kb that causes the delay (only sometimes of course). This is about the last place I would have expected a delay.

    Has anyone experienced this phenomenon before?

    B.T.W. I have tried with $| = 1; and $| = 0; , neither making any difference to the situation.