traxlog has asked for the wisdom of the Perl Monks concerning the following question:

Oh great ones,

Further to a previous post of mine, What can cause the delay of a cgi script? it turns out that humble print statements are the cause, and now I'd like to find out why and what I can do about it!

For the life of me I cannot track down what can cause the print statements in this:
# lock DB # exract data and load into %content # unlock DB print header; # from CGI module print $content{'menu'} . $content{'body'} . $content{'footer'}; exit;

to sometimes take between 2 and 300 seconds (Apache timeout) to execute.

Some background:
a) the total output is between 20kb and 75kb (nothing excess here)
b) program runs under one process - no kids
c) has occurred on 2 different webservers running Apache 1.3.33 Unix
d) DB's are already closed and no further disc access is asked
e) $| = 0; and $| = 1; make no difference
f) Can happen to a Googlebot just as easily as Explorer, although I've never been able to recreate it from (macosx) Safari, Explorer or Firefox on adsl connection
g) Both webservers had more than enough capacity and RAM to process the requests and were relatively quiet at the times of the intermittent delays
h) nothing in the Apache error logs
i) Perl 5.8.0 and 5.6.1

Has anyone experienced anything like this before?
Any suggestions as to how to avoid it?

Excerpt from my log:
Please note that timings are from Time::HiRes and the print statements occur between start_display_layout and end_display_layout in this case taking 8.9302 secs to execute

27-06-2005 11:23:22 (13747) xxx.xxx.xxx.xxx /index.pl exec->9.0591 day=, month=, year=2005, style=0, price=0, partorbed=1, p=search, zoek +=Start zoeken CONTENT_LENGTH -> 76 CONTENT_TYPE -> application/x-www-form-urlencoded DOCUMENT_ROOT -> /home/mysite/HTML GATEWAY_INTERFACE -> CGI/1.1 HTTP_ACCEPT -> image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, ap +plication/vnd.ms-excel, application/vnd.ms-powerpoint, application/ms +word, application/x-shockwave-flash, */* HTTP_ACCEPT_ENCODING -> gzip, deflate HTTP_ACCEPT_LANGUAGE -> nl HTTP_CACHE_CONTROL -> no-cache HTTP_CONNECTION -> Keep-Alive HTTP_HOST -> www.mysite.com HTTP_USER_AGENT -> Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1) HTTP_WEFERER -> AIJJPHSCRTNHGSWZIDR PATH -> /sbin:/usr/sbin:/bin:/usr/bin:/usr/X11R6/bin QUERY_STRING -> REMOTE_ADDR -> xxx.xxx.xxx.xxx REMOTE_PORT -> 3128 REQUEST_METHOD -> POST REQUEST_URI -> /index.pl SCRIPT_FILENAME -> /home/mysite/HTML/index.pl SCRIPT_NAME -> /index.pl SERVER_ADDR -> xxx.xxx.xxx.xxx SERVER_ADMIN -> webmaster@mysite.com SERVER_NAME -> www.mysite.com SERVER_PORT -> 80 SERVER_PROTOCOL -> HTTP/1.1 SERVER_SIGNATURE -> <ADDRESS>Apache/1.3.33 Server at www.mysite.com Po +rt 80</ADDRESS> SERVER_SOFTWARE -> Apache/1.3.33 (Unix) PHP/4.3.9 load_modules 0.0000 / 0.0000 get_lock 0.0745 / 0.0745 lock_granted 0.0004 / 0.0749 release_lock(start) 0.0419 / 0.1168 finished_undef 0.0001 / 0.1169 finished_untie 0.0117 / 0.1286 finished_LOCK_UN 0.0001 / 0.1287 closed_DBLOCK 0.0000 / 0.1288 start_display_layout 0.0000 / 0.1288 end_display_layout 8.9302 / 9.0590 TOTAL------------> 9.0591 / 9.0591

Replies are listed 'Best First'.
Re: CGI print statements taking forever
by jhourcle (Prior) on Jun 27, 2005 at 11:04 UTC

    In your example, you're needlessly concatinating the string, before you print it. You should see a slight speed up from changing periods to commas:

    print $content{'menu'}, $content{'body'}, $content{'footer'};
      Yep, tried that, and I've also tried :
      print $content{'menu'}; print $content{'body'}; print $content{'footer'};
      and
      print <<HTML; $content{'menu'} $content{'body'} $content{'footer'} HTML
      All solutions only shaving milliseconds off the total time.

        If you've gotten it down to that particular block of code, can you manage to break it down further, and track down exactly which line is the worst offendor?

        The only times I've seen print be a major problem was when I was passing around rather large strings through a series of functions (by 'rather large', they were in excess of a 500k), concatinating as it went through the functions.) -- so it wasn't really the print, it was the concatination, and assigning large strings based on the return of the functions. We changed the logic so we were passing around an array of strings, and printing that, and it shaved a few seconds off our time.

        Also, be careful of the "<<HTML" example that you've given -- it would also insert extra line returns (probably not an issue for HTML, but could be an issue for other situations

Re: CGI print statements taking forever
by anonymized user 468275 (Curate) on Jun 27, 2005 at 12:45 UTC
    Unless evil gremlins abound (which would also make logical analysis pointless) it seems inconceivable that the print can in itself cause that much variation in performance. I would be inclined to look elsewhere, such as network issues. If it weren't for your timer evidence I'd have suspected multiple queueing for that db lock and even now it sort-of galls me to have to eliminate that - it seems oh-so credible as a suspect cause, given the "arithmetic" variation in performance measurements you report that suggest a per user cumulative delay - way tempting!

    One world, one people

      Those evil gremlins are starting to look pretty suspicious!
      DB lock is of course now completely out of the question as is poor disk performance. That I isolated weeks ago.

      I don't know about the mechanics of the interaction between Perl and Apache, but is certainly perplexing why (given that it is a network issue) a client could hold back the successful execution of a cgi script. The internet would effectively grind to a halt if that were the case.

      I've read that Apache no longer buffers perl output since V1.2, leaving the Perl buffer as the only bottleneck, and like I said, $| = 1; and $| = 0; make no difference!

        Based on your additional info., my money is on it being the internet itself -- if output cannot be delivered even momentarily to the other end, you get a pause signal (used to be CTRL-S when I last looked into such things) coming back at some level which prevents the host continuing transmission until it gets a continue signal (used to be CTRL-Q the last time I looked). Such interaction takes place at a lower network layer than the Apache server is hooked into, but it might be possible to detect using some network administration tools. Or perhaps you could see if there is a correlation between distance to the user IP and print processing time?

        One world, one people