in reply to Using parent and child processes in perl

(70007)The timeout specified has expired: ap_content_length_filter: apr_bucket_read() failed, referer: <website-address>

That particular error tends to come up for long-running CGI requests that also happen to return a ton of data. I don't know to what extent each condition may contribute, as they are pretty highly correlated in the wild. Still, I think I can help a little:

I can use a parent process to temporarily tell the user that the result is being fetched and a child process to carry out the actual result computation.

How long does the actual computation take? (Can you run the same thing on the command line?) How much information is returned (and how much is sent to the browser in the initial connection)?

The de facto standard way of handling this sort of thing used to be to do something like this (at least on POSIX-ish systems, although much of this can be simulated on Win32 and others):

    use File::Temp qw/tempfile/;
    use CGI;
    my $q = CGI->new;

    # Get the temp file name before we fork
    my (undef, $tempname) = tempfile(OPEN => 0);

    my $pid = fork;
    if (not defined $pid) {
        print $q->header(-status => '500 Server cutlery malfunction');
        exit;
    }
    if ($pid) {
        # Parent. Redirect to the display request ($BASE is the
        # URL of the display script).
        # XXX WARNING XXX - This implementation is very insecure:
        # better off creating a random string and keying that in
        # your database.
        print $q->redirect($BASE . '?display_id=' . $tempname);
        exit;
    }
    # ELSE, we're in the child. Run the long-lived command,
    # saving the result in $tempname. Do not attempt to
    # send any information directly to the browser.
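To make the child half concrete, here is a minimal sketch of what could go below that last comment. long_running_command() is a hypothetical placeholder for your slow work, and closing the standard handles first is a common precaution so the web server can finish sending the parent's response instead of waiting on the child:

    # Child: detach from the request, then do the slow work.
    # long_running_command() is a hypothetical placeholder.
    close STDIN;
    close STDOUT;    # so the server doesn't hold the connection open for us
    close STDERR;

    open my $out, '>', $tempname or exit 1;
    for my $row (long_running_command()) {    # the long computation
        print {$out} $row, "\n";
    }
    close $out;
    exit 0;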

Then, you have another CGI (or use a CGI parameter, if you prefer to roll it into the same one) that does something like this:

    my $display_id = $q->param('display_id')
        // my_fatal_error_handler();    # Error out if $display_id is undef
    open my $fh, '<', $display_id
        or my_fatal_error_handler();

    print $q->header;
    print '<html><head><meta http-equiv="refresh" content="5"></head><body>';

    # Page output here.
    # Do not include the <meta ... refresh> tag once you
    # have detected the command is finished, or redirect
    # to another page/script.

    close $fh;
    unlink $display_id;

That's the basic template, anyway, and it is easy to implement with pure HTML. You can either dump the contents of the temp file ($tempname above) directly, or store a serialized copy of the results there instead. If your implementation is heavily database-driven already, it may be preferable to store the intermediate result in a temporary/memory table instead of a file. If you already have session ID mechanics built into your CGI app, use them.

More advanced/modern systems use JavaScript and usually something like JSON to display the page: the browser makes background JSON requests and sticks the updated text right into the DOM, so that a complete page refresh isn't necessary. That's a bit beyond the scope for SoPW, but there are many sites out there that will explain it better than most of us can. The point is, the server-side logic is actually quite similar; you're just sending JSON data back instead of HTML, and you might set it up to send only new/changed data to save time/bandwidth.
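For illustration only, the polled endpoint for that style might look like the sketch below. It assumes the JSON module from CPAN, and get_status()/get_new_rows() are hypothetical helpers that read whatever persistent store the child writes to:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use CGI;
    use JSON;    # encode_json comes from the CPAN JSON module

    my $q  = CGI->new;
    my $id = $q->param('display_id');    # same key as in the HTML version

    # get_status() and get_new_rows() are hypothetical helpers that
    # consult the temp file or database table the child writes to.
    my %reply = (
        finished => get_status($id) ? 1 : 0,
        rows     => get_new_rows($id),
    );

    print $q->header('application/json');
    print encode_json(\%reply);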

Replies are listed 'Best First'.
Re^2: Using parent and child processes in perl
by Anonymous Monk on Jul 14, 2013 at 11:32 UTC

    The actual computation takes about 4-5 minutes and I am using a Red Hat Linux OS. I return some 1000 rows of data. Some 1000 rows of "The quick brown fox jumped over a lazy dog, and the lazy dog did nothing but bark", you can assume. So you are saying, in the child I store the result in a global variable ($temp), and in the parent I will have only one line that redirects the URL to another CGI script with this $temp as a parameter. i.e.

    my $temp;    # Global variable
    if ($pid) {
        print redirect();
        exit;
    }
    else {
        # call the computation function in the child
    }


    But how will I know when $temp is populated completely? :(

      The actual computation takes about 4-5 minutes and I am using a Red Hat Linux OS. I return some 1000 rows of data.

      Unless the rows are huge, 1000 rows is not a lot of data, so you should be OK there. The original issue is almost certainly the computation time. Red Hat definitely supports fork, which is good.

      So you are saying, in the child I store the result in a global variable ($temp), and in the parent I will have only one line that redirects the URL to another CGI script with this $temp as a parameter. i.e.

      No. Once the fork() takes place, the parent and child are separate processes. Updates to $temp (or any other memory, for that matter) in one process will not affect the other process whatsoever. Then, the parent just prints the redirect header with the display_id and exits immediately. In five seconds, when the browser sends a request for the display_id, the server creates another entirely new process to service the request.
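      If it helps to see that concretely, here is a tiny self-contained demonstration (not part of the CGI, just something to run from the shell) showing that the child's write never reaches the parent:

          #!/usr/bin/perl
          use strict;
          use warnings;

          my $temp = 'before fork';

          my $pid = fork;
          die "fork failed: $!" unless defined $pid;

          if ($pid == 0) {
              $temp = 'set in child';    # invisible to the parent
              exit 0;
          }

          waitpid($pid, 0);                  # wait for the child to finish
          print "parent sees: $temp\n";      # still prints "before fork"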

      That's why I suggested a temporary file (or temporary database storage); you need some way to keep persistent state (and know where to find that state), because the client connection is created and destroyed every few seconds when the refresh hits. It's a simple form of IPC.

      But how will I know when $temp is populated completely? :(

      That's up to you, but one simple method is to designate an "end of transmission" (EOT) marker that will never appear in the normal output. In the child, print the EOT to the end of the same temp file once the operation finishes. In the other CGI that reads the display_id file, if you see that EOT, you know the job has finished, and can take whatever action is appropriate.
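      A minimal sketch of both halves, reusing the temp file from above; the marker string here is arbitrary, chosen so it can never match a real row:

          # --- Child, after the computation finishes ---
          my $EOT = "__END_OF_TRANSMISSION__";    # must never occur in real output

          open my $out, '>>', $tempname or exit 1;
          print {$out} "$EOT\n";                  # signal completion
          close $out;

          # --- Display CGI, while printing the page body ---
          my $finished = 0;
          while (my $line = <$fh>) {
              if ($line eq "$EOT\n") {
                  $finished = 1;                  # job done: omit the refresh tag
                  last;
              }
              print $q->escapeHTML($line), "<br>\n";
          }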

        Thanks a lot! One last question: instead of redirecting, I would also be able to process the content of the $tempfile inside the parent itself, wouldn't I? Once again, thanks a bunch!