PerlMonks  

fork and kill process

by ellickjohnson (Novice)
on Apr 25, 2002 at 22:45 UTC ( [id://162129]=perlquestion )

ellickjohnson has asked for the wisdom of the Perl Monks concerning the following question:

I need to run wget from a Perl script and assign the output to a variable; that part I can do. The problem I have is that sometimes the wget takes too long. How can I fork the wget part and then kill that process if it is still running after 60 seconds? I tried fork, but wget gets its own pid, which is different from the child's. Anyone have any ideas on how to fork wget and kill it after 60 seconds? I also tried the timeout settings from wget - they do not work. Thanks in advance, Ellick

Replies are listed 'Best First'.
Re: fork and kill process
by rjray (Chaplain) on Apr 26, 2002 at 00:10 UTC

    Don't try to do this with fork and kill. Instead, take the path suggested in the earlier reply by rbc and use elements of the LWP package. It will take a little more work than the LWP::Simple-based example, but you can set a timeout on the user-agent object before sending the request, which is a better way of handling the 60-second limitation. You aren't doing anything else in that 60 seconds anyway, and trying to do this with fork and kill will mean setting up a $SIG{CHLD} handler, and I doubt you want to get into that mess.

    Using the LWP classes is pretty simple:

    use LWP::UserAgent;
    use HTTP::Request;

    $UA = LWP::UserAgent->new;
    $UA->timeout(60);
    $response = $UA->request(HTTP::Request->new('GET', $url));
    # You may now operate on $response->content().
    # Be certain you test for success with $response->is_success

    --rjray

    Update: I forgot to point out initially another flaw in the fork/kill approach. By running the wget in the child and assigning it to a variable within that child, the parent would not have access to that variable unless you wrote some IPC-management to share the memory between the child and parent processes. In other words, use LWP. You'll soon see why so many of us swear by it.
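The IPC rjray alludes to can be done with a plain pipe if you do want to stay with fork: the child writes whatever it fetched into the pipe, and the parent reads it back. A minimal sketch, with a placeholder echo standing in for the wget call so it can be run anywhere:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sketch only: share the child's output with the parent over a pipe.
pipe(my $reader, my $writer) or die "pipe: $!";

my $pid = fork();
die "can't fork: $!" unless defined $pid;

if ($pid == 0) {                       # child
    close $reader;
    my $output = `echo fetched-data`;  # stand-in for the wget call
    print {$writer} $output;
    close $writer;
    exit 0;
}

close $writer;                         # parent
my $data = do { local $/; <$reader> }; # slurp everything the child wrote
close $reader;
waitpid($pid, 0);                      # reap the child

print "parent got: $data";
```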

Re: fork and kill process
by rbc (Curate) on Apr 25, 2002 at 23:03 UTC
    Here's a skeleton of what you might want to do ...
    die "can't fork: $!" unless defined($kidpid = fork());
    if ($kidpid) {
        # parent sleeps 60 seconds and then kills child
        sleep 60;
        kill("TERM" => $kidpid);   # send SIGTERM to child
    } else {
        # child does wget ...
    }
    print "$kidpid exiting\n";
    exit;
    update
    you might want to do your wget like so ...
    ...
    use LWP::Simple;   # so you can do a get.
    ...
    die "can't fork: $!" unless defined($kidpid = fork());
    if ($kidpid) {
        # parent sleeps 60 seconds and then kills child
        sleep 60;
        kill("TERM" => $kidpid);   # send SIGTERM to child
    } else {
        my $data = get("http://www.example.com/");
        ...
    }
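One wrinkle in the skeleton above: the parent always sleeps the full 60 seconds, even if the child finishes in two. A variant that polls waitpid with WNOHANG so the parent moves on as soon as the child exits (here a one-second sleep stands in for the fetch):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX ":sys_wait_h";   # for WNOHANG

my $timeout = 60;
my $kidpid  = fork();
die "can't fork: $!" unless defined $kidpid;

if ($kidpid == 0) {        # child: stand-in for the wget call
    sleep 1;
    exit 0;
}

# parent: poll once a second instead of sleeping the whole timeout
my $reaped = 0;
for (1 .. $timeout) {
    if (waitpid($kidpid, WNOHANG) == $kidpid) {
        $reaped = 1;
        last;
    }
    sleep 1;
}

unless ($reaped) {
    kill 'TERM', $kidpid;  # still running after $timeout seconds
    waitpid($kidpid, 0);
}
print $reaped ? "child finished on its own\n" : "child was killed\n";
```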
Re: fork and kill process
by asdfgroup (Beadle) on Apr 26, 2002 at 02:17 UTC
    1) You will have difficulties if you keep using fork, because then you have to use some form of IPC (inter-process communication) to retrieve the data from the child (e.g. you can simply write it to a file).

    2) If you still want to use wget, you can do it this way:

    eval {
        local $SIG{ALRM} = sub { die "Timeout" };
        alarm 60;   # your timeout here
        $resp = `wget ....`;
        alarm 0;
    };
    if ($@) {
        # something wrong (most likely the timeout fired)
    }
    3) I strongly suggest you use the LWP module instead of wget

    4) Or the ParallelUserAgent module (PUA) if you need fast simultaneous downloads from several places
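The alarm pattern from point 2 can be tried without wget at all; in this sketch `sleep 5` stands in for the slow download and the timeout is shortened to 2 seconds. One caveat worth knowing: the SIGALRM interrupts the backticks in the parent, but the spawned command itself may keep running, so a real script should still kill it (as the fork-based replies above do).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same eval/alarm pattern, with `sleep 5` standing in for the wget call.
my $resp;
eval {
    local $SIG{ALRM} = sub { die "Timeout\n" };
    alarm 2;                       # short timeout for the demo
    $resp = `sleep 5; echo done`;
    alarm 0;                       # cancel the alarm if we finished in time
};
if ($@ && $@ eq "Timeout\n") {
    print "command timed out\n";
} else {
    print "command finished: $resp";
}
```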

Re: fork and kill process
by Super Monkey (Beadle) on Apr 25, 2002 at 23:16 UTC
Re: fork and kill process
by domm (Chaplain) on Apr 26, 2002 at 11:58 UTC
    Hi!

    One problem with wget (as I have noticed, but maybe I'm doing something wrong ...) is that it somehow doesn't return the right pid to perl, even if you use wget -b

    So what I did once was to start wget and then look directly into the process table to find the right pid. Although I was doing this for exactly the opposite reason (I wanted to print a "still downloading" message to the client), it worked quite well.

    Anyway, once you have found the PID, you could use a $SIG{ALRM} handler to issue a kill on $PID after 60 seconds
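The process-table trick can be sketched roughly as below. The `find_pids` helper and the marker string are invented for the illustration, and the `ps -eo pid,args` parsing assumes a POSIX-style ps; on a real run you would search for the wget command line instead and send the kill from a $SIG{ALRM} handler as described above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: scan the process table for commands matching a pattern.
# Assumes a POSIX-style ps; column layout may differ on other systems.
sub find_pids {
    my ($pattern) = @_;
    my @pids;
    for my $line (`ps -eo pid,args`) {
        next unless $line =~ /^\s*(\d+)\s+(.*)$/;
        my ($pid, $args) = ($1, $2);
        next if $pid == $$;                      # skip ourselves
        push @pids, $pid if index($args, $pattern) >= 0;
    }
    return @pids;
}

# Demo: start a long-running command tagged with a marker we can search for,
# then locate and kill it -- the same dance one would do for a detached wget.
my $marker = "pm_find_demo_$$";
my $child  = fork();
die "can't fork: $!" unless defined $child;
if ($child == 0) {
    exec($^X, "-e", "sleep 30", $marker);        # marker shows up in ps args
    die "exec failed: $!";
}
sleep 1;                                         # give it time to appear in ps

my @found = grep { $_ == $child } find_pids($marker);
print "found the process\n" if @found;

kill 'TERM', $child;                             # the "kill after timeout" step
waitpid($child, 0);
```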

    And BTW, I don't think using LWP is the ultimate best solution for downloading stuff. If you want to download not just one page but grab a bunch of pages (via spidering, e.g.), wget is definitely easier than LWP. If you use the -p option, wget automatically downloads all additional files needed to render the page, i.e. external style sheets, images, javascript etc. Very handy.

    --
    #!/usr/bin/perl -w just another perl hacker
    print+seek(DATA,$=*.3,@-)?~~<DATA>:$:__DATA__
    
Re: fork and kill process
by z3d (Scribe) on Apr 26, 2002 at 14:48 UTC
    If you are intent on a forking mechanism, check out the parallel fork manager (Parallel::ForkManager) at
    http://hacks.dlux.hu/Parallel-ForkManager/
    I've had good experience with it (mostly).
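For reference, a minimal Parallel::ForkManager sketch, assuming the module is installed (it is not in core, so it must come from CPAN) and using a short sleep in place of each real download:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Parallel::ForkManager;   # not in core; install from CPAN first

my $pm = Parallel::ForkManager->new(3);   # at most 3 children at once

# Collect finished pids in the parent so we can see that every job ran.
my %done;
$pm->run_on_finish(sub { my ($pid) = @_; $done{$pid} = 1 });

for my $job (1 .. 5) {
    $pm->start and next;   # parent forks a child and moves to the next job
    # --- child ---
    sleep 1;               # stand-in for the real download
    $pm->finish;           # child exits here
}
$pm->wait_all_children;    # parent blocks until all children are done
print scalar(keys %done), " jobs finished\n";
```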



    "I have never written bad code. There are merely unanticipated features."
Re: fork and kill process
by BrotherBrett (Novice) on Apr 26, 2002 at 16:32 UTC
    Would it be just as effective to just pass wget a timeout value?

    $ret = `wget --timeout=5`;

    -Brett

Re: fork and kill process
by BrotherBrett (Novice) on Apr 26, 2002 at 16:30 UTC
    Would it be just as effective to pass wget a timeout value: --timeout=SECONDS? -Brett

Node Type: perlquestion [id://162129]
Approved by lemming
Front-paged by rbc