confused_programmer has asked for the wisdom of the Perl Monks concerning the following question:

Hey monks, I'm having some trouble processing websites in parallel when a link redirects. Have a look at my code:
#!/opt/csw/bin/perl -w

print "Content-type: text/html\n\n";

require LWP::Parallel::UserAgent;
use HTTP::Request;

$timeout = 10;

my @ns_headers = (
    'User-Agent'      => 'Mozilla/4.76 [en] (Win98; U)',
    'Accept'          => 'text/html, text/plain,',
    'Accept-Charset'  => 'iso-8859-1,*,utf-8',
    'Accept-Language' => 'en-US',
);

my $reqs = [
    HTTP::Request->new('GET', "http://www.google.ca", [@ns_headers]),
    HTTP::Request->new('GET', "http://advertising.microsoft.com/microsoft-adcenter?s_int=US_20070401_livesearchResults_smh_001", [@ns_headers]),
    ####UN-COMMENT THIS and it wont work ###### HTTP::Request->new('GET', "http://advertising.microsoft.com/search/", [@ns_headers]),
];

my $pua = LWP::Parallel::UserAgent->new();
$pua->in_order  (1);   # handle requests in order of registration
$pua->duplicates(0);   # ignore duplicates
$pua->timeout   (2);   # in seconds
$pua->redirect  (1);   # follow redirects

foreach my $req (@$reqs) {
    print "Registering '".$req->url."'<BR>\n";
    if ( my $res = $pua->register ($req) ) {
        print STDERR $res->error_as_HTML;
    }
}
my $entries = $pua->wait();

foreach (keys %$entries) {
    my $res = $entries->{$_}->response;

    print "Answer for '",$res->request->url, "' was \t", $res->code,": ", $res->message,"<BR>\n";
    print "and the content is<BR>\n";
    print "---------------------------<BR><TEXTAREA NAME='asdasd' ROWS='5' COLS='80' WRAP='virtual'>\n";
    print $res->content;
    print "</TEXTAREA>---------------------------<BR>\n";
}

By the way, this code was taken from a website as an example. Here is where I found it: http://search.cpan.org/~marclang/ParallelUserAgent-2.57/lib/LWP/Parallel.pm
Update:
I'm new to Perl. I have Java knowledge and am applying it as best I can to Perl, but I have no idea how to get any error messages. I think what's going wrong is the line: my $entries = $pua->wait(); because nothing executes after it. If I put valid links in the requests, the program runs fine and just prints out the content of each website. If I put in a link that redirects to another link, my program freezes up. That's about as well as I can describe my problem. Like I said, this code isn't fully mine, and I'm having trouble understanding it.
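
One way to turn a silent freeze at a blocking call into an error message is to wrap the call in an alarm-based timeout. The following is only a minimal sketch using core Perl: run_with_timeout is a hypothetical helper name, the one-second limit is arbitrary, and the select call merely stands in for the $pua->wait() call from the script.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code, but abort with a message after $seconds.
sub run_with_timeout {
    my ($seconds, $code) = @_;
    my $value = eval {
        local $SIG{ALRM} = sub { die "timed out after ${seconds}s\n" };
        alarm $seconds;
        my $v = $code->();
        alarm 0;
        $v;
    };
    my $error = $@;
    alarm 0;    # cancel any alarm still pending if $code died early
    return ($value, $error);
}

# Stand-in for the blocking call (a 3-second pause); in the script above
# this would be sub { $pua->wait() }.
my ($value, $error) = run_with_timeout(1, sub { select(undef, undef, undef, 3); 'done' });
print STDERR "wait() problem: $error" if $error;
```

In a CGI script, output on STDERR normally ends up in the web server's error log, so running the script from the command line is usually the quickest way to see such messages.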

Replies are listed 'Best First'.
Re: trouble with redirecting
by kyle (Abbot) on Jun 05, 2007 at 19:28 UTC

    I installed LWP::Parallel::UserAgent and ran your code with and without the commented line. It worked both times.

    I modified the output loop so it would just display the content length instead of the whole page content, and I noticed that the offending request actually redirects to one of the other requests in the list, so I wondered if the module is pruning duplicates. I set your $pua->duplicates() call to both 0 and 1, and the only difference it made was that the one request came back "302: Found" instead of "200: OK". I also tried changing the timeout value to a few different things (0, 1, 2, 10), and that never made a difference either.
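
A change along those lines might look like the sketch below. It is not kyle's actual code: summary_line is a hypothetical helper, and FakeResponse is a stand-in class so the example runs without a network. A real HTTP::Response provides the same code, message and content accessors.

```perl
use strict;
use warnings;

# Hypothetical helper: one summary line per response, reporting the
# length of the body instead of printing the body itself.
sub summary_line {
    my ($url, $res) = @_;
    return sprintf "%s: %s %s, %d bytes",
        $url, $res->code, $res->message, length $res->content;
}

# Stand-in for HTTP::Response (an assumption for this sketch); it offers
# only the three accessors that summary_line uses.
package FakeResponse {
    sub new     { my ($class, %args) = @_; return bless {%args}, $class }
    sub code    { $_[0]{code} }
    sub message { $_[0]{message} }
    sub content { $_[0]{content} }
}

my $res = FakeResponse->new(code => 302, message => 'Found', content => '');
print summary_line('http://advertising.microsoft.com/search/', $res), "\n";
```

Inside the original output loop the call would be summary_line($res->request->url, $res), which makes a "302: Found" answer easy to tell apart from a "200: OK" one at a glance.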

    I recommend you look at your network. Since this is an advertising URL that you're hitting, I wonder if a proxy is helpfully filtering it for you (whether you explicitly go through a proxy or not).

    As an aside, I also recommend loading modules with use instead of require, and I recommend you add use strict and use warnings to the top of the script.
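
As a rough illustration of that advice, here is a minimal sketch using only core modules. List::Util stands in for LWP::Parallel::UserAgent, and total_bytes and the sample numbers are made up for the example.

```perl
#!/usr/bin/perl
use strict;      # undeclared variables become compile-time errors
use warnings;    # lexically scoped replacement for the -w switch

# 'use' loads the module at compile time and calls its import method;
# 'require' loads it only at run time and imports nothing.
use List::Util qw(sum);

# Under strict, a bare '$timeout = 10;' no longer compiles; declare it.
my $timeout = 10;

# Hypothetical helper: add up a list of content lengths.
sub total_bytes {
    my @lengths = @_;
    return sum(@lengths);
}

print "timeout: $timeout\n";
print "total bytes: ", total_bytes(120, 4096, 0), "\n";
```

In the posted script the equivalent change would be use LWP::Parallel::UserAgent; in place of the require line, plus my in front of $timeout.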

Re: trouble with redirecting
by shmem (Chancellor) on Jun 05, 2007 at 18:21 UTC
    im having some trouble processing websites in parallel when the link redirects

    What trouble do you have? See "I know what I mean. Why don't you?"

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
      And by "doesn't work" I mean the whole program doesn't execute; it freezes somewhere.
      Well, in my code there is a comment that says if you uncomment this it doesn't work, when it should work. That is the trouble.
        So we have to search your program instead of you providing the information. Then, "it doesn't work" is such a lousy description of the problem that the best answer is "well, then make it work."

        Is it so difficult to provide basic information? Have you actually read "I know what I mean. Why don't you?"

        What is going wrong? What are the symptoms? Do you get any error message? Could you post that?

        No, I won't run your code just to find out.

        update: GrandFather already gave you the same advice as a reply to your first posting. Why don't you just do it and heed his advice?