in reply to Re: IPC::Open, Parallel::ForkManager, or Sockets::IO for parallelizing?
in thread IPC::Open, Parallel::ForkManager, or Sockets::IO for parallelizing?
Thanks. I've now taken a closer look at LWP::Parallel and have some questions about how it is supposed to handle many (most?) HTTPS sites. At the moment it returns HTTP status "503 Service Unavailable" for sites that exist and are accessible via other agents. Here is one example:
    #!/usr/bin/perl

    use LWP::Parallel::UserAgent;
    use LWP::Debug qw(+);

    use strict;
    use warnings;

    my $headers = new HTTP::Headers(
        'User-Agent' => "Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.66 Safari/537.36",
    );

    my @requests;
    foreach my $url ('https://blog.arduino.cc/feed/') {
        push(@requests, HTTP::Request->new('GET', $url, $headers));
    }

    # new parallel agent
    my $pua = LWP::Parallel::UserAgent->new();
    $pua->in_order  (0);
    $pua->duplicates(1);
    $pua->timeout   (9);
    $pua->redirect  (0);
    $pua->max_hosts (5);
    $pua->nonblock  (0);

    foreach my $req (@requests) {
        if ( my $res = $pua->register ($req, \&handle_answer, 8192) ) {
            print $res->error_as_HTML;
        } else {
            print qq(ok\n);
        }
    }

    my $entries = $pua->wait();

    foreach my $k (keys %$entries) {
        my $res = $entries->{$k}->response;
        my $url = $res->request->url;
        print $res->code,qq(\t $url\n);
    }

    exit(0);

    sub handle_answer {
        my($content, $response, $protocol, $entry) = @_;
        if (length($content)) {
            $response->add_content($content);
        }
        return(undef);
    }
As one can see with various browsers, the feed in question is there, yet it is one of the feeds that LWP::Parallel chokes on.
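For comparison, here is a minimal sketch of how the "accessible via other agents" claim could be cross-checked from Perl itself. It is not part of the script above. It assumes the core module HTTP::Tiny as the second agent (HTTPS additionally needs IO::Socket::SSL installed), and `summarize` is a hypothetical helper name. HTTP::Tiny reports its own failures (DNS, TLS, timeout) as status 599, so the output separates a reply that came from the server from an error raised on the client side.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Tiny;

# Hypothetical helper: summarize an HTTP::Tiny response hashref on one line.
# HTTP::Tiny uses the pseudo-status 599 for errors it raised itself.
sub summarize {
    my ($res) = @_;
    my $origin = $res->{status} == 599 ? 'client-side error' : 'server reply';
    return join "\t", $res->{status}, $origin, $res->{url};
}

# Nothing is fetched unless URLs are given on the command line.
if (@ARGV) {
    # Warn early if this Perl cannot do HTTPS with HTTP::Tiny at all.
    my ($ssl_ok, $why) = HTTP::Tiny->can_ssl;
    warn "HTTPS not available to HTTP::Tiny: $why\n" unless $ssl_ok;

    # Same timeout as the LWP::Parallel script, for a like-for-like check.
    my $http = HTTP::Tiny->new(timeout => 9);

    foreach my $url (@ARGV) {
        my $res = $http->get($url);
        print summarize($res), "\n";
        # For a 599, the content holds the text of the client-side error.
        print "  $res->{content}\n" if $res->{status} == 599;
    }
}
```

Running it with the feed URL as its argument gives a second data point: a 2xx line would suggest the server is fine and the 503 originates in the LWP::Parallel stack, while a 599 would point at the local TLS setup instead.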
Replies are listed 'Best First'.
Re^3: IPC::Open, Parallel::ForkManager, or Sockets::IO for parallelizing?
by hippo (Archbishop) on Oct 03, 2023 at 10:57 UTC
by mldvx4 (Friar) on Oct 03, 2023 at 12:50 UTC
by kcott (Archbishop) on Oct 03, 2023 at 14:02 UTC
by hippo (Archbishop) on Oct 03, 2023 at 14:07 UTC
by mldvx4 (Friar) on Oct 03, 2023 at 15:22 UTC