bulgin has asked for the wisdom of the Perl Monks concerning the following question:

I have the following Perl script which works nicely: it imports a list of URLs, goes out to them, grabs the Head data and writes it to a file. The problem is, if it encounters a domain that is unresponsive or takes a long time to load, it halts and just sits there. I'm wondering if there is a way to make the script go to the next line if it's having difficulty connecting to the domain.

I'm also wondering if it is possible to create "threads" like the big-shot developers do? And in the extreme, this script may be looking at a list of thousands of URLs, so does anyone see any problem with this script handling that much overhead and not crashing out?

Here is the script and thank you for any help you may suggest.

#!/usr/bin/perl
#print "Content-type: text/html\n\n";
use LWP::Simple;
use HTML::HeadParser;

open (OUTFILE, '>outfile.txt');
open (MYFILE, 'url3.txt');
foreach $line (<MYFILE>) {
    chomp($line);
    $URL = get($line);
    $Head = HTML::HeadParser->new;
    $Head->parse("$URL");
    print OUTFILE $Head->header('X-Meta-Description') . ".";
}
close(MYFILE);
close(OUTFILE);
exit;

Re: Need time-out procedure for HTML perl procedure
by zentara (Cardinal) on Aug 16, 2010 at 16:19 UTC
    #!/usr/bin/perl
    use warnings;
    use strict;
    use LWP::UserAgent;
    use HTTP::Request::Common;

    my $ua = LWP::UserAgent->new;
    $ua->timeout(5);
    my $res = $ua->request(GET 'http://google.com');
    if ($res->is_success) {
        print $res->content;
    }
    else {
        print 'Uh oh timeout';
    }
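
For what it's worth, here is a rough sketch of how the same timeout could be folded into the loop from the original question. The file names `url3.txt` and `outfile.txt` and the 5-second timeout come from the posts above; the helper names `meta_description` and `process_urls` are made up for illustration. Note that LWP's timeout counts seconds of inactivity on the connection, so a server that keeps trickling data can still take longer than that overall.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTML::HeadParser;

# Hypothetical helper: pull the meta description out of an HTML string.
# Returns an empty string when the page has no description.
sub meta_description {
    my ($html) = @_;
    my $parser = HTML::HeadParser->new;
    $parser->parse($html);
    my $desc = $parser->header('X-Meta-Description');
    return defined $desc ? $desc : '';
}

# Hypothetical helper: fetch every URL listed in $in_file and append its
# description to $out_file, skipping any URL that fails or times out.
sub process_urls {
    my ($in_file, $out_file, $timeout) = @_;

    my $ua = LWP::UserAgent->new;
    $ua->timeout($timeout);    # seconds of inactivity before giving up

    open my $in,  '<', $in_file  or die "Cannot read $in_file: $!";
    open my $out, '>', $out_file or die "Cannot write $out_file: $!";

    # Read line by line so a list of thousands of URLs is never held in memory.
    while (my $url = <$in>) {
        chomp $url;
        next unless length $url;

        my $res = $ua->get($url);
        if ($res->is_success) {
            my $html = $res->decoded_content;
            $html = $res->content unless defined $html;
            print {$out} meta_description($html), ".";
        }
        else {
            # Covers timeouts as well as DNS failures, 404s and so on.
            warn "Skipping $url: ", $res->status_line, "\n";
        }
    }

    close $in;
    close $out or die "Cannot close $out_file: $!";
}

# Only run when the input list from the original post is present.
process_urls('url3.txt', 'outfile.txt', 5) if -e 'url3.txt';
```

A failed request just produces a warning and the loop moves on to the next line, which is the behaviour the original question asked for.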

    I'm not really a human, but I play one on earth.
    Old Perl Programmer Haiku

      So... not that difficult then ;)

      Just a something something...
Re: Need time-out procedure for HTML perl procedure
by marto (Cardinal) on Aug 16, 2010 at 15:50 UTC
Re: Need time-out procedure for HTML perl procedure
by BioLion (Curate) on Aug 16, 2010 at 16:00 UTC

    Alternatively, there are a lot of timeout modules available; it might be easier to crowbar one of those in than to convert to another LWP API? No idea, just a thought.
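
As a hedged illustration of the "crowbar it in" idea without naming any particular module, the core `alarm` function can put a hard wall-clock limit around any block of code. The helper name `with_timeout` is hypothetical, and this sketch assumes a Unix-like system, since `alarm` is not dependable on every platform.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: run $code but give up after $seconds of wall-clock time.
# Returns (1, $result) on success and (0, undef) when the time limit is hit.
# Any other error raised by $code is re-thrown unchanged.
sub with_timeout {
    my ($seconds, $code) = @_;
    my $result;

    my $ok = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $seconds;          # start the countdown
        $result = $code->();
        alarm 0;                 # finished in time, cancel the alarm
        1;
    };
    my $err = $@;
    alarm 0;                     # make sure no alarm is left pending

    return (1, $result) if $ok;
    die $err unless $err eq "timeout\n";
    return (0, undef);
}

# Small demonstration: the block sleeps longer than the one-second limit.
my ($ok, $value) = with_timeout(1, sub { sleep 3; return 'finished' });
print $ok ? "got: $value\n" : "gave up after 1 second\n";
```

In the original script this could wrap the existing LWP::Simple call, along the lines of `with_timeout(10, sub { get($line) })`, moving on to the next URL whenever the first return value is false.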

    Just a something something...