in reply to Re: Making script more efficient
in thread Making script more efficient

Thank you! That is so much cleaner and nicer than our original code.

However, when I run the code now it errors out with "400 error URL must be absolute". Did I make a mistake with the URLs?

    print "Enter your search query: ";
    my $search = <STDIN>;
    chomp($search);

    my $google_results;

    my @urls = qq("http://www.google.com/search?q=$search&num=50&hl=en&lr=&safe=off&start=50&sa=N",
    "http://www.google.com/search?q=$search&num=50&hl=en&lr=&safe=off&start=100&sa=N",
    "http://www.google.com/search?q=$search&num=50&hl=en&lr=&safe=off&start=150&sa=N",
    "http://www.google.com/search?q=$search&num=50&hl=en&lr=&safe=off&start=200&sa=N",
    "http://www.google.com/search?q=$search&num=50&hl=en&lr=&safe=off&start=250&sa=N",
    "http://www.google.com/search?q=$search&num=50&hl=en&lr=&safe=off&start=300&sa=N",
    "http://www.google.com/search?q=$search&num=50&hl=en&lr=&safe=off&start=350&sa=N",
    "http://www.google.com/search?q=$search&num=50&hl=en&lr=&safe=off&start=400&sa=N",
    "http://www.google.com/search?q=$search&num=50&hl=en&lr=&safe=off&start=450&sa=N",
    "http://www.google.com/search?q=$search&num=50&hl=en&lr=&safe=off&start=500&sa=N",
    "http://www.google.com/search?q=$search&num=50&hl=en&lr=&safe=off&start=550&sa=N",
    "http://www.google.com/search?q=$search&num=50&hl=en&lr=&safe=off&start=600&sa=N",
    "http://www.google.com/search?q=$search&num=50&hl=en&lr=&safe=off&start=700&sa=N",
    "http://www.google.com/search?q=$search&num=50&hl=en&lr=&safe=off&start=750&sa=N",
    "http://www.google.com/search?q=$search&num=50&hl=en&lr=&safe=off&start=800&sa=N",
    "http://www.google.com/search?q=$search&num=50&hl=en&lr=&safe=off&start=850&sa=N",
    "http://www.google.com/search?q=$search&num=50&hl=en&lr=&safe=off&start=900&sa=N",
    "http://www.google.com/search?q=$search&num=50&hl=en&lr=&safe=off&start=950&sa=N",
    );

    foreach my $url (@urls) {
        my $response = $ua->get( $url );
        unless ($response->is_success) {
            print $response->status_line, $/;
            next;
        }
        $google_results = $response->content;
        &parser;
    }

    sub parser {
        my @links_wanted;
        my @links_found;
        my $parser = HTML::TokeParser->new( \$google_results );
        while ( my $token = $parser->get_tag( 'a' ) ) {
            my $url = $token->[ 1 ]{ href };
            next unless $url =~ m{^https?://};
            push @links_found, $url;
        }

Re^3: Making script more efficient
by dragonchild (Archbishop) on May 26, 2005 at 19:35 UTC
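    Drop the qq from the @urls definition. With qq(...), everything between the parentheses is one double-quoted string, so @urls ends up holding a single element that starts with a literal quote character, and that is not an absolute URL.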

    You can also apply the same techniques to your list of URLs. Factor out the commonalities and program for the differences.
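
To see why the qq matters, here is a minimal sketch that compares the two forms. The shortened URLs and the names @joined and @separate are made up for illustration; only the element count is the point.

```perl
use strict;
use warnings;

my $search = 'perl';

# With qq(...), everything between the parentheses is ONE interpolated
# string, so the array receives a single element that begins with a
# literal double quote.
my @joined = qq("http://www.google.com/search?q=$search&start=50",
"http://www.google.com/search?q=$search&start=100",
);

# Without qq, the parentheses build an ordinary list of two strings.
my @separate = (
    "http://www.google.com/search?q=$search&start=50",
    "http://www.google.com/search?q=$search&start=100",
);

printf "with qq:    %d element(s)\n", scalar @joined;
printf "without qq: %d element(s)\n", scalar @separate;
print  "first character with qq: ", substr($joined[0], 0, 1), "\n";
```

The single element in @joined starts with a quote character rather than http, which would explain the "URL must be absolute" complaint.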


    • In general, if you think something isn't in Perl, try it out, because it usually is. :-)
    • "What is the sound of Perl? Is it not the sound of a wall that people have stopped banging their heads against?"
Re^3: Making script more efficient
by thundergnat (Deacon) on May 26, 2005 at 21:09 UTC

    If it were me, I would generate the URLs with a loop too, since there is an awful lot of duplicated code.

    my @urls;
    my $search = 'whatever';
    for my $index (0..19) {
        push @urls, 'http://www.google.com/search?q=' . $search
            . '&num=50&hl=en&lr=&safe=off&start=' . ($index * 50) . '&sa=N';
    }
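
One thing the loop above does not handle is a search term typed at the prompt that contains spaces or other reserved characters. The following is only a sketch of one way to deal with that using core Perl: escape_query is a hypothetical helper, not part of LWP, and it assumes the input is a plain byte string such as ASCII.

```perl
use strict;
use warnings;

# Hypothetical helper: percent-encode everything except the RFC 3986
# "unreserved" characters. Assumes a byte string (e.g. ASCII input).
sub escape_query {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord $1)/ge;
    return $text;
}

my $search = escape_query('perl monks');

# Same loop as before, but built from the escaped search term.
my @urls;
for my $index (0 .. 19) {
    push @urls, 'http://www.google.com/search?q=' . $search
        . '&num=50&hl=en&lr=&safe=off&start=' . ($index * 50) . '&sa=N';
}

print "$_\n" for @urls[0, 1];
```

If the URI::Escape module is installed, its uri_escape function should do the same job without a hand-rolled regex.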