sulfericacid has asked for the wisdom of the Perl Monks concerning the following question:
Code so far:
#!/usr/bin/perl
use strict;
use LWP::Simple;
use LWP::UserAgent;

$| = 1;    # unbuffered output

my $url = "http://sulfericacid.perlmonk.org";
my $altavista = "http://www.altavista.com/web/results?q=link:$url&kl=XX&search=Search";
my $google = "http://www.google.com/search?hl=en&lr=&ie=ISO-8859-1&q=link%3A$url&btnG=Google+Search";

########################
# Altavista!
########################
my $altavista_content = get($altavista);
my @altavista_lines = split /\n/, $altavista_content;
my $altavista_results;

# Scan each line for the result-count sentence
foreach my $altavista_line (@altavista_lines) {
    $altavista_results = $1 if $altavista_line =~ m/AltaVista found (.*) results/;
}

print "searched: $altavista\n";
print "results: $altavista_results\n";

########################
# Google!
########################
# LWP::Simple's get() takes only a URL, so the custom User-Agent goes through LWP::UserAgent
my $ua = LWP::UserAgent->new(agent => 'Mozilla/4.76 [en] (win-98; U)');
my $google_content = $ua->get($google)->content;
my @google_lines = split /\n/, $google_content;
my $hits;

# Capture the comma-grouped total from the "of about <b>N</b>" summary
foreach my $google_line (@google_lines) {
    if ($google_line =~ /Results <b>\d+<\/b> - <b>\d+<\/b> of about <b>((\d{1,3}\,?)+)<\/b>/) {
        $hits = $1;
    }
}
#Results <b>1</b> - <b>1</b> of <b>1</b>.

print "searched: $google\n";
print "results: $hits\n";
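One possible simplification, offered only as a sketch and not as what any reply in this thread recommended: the split-and-loop step can be skipped by matching each pattern once against the whole page body. The helper names `altavista_hits` and `google_hits` and the sample strings are made up for illustration. The sample markup is an assumption about what the pages looked like, taken from the patterns in the code above. Note that this returns the first match in the page, whereas the loop keeps the last matching line.

```perl
use strict;
use warnings;

# Hypothetical helper: extract the AltaVista result count from a page body.
# Matches against the whole string instead of splitting it into lines.
sub altavista_hits {
    my ($html) = @_;
    return undef unless defined $html;    # a failed fetch yields undef
    return $html =~ /AltaVista found (.*?) results/ ? $1 : undef;
}

# Hypothetical helper: extract the Google "of about N" total from a page body.
sub google_hits {
    my ($html) = @_;
    return undef unless defined $html;
    return $html =~ m{Results <b>\d+</b> - <b>\d+</b> of about <b>((?:\d{1,3},?)+)</b>}
        ? $1
        : undef;
}

# Made-up sample fragments standing in for fetched pages
my $av_sample = "<html>\n<p>AltaVista found 1,234 results</p>\n</html>\n";
my $gg_sample = "<html>\nResults <b>1</b> - <b>10</b> of about <b>5,670</b>.\n</html>\n";

print "altavista: ", altavista_hits($av_sample), "\n";
print "google: ",    google_hits($gg_sample),    "\n";
```

Both helpers return undef when the pattern is absent or the fetch failed, so the caller can tell "no match" apart from a count.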
Replies are listed 'Best First'.

Re: Speeding up HTML parsing
by biosysadmin (Deacon) on Apr 21, 2004 at 02:45 UTC

Re: Speeding up HTML parsing
by TilRMan (Friar) on Apr 21, 2004 at 03:13 UTC

Re: Speeding up HTML parsing
by asdfgroup (Beadle) on Apr 21, 2004 at 10:29 UTC

Re: Speeding up HTML parsing
by Fletch (Bishop) on Apr 21, 2004 at 03:10 UTC
by sulfericacid (Deacon) on Apr 21, 2004 at 03:13 UTC
by Fletch (Bishop) on Apr 21, 2004 at 12:05 UTC

Re: Speeding up HTML parsing
by asdfgroup (Beadle) on Apr 21, 2004 at 10:41 UTC