Re: Regex to match string with numbers with possible comma
by Happy-the-monk (Canon) on Mar 17, 2004 at 15:38 UTC
|
m/AltaVista found (\d+,?\d*) results/;
should match it. See perldoc perlre for what ? and * do. Also have a look at perldoc perlretut.
Sören
Edit:
There seems to be no node for perlretut here yet. Edited to point to perldoc.com. Anyway, perlre is the better match for this question.
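For anyone who wants to see what ? and * do here, a small sketch follows. The sample strings are made up for illustration and are not real AltaVista output. Note that ,? allows at most one comma, so this pattern would not match a count in the millions.

```perl
use strict;
use warnings;

# Sample strings are invented for illustration only.
my @samples = (
    'AltaVista found 0 results',
    'AltaVista found 999 results',
    'AltaVista found 1,000 results',
    'AltaVista found 1,234,567 results',
);

my @captured;
for my $text (@samples) {
    # \d+  one or more digits
    # ,?   an optional single comma
    # \d*  zero or more further digits
    if ($text =~ /AltaVista found (\d+,?\d*) results/) {
        push @captured, $1;
    }
    else {
        push @captured, 'no match';
    }
}

print "$_\n" for @captured;
```

The last sample fails because a second comma follows the digits matched by \d*.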
Re: Regex to match string with numbers with possible comma
by Roy Johnson (Monsignor) on Mar 17, 2004 at 15:55 UTC
|
Is there any reason to worry about what's between "found" and "results"? Unless it might come back with "found pig results" and you don't want to match on that, just:
=~ /found (.*) results/;
See also this node for how to match properly-formatted numbers with commas.
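As a rough sketch of what a stricter match could look like (this is one possible reading of "properly-formatted", not the code from the linked node): one to three leading digits, then groups of a comma and exactly three digits. The helper name result_count and the sample strings are hypothetical.

```perl
use strict;
use warnings;

# Assumed definition of a properly formatted number: 1-3 digits,
# then zero or more groups of ",ddd".
my $number = qr/\d{1,3}(?:,\d{3})*/;

# Hypothetical helper: returns the captured count, or undef if nothing matches.
sub result_count {
    my ($text) = @_;
    return $text =~ /found ($number) results/ ? $1 : undef;
}

# Sample strings are invented for illustration.
for my $text ('found 12,345 results', 'found 1,23 results', 'found pig results') {
    my $count = result_count($text);
    print "$text => ", (defined $count ? $count : 'no match'), "\n";
}
```

This version also rejects an unseparated number such as 12345, so whether it is suitable depends on what the page actually prints.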
The PerlMonk tr/// Advocate
|
|
.* is greedy. Consider the string "found 10 results blah blah found 1,000 results": the capture runs from just after the first "found" to just before the last "results". Generally, .* is a subpattern to be suspicious of; I would implement your idea with something like /found (.{1,12}) results/ or /found (.*?) results/.
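The difference is easy to check with the example string. A minimal sketch:

```perl
use strict;
use warnings;

my $text = 'found 10 results blah blah found 1,000 results';

# Greedy: .* grabs as much as it can, then backs off to the LAST " results".
my ($greedy) = $text =~ /found (.*) results/;

# Non-greedy: .*? grabs as little as it can, stopping at the FIRST " results".
my ($lazy) = $text =~ /found (.*?) results/;

print "greedy: $greedy\n";   # greedy: 10 results blah blah found 1,000
print "lazy:   $lazy\n";     # lazy:   10
```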
Re: Regex to match string with numbers with possible comma
by pboin (Deacon) on Mar 17, 2004 at 15:41 UTC
|
This should do it:
=~ /Altavista found ([\d,]+) results/
|
|
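#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;

$| = 1;    # unbuffered output

my $url = "www.tek-tips.com";
my $altavista = "http://www.altavista.com/web/results?q=url:$url&kl=XX&search=Search";

# get() returns undef if the page could not be fetched
my $content = get($altavista);
die "Could not fetch $altavista\n" unless defined $content;

# Check each line; keep the last count that matches
my $results;
foreach (split /\n/, $content) {
    $results = $1 if m/Altavista found ([\d,]+) results/;
}

print "searched: $altavista\n";
print "results: ", (defined $results ? $results : "none found"), "\n";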
|
|
The 'v' in AltaVista needs to be capitalized, and you should be good to go. (With the exception that the commas aren't stripped.) There's an example of how to do that in this thread by Anonymous Monk Chris.
I don't know what you plan on doing, but I typically strip commas right away -- they're nothing but trouble.
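One common way to strip the commas is tr/,//d on the captured text. This is a sketch under my own assumptions, not necessarily what the other example in this thread does, and the sample string is made up.

```perl
use strict;
use warnings;

my $text = 'AltaVista found 1,234,567 results';   # invented sample

my $count;
if ($text =~ /AltaVista found ([\d,]+) results/) {
    # Copy the capture into $count, then delete every comma from the copy.
    ($count = $1) =~ tr/,//d;
}

print defined $count ? "count: $count\n" : "no match\n";
```

With the commas gone, $count can be used in arithmetic or numeric comparisons without surprises.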
Re: Regex to match string with numbers with possible comma
by pboin (Deacon) on Mar 17, 2004 at 15:45 UTC
|
Actually, one of your assumptions appears to be wrong...
Altavista does not return "Altavista found 0 results". I just checked, and the closest thing to a return message would probably be "We found 0 results." So 'Altavista' should not be part of your regex.
So, you want to be sure to test the zero case specifically, along with a case of fewer than 1,000 and a case of over 1,000.
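A quick way to cover those three cases is a small table of inputs and expected captures. The messages below are hypothetical samples modelled on the "We found 0 results." wording, not captured output, and the pattern simply drops the site name.

```perl
use strict;
use warnings;

# Hypothetical sample messages and the capture expected from each.
my %expected = (
    'We found 0 results.'     => '0',       # zero case
    'We found 999 results.'   => '999',     # fewer than 1,000
    'We found 1,000 results.' => '1,000',   # 1,000 or more (contains a comma)
);

my $failures = 0;
for my $message (sort keys %expected) {
    my ($got) = $message =~ /found ([\d,]+) results/;
    $got = 'no match' unless defined $got;
    my $ok = $got eq $expected{$message};
    $failures++ unless $ok;
    printf "%-26s => %-8s %s\n", $message, $got, $ok ? 'ok' : 'FAILED';
}

print "failures: $failures\n";
```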
Re: Regex to match string with numbers with possible comma
by matija (Priest) on Mar 17, 2004 at 15:39 UTC
|
=~ /AltaVista found (\d+(,\d+)*) results/;
Re: Regex to match string with numbers with possible comma
by Anonymous Monk on Mar 17, 2004 at 15:49 UTC
|
=~ /AltaVista found /(\d+)/ results/;
HTH,
Chris
Re: Regex to match string with numbers with possible comma
by Not_a_Number (Prior) on Mar 17, 2004 at 19:42 UTC
|
A word of warning, probably too late, but possibly important.
None of the above solutions works if AltaVista finds just one result:
AltaVista found 1 result
(No 's' on 'result'...)
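A possible fix, sketched here with one of the patterns from earlier in the thread, is to make the trailing 's' optional with results? so both forms match. The sample strings are made up.

```perl
use strict;
use warnings;

# "results?" matches "result" followed by an optional "s".
my $pattern = qr/AltaVista found ([\d,]+) results?/;

my @counts;
for my $text ('AltaVista found 1 result', 'AltaVista found 2 results') {
    my ($count) = $text =~ $pattern;
    push @counts, defined $count ? $count : 'no match';
    print "$text => $counts[-1]\n";
}
```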
dave