(jeffa) Re: Pulling HTML off another site problem

UPDATED, added sample code instead of just pointing to CPAN.

First pointer: use strict! Then you'll see that what you really want is:

my $data = get . . . .
$data =~  . . . .

instead of assigning the output of get() to $_.

Here, try this:

#!/usr/bin/perl -w

use strict;
use LWP::Simple;

# get() returns undef if the fetch fails
my $data = get("http://www.bloomberg.com/energy/index.html");
die "Couldn't fetch the page\n" unless defined $data;

my ($wanted) = $data =~ /<!-+PETROLEUM-+>\s*(.*)\s*<map\s+name="BbgELogin2">/s;
die "Pattern didn't match\n" unless defined $wanted;

open(FH, '>file.txt') || die $!;   # > creates a new file, >> appends
print FH $wanted;
close FH;    # not really necessary in this simple script

$wanted should have what you want. Use \s instead of a literal space: \s catches newlines and tabs as well. Also, you need the 's' modifier instead of 'm': 's' lets . match newlines, while 'm' only changes where ^ and $ match.
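To see the difference between the two modifiers, here is a small self-contained sketch. The sample string is made up for illustration (it is not the real Bloomberg page), so treat the markup as an assumption:

```perl
#!/usr/bin/perl -w

use strict;

# Made-up stand-in for the fetched page; the real markup may differ.
my $data = qq{<!--PETROLEUM-->\n<table>\n\t<tr><td>crude</td></tr>\n</table>\n<map  name="BbgELogin2">};

# /m only changes what ^ and $ match; '.' still stops at a newline,
# so this match fails on multi-line input.
my ($with_m) = $data =~ /<!-+PETROLEUM-+>\s*(.*)\s*<map\s+name="BbgELogin2">/m;

# /s lets '.' match newlines too, so the capture can span several lines.
my ($with_s) = $data =~ /<!-+PETROLEUM-+>\s*(.*)\s*<map\s+name="BbgELogin2">/s;

print defined $with_m ? "m: matched\n" : "m: no match\n";
print defined $with_s ? "s: matched\n" : "s: no match\n";
```

Note that \s+ also copes with the two spaces inside the made-up <map> tag, which a single literal space would not.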

I recommend you use a parser, such as HTML::Parser, or possibly HTML::TokeParser. It takes a little time to learn the interface to these modules, but that time is well invested, as you will ultimately save more time and hair.
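As a rough idea of what the parser route looks like, here is a minimal HTML::TokeParser sketch. The table markup and the values in it are made up for illustration; in the real script you would feed it the string returned by get() and adjust the tag names to the actual page:

```perl
#!/usr/bin/perl -w

use strict;
use HTML::TokeParser;

# Made-up sample markup, not the real page.
my $html = qq{<table><tr><td>Crude Oil</td><td>27.50</td></tr></table>};

# Passing a reference to a scalar makes the parser read from that string.
my $parser = HTML::TokeParser->new(\$html) or die "Can't parse HTML";

my @cells;
# get_tag('td') skips ahead to the next <td> start tag and
# returns undef once the document is exhausted.
while (my $tag = $parser->get_tag('td')) {
    # Collect the text up to the closing </td>, whitespace trimmed.
    push @cells, $parser->get_trimmed_text('/td');
}

print join(' | ', @cells), "\n";
```

The point of the design is that the parser tracks tags for you, so a change in whitespace or attribute order on the page does not break the extraction the way it can with a regex.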

Jeff

R-R-R--R-R-R--R-R-R--R-R-R--R-R-R--
L-L--L-L--L-L--L-L--L-L--L-L--L-L--


Replies are listed 'Best First'.
Re: (jeffa) Re: Pulling HTML off another site problem by ChemBoy (Priest) on Jun 23, 2001 at 23:58 UTC
Actually, $wanted contains 1 or 0, depending on whether it matched or not... but adding parentheses thusly

my ($wanted) = $data =~ /(<!-+PETROLEUM-+>\s*(.*)\s*<map\s+name="BbgELogin2">)/s;

will fix that.

Update: ChemBoy stupid. ChemBoy not have coffee. Bad ChemBoy! (thanks, cLive ;-); sorry, jeffa!)

If God had meant us to fly, he would never have given us the railroads. --Michael Flanders
Re: (jeffa) Re: Pulling HTML off another site problem by cLive ;-) (Prior) on Jun 24, 2001 at 01:08 UTC
Errr, Jeff did include parentheses, in the middle. cLive ;-)
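
For anyone following the exchange above, the difference comes down to context: the parentheses around $wanted on the left-hand side put the match in list context, so it returns the captures rather than a true/false value. A tiny sketch with a made-up string:

```perl
#!/usr/bin/perl -w

use strict;

my $data = 'before <b>wanted</b> after';   # made-up sample

# Scalar context: the match returns 1 on success (a false value on failure).
my $count = $data =~ /<b>(.*)<\/b>/s;

# List context (note the parentheses around $wanted): the match returns the captures.
my ($wanted) = $data =~ /<b>(.*)<\/b>/s;

print "scalar: $count\n";   # scalar: 1
print "list: $wanted\n";    # list: wanted
```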