Whitey has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to pull HTML from another site and write a selected portion of it to a text file on my server. I cannot get this to work. Could you please help me with this? Thanks a million. Whitey
#!/usr/bin/perl
use LWP::Simple;
use LWP::UserAgent;
$_ = get ("http://www.bloomberg.com/energy/index.html");
$data =~ /<!---------------PETROLEUM-----------------> (.*) <map name="BbgELogin2">/m;
open (FH,"./file.txt") ||die"$!";
print FH $_;
close FH;

Replies are listed 'Best First'.
(jeffa) Re: Pulling HTML off another site problem
by jeffa (Bishop) on Jun 23, 2001 at 22:29 UTC
    UPDATED, added sample code instead of just pointing to CPAN.

    First pointer: use strict! Then you will see that $data is never assigned, and that you really want:

    my $data = get . . . .
    $data =~ . . . .
    instead of assigning the output of get() to $_.

    Here, try this:

    #!/usr/bin/perl -w
    use strict;
    use LWP::Simple;

    my $data = get ("http://www.bloomberg.com/energy/index.html");
    my ($wanted) = $data =~ /<!-+PETROLEUM-+>\s*(.*)\s*<map\s+name="BbgELogin2">/s;

    open (FH,'>file.txt') || die $!;    # > creates a new file, >> appends
    print FH $wanted;
    close FH;    # not really necessary in this simple script

    $wanted should have what you want. Use \s instead of a literal space; \s catches newlines and tabs as well. Also, you need the 's' modifier instead of 'm': /s lets . match newlines, whereas /m only changes where ^ and $ match.
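
    To see the difference between the two modifiers without hitting the network, here is a small sketch. The sample string is made up for illustration (it is not the real Bloomberg page), and the variable names $re, $with_m, $with_s and $flag are hypothetical. It also shows why the parentheses around $wanted matter: they put the match in list context, so the captured text is returned rather than a true/false value.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical stand-in for the fetched page; the wanted block spans two lines.
my $data = "<!-----PETROLEUM----->\n\tCrude 26.50\nHeating Oil 0.72\n"
         . "<map name=\"BbgELogin2\">";

# Same pattern as in the script above, kept in a string so it can be reused.
my $re = '<!-+PETROLEUM-+>\s*(.*)\s*<map\s+name="BbgELogin2">';

# With /m, '.' still stops at a newline, so a multi-line block cannot match.
my ($with_m) = $data =~ /$re/m;

# With /s, '.' matches newlines too, so the capture can span several lines.
my ($with_s) = $data =~ /$re/s;

# No parentheses on the left: scalar context, which only reports success.
my $flag = $data =~ /$re/s;

print defined $with_m ? "m: matched\n" : "m: no match\n";
print "s: [$with_s]\n";
print "scalar context: $flag\n";
```

    Note that the greedy (.*) also swallows any whitespace just before the closing marker, so the capture may end in a newline.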

    I recommend you use a parser, such as HTML::Parser, or possibly HTML::TokeParser. It takes a little time to learn the interface to these modules, but that time is well invested, as you will ultimately save more time and hair.
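
    For what it is worth, a rough sketch of the parser route might look like the following. It assumes HTML::TokeParser is installed, and it runs on a made-up snippet with a simplified marker comment rather than the real page, so treat the markup and the variable names as hypothetical. It collects the original source text of every token between the marker comment and the <map> start tag.

```perl
#!/usr/bin/perl -w
use strict;
use HTML::TokeParser;

# Hypothetical stand-in for the fetched page.
my $html = <<'HTML';
<html><body>
<!-- PETROLEUM -->
<table><tr><td>Crude</td><td>26.50</td></tr></table>
<map name="BbgELogin2"></map>
</body></html>
HTML

# A reference to a scalar is parsed as the literal document.
my $p = HTML::TokeParser->new(\$html) or die "Cannot create parser";

my ($collecting, $wanted) = (0, '');
while (my $token = $p->get_token) {
    my $type = $token->[0];

    # Start collecting after the marker comment.
    if ($type eq 'C' && $token->[1] =~ /PETROLEUM/) {
        $collecting = 1;
        next;
    }

    # Stop at the <map name="BbgELogin2"> start tag.
    if ($type eq 'S' && $token->[1] eq 'map'
        && ($token->[2]{name} || '') eq 'BbgELogin2') {
        last;
    }

    next unless $collecting;

    # The original source text sits at a different index per token type.
    $wanted .= $type eq 'S'                  ? $token->[4]
             : ($type eq 'E' || $type eq 'PI') ? $token->[2]
             :                                   $token->[1];   # T, C, D
}

print $wanted;
```

    This is longer than the regex, but it does not depend on the exact spacing or quoting of the markup.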

    Jeff

    R-R-R--R-R-R--R-R-R--R-R-R--R-R-R--
    L-L--L-L--L-L--L-L--L-L--L-L--L-L--
    

      Actually, $wanted contains 1 or 0, depending if it matched or not... but adding parentheses thusly

      my ($wanted) = $data =~ /(<!-+PETROLEUM-+>\s*(.*)\s*<map\s+name="BbgELogin2">)/s;
      will fix that.

      Update: ChemBoy stupid. ChemBoy not have coffee. Bad ChemBoy! (thanks, cLive ;-); sorry, jeffa!)



      If God had meant us to fly, he would *never* have given us the railroads.
          --Michael Flanders

        Errr,

        Jeff did include parentheses, in the middle.

        cLive ;-)