Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,

I want to open a webpage, grab its contents, and parse through them to find a particular line I am looking for. Please direct me to a module that might help me achieve this.

Thanks

Replies are listed 'Best First'.
Re: opening up a webpage
by jZed (Prior) on Mar 16, 2005 at 23:01 UTC
    LWP will help you get the page, HTML::Parser or another module in the HTML::* hierarchy will help you parse it.
Re: opening up a webpage
by Tanktalus (Canon) on Mar 16, 2005 at 23:01 UTC

    LWP::Simple and HTML::Parser? In the future, you may want to quickly search CPAN - you may be able to get your answer faster for this type of question.

Re: opening up a webpage
by Ovid (Cardinal) on Mar 16, 2005 at 23:14 UTC
    # entered directly into the browser. May contain typos
    # and assumes you have LWP::Simple installed
    # getting all comments to a URL
    use HTML::TokeParser::Simple 3.13;
    my $parser = HTML::TokeParser::Simple->new(url => shift);
    while (my $token = $parser->get_token) {
        print $token->as_is if $token->is_comment;
    }

    Cheers,
    Ovid

    New address of my CGI Course.

Re: opening up a webpage
by chas (Priest) on Mar 17, 2005 at 00:40 UTC
    perl -MLWP::Simple -e "print get(shift)" http://www...
    will actually get the page (!). You can then use pattern matching or some other means of parsing to find a particular line. The HTML tags may help you locate it, so you might not need one of the parser modules; then again, a parser may be appropriate, depending on what you need.
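    As a rough sketch of using the HTML tags as landmarks for matching: the helper name text_inside_tag and the sample page below are made up for illustration, and a regex like this only suits simple, predictable markup. It is not a substitute for a real parser.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text inside the first <$tag>...</$tag> pair,
# or undef if the tag is not found. Not a real HTML parser.
sub text_inside_tag {
    my ($html, $tag) = @_;
    # /i ignores case, /s lets "." match newlines, ".*?" stops at the first close tag
    my ($text) = $html =~ m{<\Q$tag\E\b[^>]*>(.*?)</\Q$tag\E>}is;
    return $text;
}

# Sample content standing in for a page fetched with LWP::Simple's get()
my $page = "<html><head><title>Example page</title></head><body><p>Hi</p></body></html>";
print text_inside_tag($page, 'title'), "\n";
```

    If the markup is nested or irregular, one of the HTML::* parsers is likely the safer choice.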
    chas
    (Update: made command slightly shorter.)
Re: opening up a webpage
by punkish (Priest) on Mar 17, 2005 at 01:23 UTC
    Other monks have already mentioned the appropriate modules. Here is the logic in pseudo-code:
    1. Grab the entire page, usually in a scalar
    2. Extract from the above scalar the "particular" line based on the expected pattern

    A module such as LWP does precisely what is suggested in (1) above. See the cookbook for a trivial example. If your "particular" line pattern is not too tricky, (2) above should also be a trivial regexp exercise.
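    The two steps above might be sketched like this. It is a minimal example, not a definitive implementation: the helper name first_matching_line and the <title> pattern are placeholders, the URL is assumed to come from the command line, and LWP::Simple is assumed to be installed.

```perl
use strict;
use warnings;

# Hypothetical helper for step 2: return the first line of $content
# that matches $pattern, or undef if no line matches.
sub first_matching_line {
    my ($content, $pattern) = @_;
    for my $line (split /\n/, $content) {
        return $line if $line =~ $pattern;
    }
    return;
}

# Step 1 needs network access, so it only runs when a URL is passed in.
if (my $url = shift @ARGV) {
    require LWP::Simple;
    my $page = LWP::Simple::get($url);    # whole page in one scalar, undef on failure
    die "Could not fetch $url\n" unless defined $page;

    # Placeholder pattern: substitute whatever identifies your "particular" line
    my $line = first_matching_line($page, qr/<title>/i);
    print defined $line ? "$line\n" : "No matching line found\n";
}
```

    Save it under any name you like and pass the URL as the only argument.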

    Good luck.

    --

    when small people start casting long shadows, it is time to go to bed