in reply to Re: Multi-Line Regex's
in thread Multi-Line Regex's

Of course the usual caveats about not parsing HTML with regexes still apply :)

I can see that, in general, big chunks of HTML are better handled with tools like HTML::Parser. But in this simple case, where I'm trying to grab one easily delimited paragraph from a specific web page, is there any real advantage to those tools?

Re: Re: Re: Multi-Line Regex's
by davorg (Chancellor) on Sep 18, 2002 at 14:01 UTC

    Well, only the fact that HTML parsers will actually parse the HTML for you - whereas any regex-based solution will only handle a subset of the possible HTML and will prove extremely fragile if the HTML ever changes.
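
    To make the fragility concrete, here is a rough sketch. It is in Python rather than Perl only so that it runs with nothing but the standard library; the same idea applies to HTML::Parser. The two page fragments, the "intro" class and the helper names are all made up for illustration and are not taken from the page being discussed.

```python
import re
from html.parser import HTMLParser

# Hypothetical fragments: the same paragraph before and after a harmless
# markup change (upper-case tag, extra attribute, single quotes).
OLD_HTML = '<div><p class="intro">Hello,\nworld.</p><p>Other text</p></div>'
NEW_HTML = '<div><P id="x" class=\'intro\'>Hello,\nworld.</P><p>Other text</p></div>'

# Regex written against the old markup only. DOTALL lets "." span the newline,
# much like the /s modifier in Perl.
INTRO_RE = re.compile(r'<p class="intro">(.*?)</p>', re.DOTALL)


def grab_with_regex(html):
    """Return the paragraph text, or None if the exact markup is not found."""
    match = INTRO_RE.search(html)
    return match.group(1) if match else None


class IntroGrabber(HTMLParser):
    """Collects the text of the first <p> whose class attribute is 'intro'."""

    def __init__(self):
        super().__init__()
        self.inside = False   # currently inside the wanted paragraph
        self.done = False     # wanted paragraph has been closed
        self.chunks = []      # text pieces seen inside it

    def handle_starttag(self, tag, attrs):
        # The parser lower-cases tag and attribute names and strips the quotes.
        if tag == "p" and not self.done and dict(attrs).get("class") == "intro":
            self.inside = True

    def handle_endtag(self, tag):
        if tag == "p" and self.inside:
            self.inside = False
            self.done = True

    def handle_data(self, data):
        if self.inside:
            self.chunks.append(data)


def grab_with_parser(html):
    """Return the paragraph text, or None if no matching paragraph was closed."""
    parser = IntroGrabber()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks) if parser.done else None
```

    With these definitions, grab_with_regex finds the paragraph in OLD_HTML but returns None for NEW_HTML, while grab_with_parser returns the same text for both. The regex could of course be loosened to cope with this particular change, but each new variation needs another patch.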

    You might like to take a look at the section "How not to parse HTML" in chapter 8 of Data Munging with Perl.

    --
    <http://www.dave.org.uk>

    "The first rule of Perl club is you do not talk about Perl club."
    -- Chip Salzenberg