in reply to Multiple Multiline Regexps?
This might get you started:
    #! /usr/bin/perl
    use strict ;
    use warnings ;
    $|++ ;

    my $data = qq{
    [...]
    <body>
    [...random stuff...]
    <li>headline one</li>
    <br>
    <p>the story</p>
    [...random stuff...]
    <li>headline two</li>
    <br>
    <p>the next story</p>
    [...random stuff...]
    <body>
    } ;

    while ( $data =~ s{<li>(.*?)</li>.*?<p>(.*?)</p>}{}s ) {
        print "Headline: $1\nStory: $2\n\n" ;
    }
    __END__
That is, of course, assuming that <li> and <p> are used only for headlines and stories. IMO, the more restrictive you can make this regexp, the better.
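To illustrate what "more restrictive" could look like, here is a rough sketch. It assumes (and this is only an assumption about the real page) that each headline is followed by nothing but whitespace and a <br> before its <p>, and that neither element contains nested tags or attributes. The sub name extract_stories and the sample data are made up for the example. It also uses m//g rather than s///, so $data is left untouched.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample input modelled on the data above, plus one <li> that is
# NOT a headline (hypothetical, to show the tighter pattern at work).
my $data = <<'HTML';
<body>
<li>headline one</li>
<br>
<p>the story</p>
<li>not a headline</li>
<div>unrelated</div>
<li>headline two</li>
<br>
<p>the next story</p>
</body>
HTML

# Return a list of [headline, story] pairs without modifying the input.
# Assumption: <li>, <br> and <p> appear back to back, separated only by
# whitespace, and the captured text contains no further tags.
sub extract_stories {
    my ($html) = @_;
    my @pairs;
    while ( $html =~ m{<li>([^<]*)</li>\s*<br>\s*<p>([^<]*)</p>}g ) {
        push @pairs, [ $1, $2 ];
    }
    return @pairs;
}

for my $pair ( extract_stories($data) ) {
    print "Headline: $pair->[0]\nStory: $pair->[1]\n\n";
}
```

With the looser .*? pattern, the stray <li> would get paired with the next <p> it can reach; the tighter pattern simply skips it. The price is that it breaks as soon as the markup between the tags changes.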
Update: This is probably better done with a proper parser. I've never used it, but HTML::Parser might be a good option.
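For anyone who wants a starting point, a minimal sketch of the parser approach might look like the following. It uses the event-driven interface of HTML::Parser (api_version 3) and still rests on the same assumption as above: every <li> is a headline and the next <p> is its story. The sub name parse_stories and the sample data are invented for the example; treat it as untested against real pages.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser ();

# Return a list of [headline, story] pairs.
# Assumption: each <li> holds a headline and the next <p> holds its story.
sub parse_stories {
    my ($html) = @_;
    my ( @stories, $in, $headline );
    my $text = '';

    my $parser = HTML::Parser->new(
        api_version => 3,
        # Start collecting text when an interesting element opens.
        start_h => [ sub {
            my ($tag) = @_;
            if ( $tag eq 'li' || $tag eq 'p' ) { $in = $tag; $text = ''; }
        }, 'tagname' ],
        # Text may arrive in several chunks, so accumulate it.
        text_h => [ sub { $text .= $_[0] if defined $in }, 'dtext' ],
        # When the element closes, store the headline or emit the pair.
        end_h => [ sub {
            my ($tag) = @_;
            return unless defined $in && $tag eq $in;
            if ( $tag eq 'li' ) {
                $headline = $text;
            }
            elsif ( defined $headline ) {
                push @stories, [ $headline, $text ];
                undef $headline;
            }
            undef $in;
        }, 'tagname' ],
    );

    $parser->parse($html);
    $parser->eof;
    return @stories;
}

my $data = <<'HTML';
<body>
random stuff
<li>headline one</li>
<br>
<p>the story</p>
random stuff
<li>headline two</li>
<br>
<p>the next story</p>
</body>
HTML

for my $story ( parse_stories($data) ) {
    print "Headline: $story->[0]\nStory: $story->[1]\n\n";
}
```

Unlike the regexp, this copes with attributes on the tags and with entities in the text (dtext hands back decoded text), but the pairing logic is still only as good as the assumption about the markup.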
Re: Re: Multiple Multiline Regexps?
by Bird (Pilgrim) on Jul 25, 2002 at 18:57 UTC