in reply to Regex on HTML across multiple lines with WWW::Mechanize->content()
Have you looked at perlre and what it says about newlines?
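As a minimal sketch of what perlre describes: by default "." does not match a newline, and the /s modifier changes that. The HTML fragment below is a made-up stand-in, since the markup from the original question is not reproduced here.

```perl
use strict;
use warnings;

# Hypothetical HTML fragment (an assumption, not the original poster's markup).
my $html = qq{<td class="Label">Name</td>\n<td>Value</td>};

# Without /s, "." stops at the newline, so the pattern cannot reach the second cell.
my $without_s = $html =~ m{class="Label">.*?</td>.*?<td>(.*?)</td>} ? 1 : 0;

# With /s, "." also matches "\n", so the pattern can span both lines.
my ($value) = $html =~ m{class="Label">.*?</td>.*?<td>(.*?)</td>}s;

print "matched without /s: $without_s\n";
print "captured with /s: ", (defined $value ? $value : '(nothing)'), "\n";
```

The related /m modifier only changes where "^" and "$" match; it does not let "." cross line boundaries.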
Personally, I don't hand-parse HTML with regular expressions anymore. I use HTML::TreeBuilder::XPath together with HTML::Selector::XPath. There is also a TreeBuilder plugin for WWW::Mechanize, WWW::Mechanize::TreeBuilder. A CSS selector for your example could be
td.Label + td
so the code for finding the relevant node(s) would be:
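use HTML::TreeBuilder::XPath;
use HTML::Selector::XPath 'selector_to_xpath';

# $mech->content returns a string, so parse it into a tree first
my $tree = HTML::TreeBuilder::XPath->new_from_content($mech->content);
my $query = selector_to_xpath('td.Label + td');
my @nodes = $tree->findnodes($query);
print $_->as_HTML for @nodes;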