in reply to A line of code matches the question
Hello *2, and welcome to the Monastery!
First, please note that the /g modifier on the first regex (the one in the if statement) has no useful effect, because the match is attempted only once, in scalar context. If there were two or more <div id="724"> elements, only the first would be processed. You can fix this easily by changing the if into a while loop:
    while ($t =~ /<div id="724">(.*?)<\/div>/sg) {
        print "$_\n" for $1 =~ /<p>(.+?)<\/p>/g;
    }
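To make the difference between the if and while forms concrete, here is a small sketch. The sample string and the two sub names are made up for illustration and are not the original poster's data; the inner capture is copied to a lexical first purely as a defensive habit.

```perl
use strict;
use warnings;

# Hypothetical sample input: two <div id="724"> blocks with another div between.
my $sample = <<'EOF';
<div id="724"><p>one</p><p>two</p></div>
<div id="999"><p>skip</p></div>
<div id="724"><p>three</p></div>
EOF

# "if" form: the match is attempted once, so only the first div is examined.
sub paragraphs_if {
    my ($text) = @_;
    my @found;
    if ($text =~ /<div id="724">(.*?)<\/div>/sg) {
        my $body = $1;                        # copy before matching again
        push @found, $body =~ /<p>(.+?)<\/p>/g;
    }
    return @found;
}

# "while" form: /g in scalar context resumes from pos() on each iteration,
# so every matching div is visited in turn.
sub paragraphs_while {
    my ($text) = @_;
    my @found;
    while ($text =~ /<div id="724">(.*?)<\/div>/sg) {
        my $body = $1;
        push @found, $body =~ /<p>(.+?)<\/p>/g;
    }
    return @found;
}

print "if:    @{[ paragraphs_if($sample) ]}\n";
print "while: @{[ paragraphs_while($sample) ]}\n";
```

With this sample, the if form prints "one two" and the while form prints "one two three".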
However, as SuicideJunkie says, you’ll be much better off using a dedicated XML parser. But note that your XML is not well-formed, because the <meta charset="UTF-8"> tag has no corresponding closing tag. When this is fixed, parsing is straightforward:
    use strict;
    use warnings;
    use XML::LibXML;

    my $t = <<'EOF';
    ...
    <meta charset="UTF-8" />
    ...
    EOF

    my $dom = XML::LibXML->load_xml(string => $t);
    print $_->to_literal . "\n" for $dom->findnodes('//div[@id="724"]/p');
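To see why the unclosed <meta> tag matters, here is a minimal sketch. The two one-line documents and the try_parse helper are hypothetical, since the original markup is elided in this thread. The only XML::LibXML calls used are load_xml, findnodes and to_literal, as in the script above.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical stand-ins for the OP's markup, which is elided in this thread.
my $unclosed = '<html><head><meta charset="UTF-8"></head>'
             . '<body><div id="724"><p>x</p></div></body></html>';

# Same document, with the <meta> tag self-closed so the XML is well-formed.
(my $closed = $unclosed) =~ s{<meta charset="UTF-8">}{<meta charset="UTF-8" />};

# Hypothetical helper: returns the DOM, or undef if the parser threw an error.
sub try_parse {
    my ($xml) = @_;
    return scalar eval { XML::LibXML->load_xml(string => $xml) };
}

for my $doc ($unclosed, $closed) {
    my $dom = try_parse($doc);
    if (!$dom) {
        print "not well-formed\n";
        next;
    }
    print $_->to_literal, "\n" for $dom->findnodes('//div[@id="724"]/p');
}
```

load_xml throws an exception on input that is not well-formed, so wrapping it in eval lets a script report the problem rather than die outright.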
Output:
1:59 >perl 1798_SoPW.pl aaa22 22 22 aaa22 aaa22 aafsdfsdfa22 1:59 >
Hope that helps,
Athanasius <°(((>< contra mundum
Iustus alius egestas vitae, eros Piratica,
Re^2: A line of code matches the question
by *2 (Novice) on Aug 10, 2017 at 17:16 UTC
Re^2: A line of code matches the question
by *2 (Novice) on Aug 10, 2017 at 16:55 UTC