Re: A line of code matches the question

Hello *2, and welcome to the Monastery!

First, please note that the /g modifier on the first regex (the one in the if statement) has no useful effect, because the match is evaluated only once, in scalar context. If there were two or more <div id="724"> elements, only the paragraphs in the first would be printed. You can fix this easily by changing the if into a while loop:

while ($t =~ /<div id="724">(.*?)<\/div>/sg)
{
    print "$_\n" for $1 =~ /<p>(.+?)<\/p>/g;
}
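
To see the difference between the two forms in isolation, here is a minimal sketch. The sample string and the two sub names are made up for illustration and are not taken from the original post. It runs the same regex once under if and repeatedly under while, and collects the <p> contents each form finds:

```perl
use strict;
use warnings;

# Hypothetical sample input with two matching <div> elements.
my $html = '<div id="724"><p>one</p></div>'
         . '<div id="724"><p>two</p><p>three</p></div>';

# "if" evaluates the match a single time, so only the first <div> is seen.
sub paragraphs_with_if {
    my ($text) = @_;
    my @found;
    if ($text =~ /<div id="724">(.*?)<\/div>/sg) {
        my $body = $1;    # copy the capture before running another match
        push @found, $body =~ /<p>(.+?)<\/p>/g;
    }
    return @found;
}

# "while" re-evaluates the match; in scalar context /g resumes from the
# position where the previous match ended, so every <div> is visited.
sub paragraphs_with_while {
    my ($text) = @_;
    my @found;
    while ($text =~ /<div id="724">(.*?)<\/div>/sg) {
        my $body = $1;
        push @found, $body =~ /<p>(.+?)<\/p>/g;
    }
    return @found;
}

my @if_result    = paragraphs_with_if($html);
my @while_result = paragraphs_with_while($html);

print "if:    @if_result\n";
print "while: @while_result\n";
```

The inner match is in list context, where /g returns all captures at once, so it needs no loop of its own.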

However, as SuicideJunkie says, you’ll be much better off using a dedicated XML parser. But note that your XML is not well-formed, because the <meta charset="UTF-8"> tag has no corresponding closing tag. When this is fixed, parsing is straightforward:

use strict;
use warnings;
use XML::LibXML;

my $t = <<'EOF';
...
<meta charset="UTF-8" />
...
EOF

my $dom = XML::LibXML->load_xml(string => $t);

print $_->to_literal . "\n" for $dom->findnodes('//div[@id="724"]/p');

Output:

 1:59 >perl 1798_SoPW.pl
aaa22
22
22
aaa22
aaa22
aafsdfsdfa22

 1:59 >
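
If correcting the markup by hand is not practical, one possible alternative is to let XML::LibXML parse the input as HTML rather than XML. The sketch below is only an illustration: the sample document is invented (the original input is elided above), and it assumes that the HTML parser's recovery mode is acceptable for your data. It uses load_html with the recover option, which asks libxml2 to carry on past errors instead of dying:

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample input: the unclosed <meta> tag is fine in HTML
# but makes the document ill-formed as XML.
my $html = <<'EOF';
<html>
<head><meta charset="UTF-8"></head>
<body>
<div id="724"><p>aaa22</p><p>22</p></div>
<div id="725"><p>other</p></div>
</body>
</html>
EOF

# load_html uses libxml2's HTML parser; recover => 2 asks it to continue
# past parse errors without printing them.
my $dom = XML::LibXML->load_html(string => $html, recover => 2);

# Same XPath expression as in the XML version above.
my @paragraphs = map { $_->to_literal } $dom->findnodes('//div[@id="724"]/p');

print "$_\n" for @paragraphs;
```

The trade-off is that a recovering parser guesses at the intended structure, so badly broken markup may produce a tree that differs from what you expect.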

Hope that helps,

Athanasius <°(((>< contra mundum Iustus alius egestas vitae, eros Piratica,


Replies are listed 'Best First'.
Re^2: A line of code matches the question by *2 (Novice) on Aug 10, 2017 at 17:16 UTC
I have just done some testing, and I found that XML::LibXML is too strict about the HTML being well-formed; it does not seem to tolerate any mistakes. I don't think it is quite suitable for this task, and a regular expression may be better suited to my current job. :)
Re^2: A line of code matches the question by *2 (Novice) on Aug 10, 2017 at 16:55 UTC
Wow, XML::LibXML is powerful! It solved my problem. Your carefulness is also something I should learn from!