ada4u has asked for the wisdom of the Perl Monks concerning the following question:

Regards, I've got some issues with this little script:

    #!/usr/bin/perl
    use XML::LibXML;
    use strict;

    my $file = $ARGV[0];
    my $parser = XML::LibXML->new;
    my $dom = $parser->parse_file($file) or die;

    sub getInside{
        my $node = shift;
        if ($node->nodeType == 1){
            getInside($node->getFirstChild);
        }
        if ($node->nodeType == 3){
            print "<p>" . $node->data ."</p>\n" ;
        }
    }

    my @titles = $dom->getElementsByTagName("ParagraphStyleRange");
    foreach my $node (@titles){
        if ($node->getAttributeNode("AppliedParagraphStyle")->getValue =~ /BODY TEXT/){
            getInside($node);
        }
    }

to get the text in the first Content tag from this XML:

    <ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/BODY TEXT S/S">
      <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]" Tracking="-10">
        <Content>some text in this container.</Content>
        <Br/>
      </CharacterStyleRange>
      <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]" Tracking="-10">
        <Content>this is not to be printed</Content>
        <Br/>
      </CharacterStyleRange>
    </ParagraphStyleRange>

Any pointers will be appreciated here, thanks.

Replies are listed 'Best First'.
Re: missperception of XML::LibXML
by ikegami (Patriarch) on Oct 17, 2011 at 19:31 UTC
    #!/usr/bin/perl
    use strict;
    use warnings;
    use feature qw( say );
    use XML::LibXML qw( );

    my $file = $ARGV[0];
    my $parser = XML::LibXML->new();
    my $dom = $parser->parse_file($file);
    my $root = $dom->documentElement();

    for my $psr_node ($root->findnodes('//ParagraphStyleRange')) {
        if ($psr_node->getAttribute('AppliedParagraphStyle') =~ /BODY TEXT/) {
            # Keep only the first Content node found under this paragraph range.
            my ($content_node) = $psr_node->findnodes('CharacterStyleRange/Content')
                or die;
            say $content_node->textContent();
        }
    }
Re: missperception of XML::LibXML
by runrig (Abbot) on Oct 17, 2011 at 19:58 UTC
    Note that 'or die' is not necessary on parse_file(): XML::LibXML throws an exception (dies) when it is asked to parse XML that is not well-formed. Also, you might be able to get your desired nodes with just XPath (modifying ikegami's answer, but not tested):
    # q{} keeps Perl from interpolating @AppliedParagraphStyle as an array.
    for my $node ($root->findnodes(q{//ParagraphStyleRange[contains(@AppliedParagraphStyle, 'BODY TEXT')]/CharacterStyleRange/Content})) {
        say $node->textContent();
    }
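
For anyone who wants to try the XPath-only approach without a file on disk, here is a minimal, self-contained sketch. It assumes the sample XML from the question is the whole document, parses it from a string with load_xml, and uses a helper named content_texts that is made up for this illustration (it is not part of XML::LibXML). It also shows one way to keep only the first CharacterStyleRange of each matching paragraph, which seems to be what the question wants given the "this is not to be printed" line.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Sample document from the question, inlined so the sketch runs on its own.
# The single-quoted heredoc keeps $ID from being interpolated.
my $xml = <<'XML';
<ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/BODY TEXT S/S">
  <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]" Tracking="-10">
    <Content>some text in this container.</Content>
    <Br/>
  </CharacterStyleRange>
  <CharacterStyleRange AppliedCharacterStyle="CharacterStyle/$ID/[No character style]" Tracking="-10">
    <Content>this is not to be printed</Content>
    <Br/>
  </CharacterStyleRange>
</ParagraphStyleRange>
XML

# Hypothetical helper (not an XML::LibXML method): return the text of
# every node matched by the given XPath expression.
sub content_texts {
    my ($doc, $xpath) = @_;
    return map { $_->textContent } $doc->findnodes($xpath);
}

my $doc = XML::LibXML->load_xml(string => $xml);

# Paragraph ranges whose style attribute contains 'BODY TEXT'.
my $base = q{//ParagraphStyleRange[contains(@AppliedParagraphStyle, 'BODY TEXT')]};

# Every Content node under a matching paragraph range.
my @all = content_texts($doc, $base . '/CharacterStyleRange/Content');

# Only the Content of the first CharacterStyleRange in each matching range.
my @first = content_texts($doc, $base . '/CharacterStyleRange[1]/Content');

print "<p>$_</p>\n" for @first;
```

With the sample above, @all holds both Content strings while @first holds only "some text in this container.", so the script prints a single paragraph. Whether the [1] predicate is the right filter for a real file is an assumption; it depends on how the document is actually laid out.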