in reply to getting the txt of terminal nodes only XML::TWIG

It would help to see some of your code and how you are approaching this problem, particularly as I believe this could be a very easy problem to fix. In the section of your code where you print node content, you can restrict the output to terminal nodes only by including a conditional similar to the following:

print $node->trimmed_text, "\n" unless $node->children_count;

This code uses the children_count method of the element object, which returns the number of child nodes of the given node, as the condition for output. See the XML::Twig documentation for further information.
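For a fuller picture, here is a minimal sketch of how that conditional might sit inside a walk over the whole tree. It rests on a few assumptions: the sample XML is hypothetical (I have not seen your input file), and terminal_texts is a helper name made up for illustration. Since XML::Twig keeps text in '#PCDATA' child elements, the sketch passes the '#ELT' condition to children_count and descendants so that only real elements are counted and visited.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper: collect the trimmed text of every terminal element,
# i.e. every real element that has no real element children.
# '#ELT' restricts both calls to real elements and skips '#PCDATA' text nodes.
sub terminal_texts {
    my ($root) = @_;
    my @texts;
    foreach my $node ( $root, $root->descendants('#ELT') ) {
        next if $node->children_count('#ELT');    # has element children
        push @texts, $node->trimmed_text;
    }
    return @texts;
}

# Hypothetical sample document, for illustration only.
my $xml = '<message><type>perlquestion</type>'
        . '<author>Isanchez</author><body>Hi</body></message>';

my $twig = XML::Twig->new;
$twig->parse($xml);

# One terminal node per line, so the text of sibling nodes stays separate.
print "$_\n" for terminal_texts( $twig->root );
```

Printing each entry followed by a newline keeps the text of sibling nodes apart. For a real file, the parse call could be swapped for parsefile.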

 

perl -le 'print+unpack("N",pack("B32","00000000000000000000001001010101"))'

Re: Re: getting the txt of terminal nodes only XML::TWIG
by Isanchez (Acolyte) on May 01, 2003 at 19:17 UTC
    Rob, thanks a lot for the response. Here is some code that actually gets the text out of terminal nodes only, but it prints the text of sister nodes together, which makes it unusable. Can you please point me toward a solution? Thanks a lot.

    A sample input file with the message I sent is below. As you can see when you run it, the text of the sister nodes "...thanks" and "reputation" is printed together. I tried your line of code, but I can't make it work. Thanks again, Ivo

    #!/bin/perl -w
    use strict;
    use XML::Twig;
    my $twig = XML::Twig->new();
    my $file = "message.xml";
    $twig->parsefile($file);
    my $root = $twig->root;
    my @nodes = $root->children;
    foreach my $node (@nodes) {
        my $content = $node->text;
        print "$content ";
    }
    Output:

    perlquestion Isanchez Hi, I have to recursively go over xml files (that look very differently from each other) in a folder and collect every content for every tag. I have tried with TWig code but it doesn't work because it grabs all tags including parent tags and then prints first the contents that belong to the parents i.e. all and then the contents again but this time for each doughter node. Can any wise monk give some idea of what to do ? thanks,"reputation"

    edited: Thu May 1 21:28:03 2003 by jeffa - code tags, formatting