andrewheiss has asked for the wisdom of the Perl Monks concerning the following question:

I'm using XPath to parse an XML file of blog posts and comments. Here's an example of the file structure:

<entry>
  <title>Title of the post</title>
  <content>Content of the post</content>
  <category term="blog"/>
  <category term="label"/>
  <category term="another label"/>
  <category term="blah"/>
  ...
</entry>

The first category element indicates the type of entry: whether the entry is a blog post or a comment. Any subsequent category elements are the blog post's tags.

I can easily get the first instance of category like so:

foreach my $entry (reverse($xc->findnodes('//post:entry'))) {
    my $type = $xc->findvalue('./post:category[1]/@term', $entry);
    ...
}

And I can get the subsequent values by referencing post:category[2]/@term, or any other number.

I can even get all subsequent values using this xpath:

my $category = $xc->findvalue('./post:category[position()>1]/@term', $entry);

Unfortunately, though, when I use position()>1 it sticks all the values in one variable, like "labelanother labelblah". What's the best way to keep the values separate when retrieving them with XPath?

Re: Loop through multiple elements with same name in XPath
by ikegami (Patriarch) on May 30, 2009 at 16:58 UTC
    You need findnodes if you want to return multiple results:

    for my $attr_node ( reverse $xc->findnodes('//post:entry/post:category/@term') ) {
        my $attr_val = $attr_node->getValue();
        ...
    }

    Or maybe you want

    for my $entry_node ( reverse $xc->findnodes('//post:entry') ) {
        for my $attr_node ( $xc->findnodes('post:category/@term', $entry_node) ) {
            my $attr_val = $attr_node->getValue();
            ...
        }
        ...
    }
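
For anyone who wants to run the nested-loop idea end to end, here is a minimal sketch. It assumes $xc is an XML::LibXML::XPathContext and that the post: prefix is bound to the Atom namespace; neither is stated in the question, and the sample feed and the @entries array are made up for illustration. It takes the first term as the entry type and keeps the rest as separate tags.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Sample data modelled on the question. The Atom namespace URI is an
# assumption; substitute whatever URI the real feed declares.
my $xml = <<'XML';
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>Title of the post</title>
    <content>Content of the post</content>
    <category term="blog"/>
    <category term="label"/>
    <category term="another label"/>
    <category term="blah"/>
  </entry>
</feed>
XML

my $doc = XML::LibXML->load_xml(string => $xml);
my $xc  = XML::LibXML::XPathContext->new($doc);
$xc->registerNs(post => 'http://www.w3.org/2005/Atom');

my @entries;    # hypothetical container, one hash per entry
for my $entry ($xc->findnodes('//post:entry')) {
    # In list context findnodes returns one attribute node per match,
    # in document order, so the first value is the entry type.
    my ($type, @tags) = map { $_->getValue }
        $xc->findnodes('./post:category/@term', $entry);

    push @entries, { type => $type, tags => \@tags };

    print "type: $type\n";
    print "tag:  $_\n" for @tags;
}
```

Splitting the list in Perl avoids a second XPath query with position()>1, though either approach should keep the values separate.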
Re: Loop through multiple elements with same name in XPath
by mirod (Canon) on May 30, 2009 at 09:01 UTC

    Use findnodes and loop through the nodes. Something like (untested):

    my $category = join " ", map { $_->getNodeText } $xc->findnodes('./post:category[position()>1]/@term', $entry);

    Updated: really use findnodes (thanks ikegami)
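
To make the difference concrete, here is a small side-by-side sketch of findvalue and findnodes on the same expression. It assumes XML::LibXML with an XPathContext and an Atom namespace bound to post:, which the question does not confirm, and it uses getValue on the attribute nodes as in ikegami's reply. As I understand it, findvalue returns the string value of the whole node set, which is why the question saw the terms run together.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Minimal stand-in for one entry; the namespace URI is an assumption.
my $doc = XML::LibXML->load_xml(string => <<'XML');
<entry xmlns="http://www.w3.org/2005/Atom">
  <category term="blog"/>
  <category term="label"/>
  <category term="another label"/>
  <category term="blah"/>
</entry>
XML

my $xc = XML::LibXML::XPathContext->new($doc);
$xc->registerNs(post => 'http://www.w3.org/2005/Atom');
my ($entry) = $xc->findnodes('/post:entry');

my $xpath = './post:category[position()>1]/@term';

# findvalue collapses every matched node into a single string.
my $joined = $xc->findvalue($xpath, $entry);

# findnodes keeps one node per match, so the values stay separate.
my @labels = map { $_->getValue } $xc->findnodes($xpath, $entry);

print "findvalue: $joined\n";
print "findnodes: ", join(' | ', @labels), "\n";
```

Keeping the values in an array also sidesteps the ambiguity of joining on a space when a label such as "another label" contains one itself.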

Re: Loop through multiple elements with same name in XPath
by Anonymous Monk on May 30, 2009 at 09:11 UTC
Re: Loop through multiple elements with same name in XPath
by Jenda (Abbot) on May 30, 2009 at 20:51 UTC

    I know I'm getting tiresome ...

    use XML::Rules;

    my $parser = XML::Rules->new(
        rules => {
            _default => 'content',
            category => 'content array',
            entry    => 'array no content',
            post     => 'pass no content',
        }
    );
    my $data = $parser->parse($the_xml);
    foreach my $entry (@$data) {
        print "Title: $entry->{title}\nContent: $entry->{content}\n";
        if ($entry->{category}[0] eq 'blog') {
            print "It's a blog post\n\n";
        }
        ...
    }
    or
    use XML::Rules;

    my $parser = XML::Rules->new(
        rules => {
            _default => 'content',
            category => 'content array',
            entry    => sub {
                my ($tag, $attr) = @_;
                delete $attr->{_content};
                # The first category is the entry type; the rest stay as labels
                $attr->{type} = shift(@{$attr->{category}});
                return '@entry' => $attr;
            },
            post     => 'pass no content',
        }
    );
    my $data = $parser->parse($the_xml);
    foreach my $entry (@$data) {
        print "Title: $entry->{title}\nContent: $entry->{content}\n";
        if ($entry->{type} eq 'blog') {
            print "It's a blog post with labels: ".join(', ', @{$entry->{category}})."\n\n";
        }
        ...
    }
    Or, if you like, you can handle each <entry> within the sub specified in the rules and then discard the value to save memory.

    Jenda
    Enoch was right!
    Enjoy the last years of Rome.