Loop through multiple elements with same name in XPath

andrewheiss has asked for the wisdom of the Perl Monks concerning the following question:

I'm using XPath to parse an XML file of blog posts and comments. Here's an example of the file structure:

<entry>
  <title>Title of the post</title>
  <content>Content of the post</content>
  <category term="blog"/>
  <category term="label"/>
  <category term="another label"/>
  <category term="blah"/>
  ...
</entry>

The first category element indicates the type of entry: whether it is a blog post or a comment. Any subsequent category elements are the blog post's tags.

I can easily get the first instance of category like so:

foreach my $entry (reverse($xc->findnodes('//post:entry'))) {
    my $type =  $xc->findvalue('./post:category[1]/@term', $entry);
    ...
}

And I can get the subsequent values by referencing post:category[2]/@term, or any other number.

I can even get all subsequent values using this xpath:

my $category = $xc->findvalue('./post:category[position()>1]/@term', $entry);

Unfortunately, though, when I use position()>1 it sticks all the values into one variable, like "labelanother labelblah". What's the best way to keep the values separate when retrieving them with XPath?


Replies are listed 'Best First'.
Re: Loop through multiple elements with same name in XPath by ikegami (Patriarch) on May 30, 2009 at 16:58 UTC
You need `findnodes` if you want to return multiple results:

for my $attr_node ( reverse $xc->findnodes('//post:entry/post:category/@term') ) {
    my $attr_val = $attr_node->getValue();
    ...
}

Or maybe you want:

for my $entry_node ( reverse $xc->findnodes('//post:entry') ) {
    for my $attr_node ( $xc->findnodes('post:category/@term', $entry_node) ) {
        my $attr_val = $attr_node->getValue();
        ...
    }
    ...
}
Re: Loop through multiple elements with same name in XPath by mirod (Canon) on May 30, 2009 at 09:01 UTC
Use `findnodes` and loop through the nodes. Something like (untested):

my $category = join " ", map { $_->getNodeText } $xc->findnodes('./post:category[position()>1]/@term', $entry);

Update: really use findnodes (thanks ikegami).
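To keep the values separate instead of joining them, the list returned by `findnodes` can be unpacked directly into a type and a list of tags. The following is only a minimal sketch, assuming `$xc` is an XML::LibXML::XPathContext; the wrapping `<feed>` element, the Atom namespace URI bound to the `post:` prefix, and the `@entries` structure are illustrative assumptions, not taken from the original question.

```perl
use strict;
use warnings;
use XML::LibXML;

# Assumed sample document; the namespace URI is a guess at what "post:" maps to.
my $xml = <<'XML';
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>Title of the post</title>
    <content>Content of the post</content>
    <category term="blog"/>
    <category term="label"/>
    <category term="another label"/>
    <category term="blah"/>
  </entry>
</feed>
XML

my $doc = XML::LibXML->load_xml(string => $xml);
my $xc  = XML::LibXML::XPathContext->new($doc);
$xc->registerNs(post => 'http://www.w3.org/2005/Atom');

my @entries;    # hypothetical structure: one hash per entry
for my $entry ($xc->findnodes('//post:entry')) {
    # findnodes yields one attribute node per match, in document order;
    # the first term is the entry type, the rest are the tags.
    my ($type, @tags) = map { $_->value }
        $xc->findnodes('./post:category/@term', $entry);

    push @entries, { type => $type, tags => \@tags };
    print "type: $type; tags: ", join(', ', @tags), "\n";
}
```

Because the tags stay in an array, they can be joined, counted, or filtered later without re-parsing a concatenated string.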
Re: Loop through multiple elements with same name in XPath by Anonymous Monk on May 30, 2009 at 09:11 UTC
"Unfortunately, though, when I use position()>1 it sticks all the values in one variable"

See XML::XPath findvalue($path, [$context]): findvalue always returns a string (XML::XPath::Literal).
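The difference between the two calls can be seen side by side. This is a rough sketch that assumes XML::LibXML rather than XML::XPath (its `findvalue` likewise returns a single string); the sample document and the namespace URI are assumptions made for illustration.

```perl
use strict;
use warnings;
use XML::LibXML;

# Assumed sample entry; the namespace URI bound to "post:" is illustrative.
my $doc = XML::LibXML->load_xml(string => <<'XML');
<entry xmlns="http://www.w3.org/2005/Atom">
  <category term="blog"/>
  <category term="label"/>
  <category term="another label"/>
  <category term="blah"/>
</entry>
XML

my $xc = XML::LibXML::XPathContext->new($doc);
$xc->registerNs(post => 'http://www.w3.org/2005/Atom');

my $path = '/post:entry/post:category[position()>1]/@term';

# findvalue collapses the whole node set into one string.
my $joined = $xc->findvalue($path);

# findnodes keeps one node per match, so the values stay separate.
my @separate = map { $_->value } $xc->findnodes($path);

print "findvalue: $joined\n";
print "findnodes: $_\n" for @separate;
```

With `findvalue` the boundaries between the attribute values are lost, which is the behaviour described in the question; with `findnodes` each value arrives as its own list element.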
Re: Loop through multiple elements with same name in XPath by Jenda (Abbot) on May 30, 2009 at 20:51 UTC
I know I'm getting tiresome ...

use XML::Rules;

my $parser = XML::Rules->new(
    rules => {
        _default => 'content',
        category => 'content array',
        entry    => 'array no content',
        post     => 'pass no content',
    }
);
my $data = $parser->parse($the_xml);
foreach my $entry (@$data) {
    print "Title: $entry->{title}\nContent: $entry->{content}\n";
    if ($entry->{category}[0] eq 'blog') {
        print "It's a blog post\n\n";
    }
    ...

or

my $parser = XML::Rules->new(
    rules => {
        _default => 'content',
        category => 'content array',
        entry    => sub {
            my ($tag, $attr) = @_;
            delete $attr->{_content};
            # The first category is the entry type; the rest stay as labels.
            $attr->{type} = shift(@{$attr->{category}});
            return '@entry' => $attr;
        },
        post => 'pass no content',
    }
);
my $data = $parser->parse($the_xml);
foreach my $entry (@$data) {
    print "Title: $entry->{title}\nContent: $entry->{content}\n";
    if ($entry->{type} eq 'blog') {
        print "It's a blog post with labels: " . join(', ', @{$entry->{category}}) . "\n\n";
    }
    ...

Or, if you like, you can handle each <entry> within the sub specified in the rules and then discard the value, which saves memory.

Jenda
Enoch was right!
Enjoy the last years of Rome.