jjap has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I am navigating a nested XML data structure and, at some point, I want to retrieve the value of a tag that has a specific attribute. I cannot figure out why I am not getting anything from the minimal reproducible example below (it reports 0 entries).

use strict;
use XML::LibXML;

my $parser = XML::LibXML->new;
my $doc = $parser->parse_file("minimal.xml");
my @vol = $doc->findnodes(q {Volume[@VolumeCategory="L"]}); # also tried various quoting scheme...
my $tmp = scalar(@vol);
print "Number of entries: $tmp \n"; # 0

minimal.xml
<root>
  <ProductKey>99</ProductKey>
  <Volume VolumeCategory="L" MeasurementCategory="Real">0.063</Volume>
  <Volume VolumeCategory="cuft" MeasurementCategory="Real">2.2</Volume>
</root>

Any hints would be greatly appreciated!

Update: Thanks to both Ikegami and Toolic for great pointers.
I also made some spelling edits.

Update 2: It turns out my problem was related to namespaces (in my large document, not the toy example I made for this post). As alluded to in numerous other postings, XML::LibXML::XPathContext offers the registerNs method, which then made the XPath expressions find everything I needed. Thanks to all for your input, guidance and alternate solutions.
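
For readers who hit the same wall, here is a minimal sketch of the namespace fix described in Update 2. The namespaced variant of minimal.xml, the namespace URI http://example.com/ns and the prefix p are all assumptions made up for illustration; the original large document was not posted.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXML::XPathContext;

# Hypothetical namespaced variant of minimal.xml (the URI is invented).
my $xml = <<'XML';
<root xmlns="http://example.com/ns">
  <ProductKey>99</ProductKey>
  <Volume VolumeCategory="L" MeasurementCategory="Real">0.063</Volume>
  <Volume VolumeCategory="cuft" MeasurementCategory="Real">2.2</Volume>
</root>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

# An unprefixed name in XPath only matches elements in no namespace,
# so this finds nothing in the namespaced document.
my @plain = $doc->findnodes('//Volume[@VolumeCategory="L"]');

# Bind a prefix of our choosing to the document's namespace URI,
# then use that prefix in the expression.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(p => 'http://example.com/ns');
my @with_ns = $xpc->findnodes('//p:Volume[@VolumeCategory="L"]');

printf "Without registerNs: %d\n", scalar @plain;
printf "With registerNs:    %d\n", scalar @with_ns;
```

The prefix used in the XPath expression does not have to match anything in the document; only the namespace URI has to agree. The attributes need no prefix here because unprefixed attributes are not in any namespace.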

Replies are listed 'Best First'.
Re: Finding node with attribute XML::LibXML
by ikegami (Patriarch) on Sep 30, 2014 at 17:51 UTC
    Volume[@VolumeCategory="L"]
    relative to the root of the document (/) means
    /Volume[@VolumeCategory="L"]

    They fail to find a match since there's no element named Volume at the root of the document.

    /root/Volume[@VolumeCategory="L"]

    would find the node in question, and so would its relative equivalent

    root/Volume[@VolumeCategory="L"]

    It's just like directories.

    $ cd /
    $ ls -d ikegami
    ls: cannot access ikegami: No such file or directory
    $ ls -d /ikegami
    ls: cannot access /ikegami: No such file or directory
    $ ls -d /home/ikegami
    /home/ikegami
    $ ls -d home/ikegami
    home/ikegami

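
    The directory analogy above can be checked directly against the sample data. The following is only a sketch: it inlines minimal.xml as a string so it runs on its own, and the %count hash and @from_root array are names introduced here for illustration.

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML->load_xml(string => <<'XML');
<root>
  <ProductKey>99</ProductKey>
  <Volume VolumeCategory="L" MeasurementCategory="Real">0.063</Volume>
  <Volume VolumeCategory="cuft" MeasurementCategory="Real">2.2</Volume>
</root>
XML

# Evaluate each expression with the document itself as the context node.
my %count;
for my $xpath (
    'Volume[@VolumeCategory="L"]',        # relative: looks for Volume directly under /
    '/root/Volume[@VolumeCategory="L"]',  # absolute path through root
    'root/Volume[@VolumeCategory="L"]',   # relative equivalent of the above
) {
    my @nodes = $doc->findnodes($xpath);
    $count{$xpath} = scalar @nodes;
    printf "%-38s %d\n", $xpath, $count{$xpath};
}

# The original relative expression does match once the context node
# is the root element rather than the document.
my @from_root = $doc->documentElement->findnodes('Volume[@VolumeCategory="L"]');
printf "%-38s %d\n", 'from documentElement', scalar @from_root;
```

    The last call illustrates that the expression in the question was not wrong in itself; it was evaluated one level too high.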
      Thanks for the reply.

      I did muff my example, as I had been using getElementsByTagName, which spared me the need to spell out the full path, if I may say.

      Should I understand that that method does not support attributes?
        If you want to search the whole document, you can use
        /descendant::Volume[@VolumeCategory="L"]
        which can be abbreviated to
        //Volume[@VolumeCategory="L"]

        Should I understand that that method does not support attributes?

        If you're asking if getElementsByTagName can filter the returned nodes to retain only those with a specific attribute value, then the answer is no. The only argument it takes is a tag name. You could do

        my @matching =
            grep {
                my $vc = $_->getAttribute('VolumeCategory');
                defined($vc) && $vc eq "L"
            }
            $doc->getElementsByTagName('Volume');

        but the aforementioned

        my @matching = $doc->findnodes('//Volume[@VolumeCategory="L"]');

        is much simpler.

Re: Finding node with attribute XML::LibXML
by toolic (Bishop) on Sep 30, 2014 at 17:59 UTC
Re: Finding node with attribute XML::LibXML
by codiac (Beadle) on Oct 01, 2014 at 01:25 UTC
    This is not a LibXML solution, but I thought I'd paste it in case anyone prefers a more Perlish syntax for munging XML.
    use strict;
    use warnings;
    use XML::TreeBuilder;

    my $doc = XML::TreeBuilder->new();
    $doc->parse_file("minimal.xml");
    my @vol = $doc->look_down(_tag => 'Volume', VolumeCategory => 'L'); # also tried various quoting scheme...
    my $tmp = scalar(@vol);
    print "Number of entries: $tmp \n"; # 1

      ...and this is an XML::Twig solution ;=)
      use warnings;
      use strict;
      use XML::Twig;

      my $xml=<<'XML';
      <root>
        <ProductKey>99</ProductKey>
        <Volume VolumeCategory="L" MeasurementCategory="Real">0.063</Volume>
        <Volume VolumeCategory="L" MeasurementCategory="Real">0.098989</Volume>
        <Volume VolumeCategory="cuft" MeasurementCategory="Real">2.2</Volume>
      </root>
      XML

      my @vol_cat_L;
      my $twig = XML::Twig->new(
          twig_handlers => {
              '/root/Volume[@VolumeCategory="L"]' => sub { push @vol_cat_L, $_ },
          },
      );
      $twig->parse($xml);
      print "Number of L volume category: ", scalar @vol_cat_L, "\n";

      ##OUTPUT (note the line in example data added by me):
      Number of L volume category: 2

      HtH
      L*
      There are no rules, there are no thumbs..
      Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
Re: Finding node with attribute XML::LibXML
by stylechief (Beadle) on Oct 01, 2014 at 00:58 UTC
    Another method is to form queries targeting the item of interest, parse, then use "findvalue":
    my $xquery  = '//metadata/audience/@name'; # find the first <audience name="xyz">
    my $xquery2 = "//othermeta[\@name = 'plugin.id']/\@content"; # find an othermeta with a name = plugin.id and @content
    my $xquery3 = '//@xml:lang'; # find first @xml:lang

    my $xparser = XML::LibXML->new(); # create new parser object
    eval { $xmldoc = $xparser->parse_file($file) }; # trap parsing errors that can't be suppressed normally, like &nbsp; entities
    if ($@){
        print "Problems! \n The error(s):\n$@\n";
    }
    if ($pluginname = $xmldoc->findvalue( $xquery2 )){ # returns value of plugin name
        print " Plugin name found: $pluginname\n";
    }
    ETC...
    SC
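
    Applied to the data from the original question, the findvalue approach might look like the sketch below. It inlines minimal.xml as a string so it is self-contained, and the "gal" category is a made-up value used only to show what happens when nothing matches.

```perl
use strict;
use warnings;
use XML::LibXML;

my $doc = XML::LibXML->load_xml(string => <<'XML');
<root>
  <ProductKey>99</ProductKey>
  <Volume VolumeCategory="L" MeasurementCategory="Real">0.063</Volume>
  <Volume VolumeCategory="cuft" MeasurementCategory="Real">2.2</Volume>
</root>
XML

# findvalue returns the text of the match instead of a node object.
my $litres = $doc->findvalue('/root/Volume[@VolumeCategory="L"]');
print "Volume in L: $litres\n";

# It works for attributes too.
my $unit = $doc->findvalue('/root/Volume[2]/@VolumeCategory');
print "Second volume unit: $unit\n";

# No match yields an empty string rather than an error, which is why
# the result can be tested directly in a condition.
my $missing = $doc->findvalue('/root/Volume[@VolumeCategory="gal"]');
print "No 'gal' entry found\n" if $missing eq '';
```

    Note that testing the result for truth, as in the snippet above this one, would also treat a legitimate value of "0" as not found; comparing against the empty string avoids that.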
Re: Finding node with attribute XML::LibXML
by Anonymous Monk on Oct 01, 2014 at 20:40 UTC
    Speaking of namespaces :) xpather.pl shows you don't need to registerNs
    /*[ local-name() = "root" and position() = 1 ]
    /*[ local-name() = "Volume"
        and @VolumeCategory = "cuft"
        and @MeasurementCategory = "Real"
        and contains(string(), "2.2") ]
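
    A minimal sketch of that idea in XML::LibXML, assuming a namespaced variant of the sample document (the URI http://example.com/ns is invented for illustration). It uses a shorter expression than the one printed by xpather.pl, but relies on the same local-name() test.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical namespaced variant of minimal.xml (the URI is invented).
my $doc = XML::LibXML->load_xml(string => <<'XML');
<root xmlns="http://example.com/ns">
  <ProductKey>99</ProductKey>
  <Volume VolumeCategory="L" MeasurementCategory="Real">0.063</Volume>
  <Volume VolumeCategory="cuft" MeasurementCategory="Real">2.2</Volume>
</root>
XML

# local-name() compares only the part of the element name after any
# prefix, so no namespace registration is needed.
my @vol = $doc->findnodes(
    '//*[ local-name() = "Volume" and @VolumeCategory = "cuft" ]'
);

printf "Matches: %d\n", scalar @vol;
print  "Value: ", $vol[0]->textContent, "\n" if @vol;
```

    The trade-off is that local-name() ignores the namespace entirely, so it would also match a Volume element from any other namespace in the same document.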
      I'll have to give that a try... Thanks.