ankit.tayal560 has asked for the wisdom of the Perl Monks concerning the following question:

A small part of my huge XML file is: <signal sigid="3464" id="3490"> </signal>. I want to access the signal element. The only known input is the sigid value, i.e. 3464. How can I do this with XML::DOM? P.S. My XML file contains a huge number of signal elements with different sigid values.


Replies are listed 'Best First'.
Re: How to parse a xml file using XML::DOM??
by choroba (Cardinal) on Sep 22, 2016 at 09:48 UTC
    You can also use Perl's power to search for a value in a list:

    #!/usr/bin/perl
    use warnings;
    use strict;
    use feature qw{ say };

    use XML::DOM;

    my $doc = 'XML::DOM::Parser'->new->parsefile(shift);
    say $_->getAttribute('id')
        for grep 3464 eq $_->getAttribute('sigid'),
            $doc->getElementsByTagName('signal');

    If your XML file is really huge, you might need to switch to a pull or SAX parser. I'm not sure whether XML::DOM provides such an interface.

    ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
      As it says, XML::DOM is not a streaming or event parser; it's a document parser. So it stores the entire XML in memory and works on the tree. Not very useful for the task at hand, is it? :-)
Re: How to parse a xml file using XML::DOM??
by Corion (Patriarch) on Sep 22, 2016 at 09:23 UTC

    Personally, I would use XML::DOM::XPath, locate the element through the appropriate XPath query (//signal[@sigid="3464"]) and then output its value.
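    A minimal sketch of that XPath approach, assuming XML::DOM::XPath is installed from CPAN. The helper name find_ids_by_sigid and the sample document are made up for illustration, and the sigid is interpolated into the query without escaping, which is only reasonable for trusted numeric input:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::DOM::XPath;    # loads XML::DOM and adds findnodes() to its nodes

# Return the id attribute of every <signal> whose sigid matches.
# find_ids_by_sigid is a hypothetical helper name, not part of any module.
sub find_ids_by_sigid {
    my ($xml, $sigid) = @_;
    my $doc = XML::DOM::Parser->new->parse($xml);
    my @ids = map { $_->getAttribute('id') }
              $doc->findnodes(qq{//signal[\@sigid="$sigid"]});
    $doc->dispose;    # XML::DOM trees are circular; free them explicitly
    return @ids;
}

# Made-up sample input modelled on the fragment in the question.
my $sample = <<'XML';
<signals>
  <signal sigid="3464" id="3490"> </signal>
  <signal sigid="1111" id="2222"> </signal>
</signals>
XML

print "$_\n" for find_ids_by_sigid($sample, 3464);
```

    For a real file, parse($xml) would become parsefile($filename). The whole document is still loaded into memory, so this does not help with very large inputs.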

      I got your point, but for some reason I have to use XML::DOM only. Can you help me with that?

        Looking at the documentation of XML::DOM, it shows the use of ->getElementsByTagName. The simplest approach would then be something like:

        my $signals = $doc->getElementsByTagName('signal');
        my @by_id = grep {
            my $sigid = $_->getAttributeNode('sigid');
            if( $sigid ) {
                # compare the attribute's value, not the Attr node object
                $sigid->getValue == 3464
            } else {
                0 # no sigid attribute, no match
            }
        } map { $signals->item($_) } 0..$signals->getLength()-1;

        I still recommend using an XPath query.

Re: How to parse a xml file using XML::DOM??
by rminner (Chaplain) on Sep 22, 2016 at 12:09 UTC

    Personally, I think XML::DOM is not going to be the right solution if your input XML file is very large. My preferred choice for parsing XML is XML::Twig. I use it to parse very large files, and it does so quickly and with low memory usage. The same may apply to other modules, but I am most familiar with XML::Twig.

    I whipped up a short example of how you can do the parsing with XML::Twig. Since I do not know exactly what you intend to do, I added several example method calls to get you on the right track (in case you ever decide to use it).
    use strict;
    use warnings;
    use Data::Dumper;
    #use Data::Dumper::Concise; # i prefer Data::Dumper::Concise
    use XML::Twig;

    # individually process each <signal> element
    sub signal_handler {
        my ($data, $twig, $elem) = @_;
        # get the attributes of $elem (<signal>)
        my $atts = $elem->atts();
        if ($atts->{'sigid'} == 3464) {
            print "Found <signal> with sigid == 3464:\n", $elem->sprint(), "\n";
            print "<PRESS ENTER TO CONTINUE>"; <STDIN>;
        }
        # if you want to access the element in a way similar to XML::Simple:
        my $xml_simple_style_elem = $elem->simplify();
        # check out the simplified structure:
        print Dumper($xml_simple_style_elem);
        print "<PRESS ENTER TO CONTINUE>"; <STDIN>;
        # Example for Data Collection:
        my ($sigid, $id) = @{$atts}{qw/sigid id/};
        if (defined $sigid and defined $id) {
            $data->{sigid_id_count}{$sigid}{$id}++;
        }
        # get all elements below <signal> which are called <foo>
        my @foo_subelements = $elem->descendants('foo');
        $twig->purge; # explicitly free the memory
    }

    sub main {
        my $fn = shift @ARGV;
        my %collected_data;
        my $twig = XML::Twig->new(
            twig_roots => {
                'signal' => sub { signal_handler(\%collected_data, @_); },
            },
        );
        eval {
            $twig->parsefile($fn);
        };
        if ($@) {
            print STDERR "Failed to parse '$fn' ($@)\n";
        }
        if (%collected_data) {
            print "I collected the following data:\n", Dumper(\%collected_data);
        }
    }

    main();
Re: How to parse a xml file using XML::DOM??
by choroba (Cardinal) on Sep 22, 2016 at 15:45 UTC
    For large files, I usually use XML::LibXML::Reader, which is a pull parser.
    #!/usr/bin/perl
    use warnings;
    use strict;
    use feature qw{ say };

    use XML::LibXML::Reader;

    my $file = shift;
    my $reader = 'XML::LibXML::Reader'->new(location => $file)
        or die "Cannot read $file.\n";
    while ($reader->nextElement('signal')) {
        say $reader->getAttribute('id')
            if '3464' eq $reader->getAttribute('sigid');
    }

    Similarly, in XML::XSH2, a wrapper around XML::LibXML:

    stream :N :f file.xml select signal[@sigid=3464] { echo @id }
