Just in case anyone is interested, here are two implementations using XML::Rules. The first prints each reaction as soon as it has been parsed; the second builds a simplified data structure and prints from that.

use strict;
use warnings;
use XML::Rules;
use Data::Dumper;

my $parser = XML::Rules->new(
    rules => [
        substrate => sub { return '@substrates' => $_[1]->{name} },
        product   => sub { return '@products' => $_[1]->{name} },
        reaction  => sub {
            print "$_[1]->{name} ($_[1]->{type}):\n";
            print "\tSubstrates: ", join(", ", @{$_[1]->{substrates}}), "\n";
            print "\tProducts: ", join(", ", @{$_[1]->{products}}), "\n\n";
            return;
        },
    ],
);

$parser->parse(\*DATA);

__DATA__
<root>
  <reaction name="rn:R00710" type="reversible">
    <substrate name="cpd:C00084"/>
    <product name="cpd:C00033"/>
  </reaction>
  <reaction name="rn:R00014" type="irreversible">
    <substrate name="cpd:C00068"/>
    <substrate name="cpd:C00022"/>
    <product name="cpd:C05125"/>
  </reaction>
</root>

and

use strict;
use warnings;
use XML::Rules;
use Data::Dumper;

my $parser = XML::Rules->new(
    rules => [
        substrate => sub { return '@substrates' => $_[1]->{name} },
        product   => sub { return '@products' => $_[1]->{name} },
        reaction  => sub {
            my $name = delete($_[1]->{name});
            delete($_[1]->{_content});
            return $name => $_[1];
        },
        root => 'pass no content',
    ],
);

my $result = $parser->parse(\*DATA);
print Dumper($result);

while (my ($reaction, $data) = each(%$result)) {
    print "$reaction ($data->{type})\n";
    print "\tSubstrates: ", join(", ", @{$data->{substrates}}), "\n";
    print "\tProducts: ", join(", ", @{$data->{products}}), "\n\n";
}

__DATA__
<root>
  <reaction name="rn:R00710" type="reversible">
    <substrate name="cpd:C00084"/>
    <product name="cpd:C00033"/>
  </reaction>
  <reaction name="rn:R00014" type="irreversible">
    <substrate name="cpd:C00068"/>
    <substrate name="cpd:C00022"/>
    <product name="cpd:C05125"/>
  </reaction>
</root>
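
For anyone who wants to see the shape of the simplified structure without installing the module, here is a rough sketch in core Perl. The `$result` literal is hand-written and only assumes what the loop above already relies on: a hash reference keyed by reaction name, each value holding `type`, `substrates` and `products`. The helper `format_reaction` is a hypothetical name, not part of XML::Rules. Sorting the keys is a small design choice to get a stable order, since `each` returns the pairs in no particular order.

```perl
use strict;
use warnings;

# Hand-written stand-in for the $result returned by the second parser
# (an assumption based on the rules above, not actual captured output).
my $result = {
    'rn:R00710' => {
        type       => 'reversible',
        substrates => ['cpd:C00084'],
        products   => ['cpd:C00033'],
    },
    'rn:R00014' => {
        type       => 'irreversible',
        substrates => ['cpd:C00068', 'cpd:C00022'],
        products   => ['cpd:C05125'],
    },
};

# Hypothetical helper: build the report text for one reaction.
sub format_reaction {
    my ($name, $data) = @_;
    return "$name ($data->{type})\n"
         . "\tSubstrates: " . join(', ', @{ $data->{substrates} }) . "\n"
         . "\tProducts: "   . join(', ', @{ $data->{products} })   . "\n";
}

# Sorted keys give a reproducible order, unlike each().
print format_reaction($_, $result->{$_}), "\n" for sort keys %$result;
```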

The substrate and product rules could be rewritten as a single rule, since the tag name is passed to the handler as $_[0]:

'substrate,product' => sub {return '@'.$_[0].'s' => $_[1]->{name}},
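
To see what the combined rule returns for each tag, the handler can be called by hand. This is only a sketch in core Perl: it imitates the way XML::Rules passes the tag name as the first argument and the attribute hash as the second, and the variable name `$rule` and the sample attribute values are made up for illustration.

```perl
use strict;
use warnings;

# Same body as the combined rule; $_[0] is the tag name, $_[1] the attributes.
my $rule = sub { return '@' . $_[0] . 's' => $_[1]->{name} };

# Call the handler by hand for both tags (sample attribute values).
for my $call ( [ substrate => 'cpd:C00084' ], [ product => 'cpd:C00033' ] ) {
    my ($tag, $name) = @$call;
    my ($key, $value) = $rule->($tag, { name => $name });
    printf "%s: %s => %s\n", $tag, $key, $value;
}
```

The leading `@` in the returned key is what tells XML::Rules to push the value onto an array in the parent tag's data, so both tags end up under the same `substrates` and `products` keys as before.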


In reply to Re^2: XML parsing Help.. by Jenda
in thread XML parsing Help.. by bioswami
