Hi all

I am trying to parse a multi-level XML document and then add certain nodes to a hash. The problem is that, with so many nesting levels, I am struggling to reference the nodes that I want to add.

This is the code I have

#!/usr/bin/perl
use strict;
use warnings;
use utf8;
use XML::Simple;
use Data::Dumper;

# KeyAttr => [] turns off XML::Simple's array folding on name/key/id attributes
my $xml  = XML::Simple->new(KeyAttr => []);
my $afWN = $xml->XMLin("wnafr.xml");

open(my $out, ">:utf8", "Output.txt") or die "Can't open Output.txt: $!";
print {$out} Dumper($afWN);

foreach my $e (@{ $afWN->{XML} }) {
    print "I.D.: ", $e->{ID}, "\n";
    print "Part of Speech: ", $e->{POS}, "\n";
    # This only works when the entry has exactly one LITERAL
    print "Literal: ", $e->{SYNONYM}{LITERAL}{content}, "\n";
    print "\n";
}

close($out);

The XML looks like this

<XML><ID>ENG20-00001740-a</ID><POS>a</POS><SYNONYM><LITERAL sense="1">in staat</LITERAL><WORD>in</WORD><WORD>staat</WORD></SYNONYM><ILR type="be_in_state">ENG20-05295659-n</ILR><ILR type="be_in_state">ENG20-04904666-n</ILR><ILR type="near_antonym">ENG20-00002062-a</ILR><DEF/><USAGE/><BCS>3</BCS><DOMAIN>quality</DOMAIN><SUMO type="=">Breathing</SUMO></XML>

<XML><ID>ENG20-00001740-v</ID><POS>v</POS><SYNONYM><LITERAL sense="1">asem</LITERAL><WORD>asem</WORD><LITERAL sense="1">respireer</LITERAL><WORD>respireer</WORD><LITERAL sense="1">asem skep</LITERAL><WORD>asem</WORD><WORD>skep</WORD><LITERAL sense="1">asemhaal</LITERAL><WORD>asemhaal</WORD></SYNONYM><ILR type="verb_group">ENG20-00002536-v</ILR><ILR type="verb_group">ENG20-00002307-v</ILR><ILR type="also_see">ENG20-00004923-v</ILR><ILR type="also_see">ENG20-00004127-v</ILR><ILR type="subevent">ENG20-00004923-v</ILR><ILR type="subevent">ENG20-00004127-v</ILR><DEF/><USAGE/><BCS>3</BCS><DOMAIN>medicine</DOMAIN><SUMO type="=">Breathing</SUMO></XML>

I want to put all the Literal->content values in a separate hash, but the problem is that an entry can contain more than one. If there is more than one LITERAL, the XML parser stores them as an array of hashes. If there is only one, it is simply stored as a hash.

Any ideas?
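
One way to cope with the hash-or-array difference is to check the reference type and normalise both cases to a list before collecting the content values. The following is only a minimal sketch: the `@entries` data is hand-built to mimic what XMLin (with KeyAttr => []) returns for the two sample entries, and `literals_of` and `%literals` are hypothetical names, not part of XML::Simple. XML::Simple also documents a ForceArray option (for example `ForceArray => ['LITERAL']`) that makes the named element an array even when it occurs once, which would make the type check unnecessary.

```perl
use strict;
use warnings;

# Hand-built stand-ins for the parsed entries (assumption, not real parser output):
# a single LITERAL is a hash, several LITERALs are an array of hashes.
my @entries = (
    {
        ID      => 'ENG20-00001740-a',
        SYNONYM => { LITERAL => { sense => 1, content => 'in staat' } },
    },
    {
        ID      => 'ENG20-00001740-v',
        SYNONYM => {
            LITERAL => [
                { sense => 1, content => 'asem' },
                { sense => 1, content => 'respireer' },
                { sense => 1, content => 'asem skep' },
                { sense => 1, content => 'asemhaal' },
            ],
        },
    },
);

# Hypothetical helper: always return a list of LITERAL hashes,
# whichever shape the parser produced.
sub literals_of {
    my ($entry) = @_;
    my $lit = $entry->{SYNONYM}{LITERAL};
    return ()    unless defined $lit;
    return @$lit if ref $lit eq 'ARRAY';
    return ($lit);
}

# Map each ID to an array ref holding its literal strings.
my %literals;
for my $e (@entries) {
    $literals{ $e->{ID} } = [ map { $_->{content} } literals_of($e) ];
}

for my $id (sort keys %literals) {
    print "$id: ", join(', ', @{ $literals{$id} }), "\n";
}
```

In the original script the same loop body would go inside the `foreach` over the parsed entries, replacing the direct `{LITERAL}{content}` lookup.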


In reply to XML Parser and certain nodes to a hash by Dr Manhattan
