http://qs1969.pair.com?node_id=437365

mifflin has asked for the wisdom of the Perl Monks concerning the following question:

I've got a simple xml document ...
<?xml version="1.0" encoding="UTF-8" ?>
<charge-type-control>
  <charge code="FRT" name="frtamt"/>
  <charge code="SV1" name="savamt"/>
  <charge code="SV2" name="savamt"/>
  <charge code="SV3" name="savamt"/>
  <charge code="SV4" name="savamt"/>
  <charge code="SV5" name="savamt"/>
</charge-type-control>
that I would like to parse into the data structure ...
{
  'SV1' => 'savamt',
  'SV5' => 'savamt',
  'SV2' => 'savamt',
  'FRT' => 'frtamt',
  'SV3' => 'savamt',
  'SV4' => 'savamt'
};
However, the closest I've been able to get with XML::Simple is ...
{
  'charge' => {
    'SV1' => { 'name' => 'savamt' },
    'SV5' => { 'name' => 'savamt' },
    'SV2' => { 'name' => 'savamt' },
    'SV4' => { 'name' => 'savamt' },
    'SV3' => { 'name' => 'savamt' },
    'FRT' => { 'name' => 'frtamt' }
  }
};
using this XMLin line ...
$ref = XMLin('charge-type-control.xml', KeyAttr => ['code']);
the docs allude to using ValueAttr like...
$ref = XMLin('charge-type-control.xml', KeyAttr => ['code'], ValueAttr => ['name']);

but the results are the same. So either it doesn't do what I think it does or what I want to do is impossible. Can I coerce XML::Simple to do exactly what I want or do I have to write my own transformer like...

my %myref;
for my $key (keys %{$ref->{charge}}) {
    $myref{$key} = $ref->{charge}->{$key}->{name};
}
to get exactly what I want?

Replies are listed 'Best First'.
Re: XML::Simple help
by saintmike (Vicar) on Mar 08, 2005 at 00:47 UTC
    Not sure if XML::Simple can do it, but a little post processing will do:
    my $ref = XMLin($data, KeyAttr => { charge => "code" });
    my %hash = map { ($_, $ref->{charge}->{$_}->{name}) } keys %{$ref->{charge}};
    $ref = \%hash;
      That's something I'd like to have in a class.
      package ChargeTypeXML;

      use strict;
      use warnings;
      use XML::Simple;

      sub new {
          my $class = shift;
          my %args  = @_;    # named arguments, e.g. file => "xmlfile"
          my $ref   = XMLin($args{file}, KeyAttr => { charge => "code" });
          # flatten { code => { name => ... } } into { code => name }
          my %hash  = map { ($_, $ref->{charge}->{$_}->{name}) } keys %{$ref->{charge}};
          $ref = \%hash;
          bless $ref, $class;
          return $ref;
      }

      sub toXML {
          my $self = shift;
          print qq{<?xml version="1.0" encoding="UTF-8" ?>\n};
          print qq{<charge-type-control>\n};
          for ( keys %{$self} ) {
              print qq{<charge code="$_" name="}, $self->{$_}, qq{"/>\n};
          }
          print qq{</charge-type-control>};
      }

      1;

      #in the script:
      use ChargeTypeXML;
      my $xml = ChargeTypeXML->new(file => "xmlfile");

      Of course, this only works when all "codes" in the XML are unique.

      Note: untested
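
      To illustrate the uniqueness caveat, here is a minimal sketch of one way to notice repeated codes before they silently collapse into a single hash entry. It assumes XML::Simple's KeyAttr => [] (no folding) and ForceArray options; the sample document with a repeated SV1 and the name "othamt" is made up for the illustration, and keeping the first occurrence is just one possible policy.

```perl
use strict;
use warnings;
use XML::Simple;

# Hypothetical input: SV1 appears twice on purpose.
my $xml = <<'XML';
<charge-type-control>
  <charge code="FRT" name="frtamt"/>
  <charge code="SV1" name="savamt"/>
  <charge code="SV1" name="othamt"/>
</charge-type-control>
XML

# KeyAttr => [] turns off folding on any attribute;
# ForceArray keeps <charge> as a list even if there is only one.
my $ref = XMLin($xml, KeyAttr => [], ForceArray => ['charge']);

my (%charge, @duplicates);
for my $c (@{ $ref->{charge} }) {
    if (exists $charge{ $c->{code} }) {
        push @duplicates, $c->{code};   # remember the clash
        next;                           # keep the first name seen
    }
    $charge{ $c->{code} } = $c->{name};
}

warn "duplicate code(s): @duplicates\n" if @duplicates;
```

      Whether to keep the first value, the last one, or to die outright depends on what the charge codes mean in your data.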


      holli, /regexed monk/
Re: XML::Simple help
by graff (Chancellor) on Mar 08, 2005 at 03:38 UTC
    Looking at the man page for XML::Simple, it looks like the "ValueAttr" option does not really fit with the data that you have. It looks like this option is intended for cases where lots of different elements have a common attribute, and you want to have this parsed into a structure where the element names are hash keys and the values of the common attribute are hash values.

    In your case, you have a lot of instances with the same element name and two common attributes, where one attribute serves as a "key" field (unique to each instance of the element) and the other does not; in this case, the "KeyAttr" option works well, and there simply isn't anything else that the module can do for you. The initial reply above is the way to go to get what you want.
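
    For contrast, here is a small sketch of the kind of document where ValueAttr does apply, modelled on the example in the XML::Simple documentation: several differently named elements that each carry the same single attribute. The element names and values are illustrative only.

```perl
use strict;
use warnings;
use XML::Simple;
use Data::Dumper;

# Different element names, each with one common attribute ("value").
my $xml = '<opt><colour value="red" /><size value="XXL" /></opt>';

# ValueAttr folds each single-attribute element down to that attribute's value.
my $ref = XMLin($xml, ValueAttr => ['value']);

print Dumper($ref);
```

    According to the module's documentation, this should parse to { colour => 'red', size => 'XXL' } rather than a hash of { value => ... } hashes. The charge elements in the question carry two attributes each under one element name, which is why the option leaves them unchanged.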

Re: XML::Simple help
by murugu (Curate) on Mar 08, 2005 at 06:23 UTC

    Here is my try with XML::Twig.

    use XML::Twig;

    my $hash;
    my $t = XML::Twig->new(
        twig_handlers => {
            charge => sub { $hash->{ $_[1]->{'att'}->{'code'} } = $_[1]->{'att'}->{'name'} },
        },
    );
    $t->parse(
    '<charge-type-control>
      <charge code="FRT" name="frtamt"/>
      <charge code="SV1" name="savamt"/>
      <charge code="SV2" name="savamt"/>
      <charge code="SV3" name="savamt"/>
      <charge code="SV4" name="savamt"/>
      <charge code="SV5" name="savamt"/>
    </charge-type-control>');
    print $_."\t".$hash->{$_}."\n" for keys %{$hash};

    regards
    Murugesan Kandasamy

      A couple of comments:

      • in a handler you can use $_ as an alias for $_[1]
      • unless speed is a big concern, you should really use the proper methods to access attributes, and write $_->att( 'code') instead of $_->{'att'}->{'code'} and $_->set_att( code => $value) instead of $_->{'att'}->{'code'}= $value: there is no guarantee that the attributes will always be in a simple hash, nor that the methods will only read or assign to the att hash (they do at the moment, but that is very likely to change).

      With those changes, the handler code becomes:

      sub{    $hash->{$_->att('code')}=$_->att( 'name'); }
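
      Put together, a complete version of the script with the method-based handler might look like the sketch below. It reuses the XML from the question; the variable names %charge and $twig are my own choice, and the keys are sorted only to make the printed order stable.

```perl
use strict;
use warnings;
use XML::Twig;

my $xml = <<'XML';
<charge-type-control>
  <charge code="FRT" name="frtamt"/>
  <charge code="SV1" name="savamt"/>
  <charge code="SV2" name="savamt"/>
  <charge code="SV3" name="savamt"/>
  <charge code="SV4" name="savamt"/>
  <charge code="SV5" name="savamt"/>
</charge-type-control>
XML

my %charge;
my $twig = XML::Twig->new(
    twig_handlers => {
        # Inside a handler, $_ is the element that just finished parsing.
        charge => sub { $charge{ $_->att('code') } = $_->att('name'); },
    },
);
$twig->parse($xml);

print "$_\t$charge{$_}\n" for sort keys %charge;
```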