trebork has asked for the wisdom of the Perl Monks concerning the following question:

I am processing XML that is returned to me. I can't figure out how to direct XML::Twig to manipulate the data structure into an array of hashes that I can use with HTML::Template. To simplify the problem I set up a test driver as follows.
    #!/usr/bin/perl -w
    use strict;
    use Data::Dumper;
    use XML::Twig;

    my $twig= new XML::Twig;
    my $xml=qq~<RateInfo>
      <displayCurrencyCode>USD</displayCurrencyCode>
      <DisplayNightlyRates size=\'3\'>
        <displayNightlyRate>298.95</displayNightlyRate>
        <displayNightlyRate>298.95</displayNightlyRate>
        <displayNightlyRate>348.95</displayNightlyRate>
      </DisplayNightlyRates>
      <displayRoomRate>1106.62</displayRoomRate>
      <chargeableRoomRateTotal>1106.62</chargeableRoomRateTotal>
      <chargeableRoomRateTaxesAndFees>159.77</chargeableRoomRateTaxesAndFees>
      <nativeCurrencyCode>USD</nativeCurrencyCode>
      <NativeNightlyRates size=\'3\'>
        <nativeNightlyRate>298.95</nativeNightlyRate>
        <nativeNightlyRate>298.95</nativeNightlyRate>
        <nativeNightlyRate>348.95</nativeNightlyRate>
      </NativeNightlyRates>
      <nativeRoomRate>1106.62</nativeRoomRate>
      <rateFrequency>B</rateFrequency>
    </RateInfo>~;

    $twig->parse($xml); # build the twig
    my $struct = $twig->simplify( forcearray => 1 );
    print Dumper $struct;
Which outputs this.
    $VAR1 = {
      'nativeCurrencyCode' => [ 'USD' ],
      'displayCurrencyCode' => [ 'USD' ],
      'chargeableRoomRateTaxesAndFees' => [ '159.77' ],
      'rateFrequency' => [ 'B' ],
      'chargeableRoomRateTotal' => [ '1106.62' ],
      'NativeNightlyRates' => [
        {
          'nativeNightlyRate' => [ '298.95', '298.95', '348.95' ],
          'size' => '3'
        }
      ],
      'DisplayNightlyRates' => [
        {
          'size' => '3',
          'displayNightlyRate' => [ '298.95', '298.95', '348.95' ]
        }
      ],
      'displayRoomRate' => [ '1106.62' ],
      'nativeRoomRate' => [ '1106.62' ]
    };
I have tried working with GroupTags and KeyAttr in the simplify function but have been unable to make a dent in the output. Can this be done via twig and simplify()? Or do I have to manipulate $struct directly? Thanks

Replies are listed 'Best First'.
Re: XML::Twig handling arrays
by mirod (Canon) on Dec 02, 2008 at 09:48 UTC

    The big problem with XML::Simple (which the simplify method in XML::Twig emulates) is how to specify what should end up as an array and what should end up as a hash in the resulting structure.

    In your case it seems that you want only NativeNightlyRates and DisplayNightlyRates as arrays, and the rest of the data should be stored as hashes. You can get just that by being more selective: instead of forcing everything into an array with forcearray => 1, write forcearray => [ qw(NativeNightlyRates DisplayNightlyRates) ].

    The resulting structure is:

        $VAR1 = {
          'nativeCurrencyCode' => 'USD',
          'displayCurrencyCode' => 'USD',
          'chargeableRoomRateTaxesAndFees' => '159.77',
          'rateFrequency' => 'B',
          'chargeableRoomRateTotal' => '1106.62',
          'NativeNightlyRates' => {
            'nativeNightlyRate' => [ '298.95', '298.95', '348.95' ],
            'size' => '3'
          },
          'DisplayNightlyRates' => {
            'size' => '3',
            'displayNightlyRate' => [ '298.95', '298.95', '348.95' ]
          },
          'displayRoomRate' => '1106.62',
          'nativeRoomRate' => '1106.62'
        };

    I am not sure that is exactly what you need to use with HTML::Template, which would seem to require arrays of hashes in NativeNightlyRates and DisplayNightlyRates. You could do this by processing either the data structure returned by simplify, or by working directly on the twig, before applying simplify. I would work on the twig, but you might have guessed that ;--)
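
    For comparison, here is a minimal sketch of the first option, post-processing the structure returned by simplify. It uses only core Perl and starts from a trimmed copy of the structure shown above. The key name rate and the %inner_for lookup table are made up for illustration, not something simplify provides.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

# Trimmed copy of the structure returned by simplify with the selective forcearray.
my $struct = {
    displayCurrencyCode => 'USD',
    DisplayNightlyRates => {
        size               => '3',
        displayNightlyRate => [ '298.95', '298.95', '348.95' ],
    },
    NativeNightlyRates => {
        size              => '3',
        nativeNightlyRate => [ '298.95', '298.95', '348.95' ],
    },
};

# Hypothetical lookup: wrapper element name => name of the repeated child.
my %inner_for = (
    DisplayNightlyRates => 'displayNightlyRate',
    NativeNightlyRates  => 'nativeNightlyRate',
);

for my $wrapper ( keys %inner_for ) {
    my $inner = $inner_for{$wrapper};

    # Remove the wrapper hash; skip it if the key is absent.
    my $group = delete $struct->{$wrapper} or next;
    my $rates = $group->{$inner};
    next unless defined $rates;

    # A lone rate comes back as a plain scalar unless it was forced into an array.
    $rates = [$rates] unless ref($rates) eq 'ARRAY';

    # Turn each value into a one-key hash, giving an array of hashes.
    $struct->{$inner} = [ map { +{ rate => $_ } } @$rates ];
}

print Dumper $struct;
```

    The size attribute is dropped along with the wrapper, since it only repeats the length of the array.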

    First I would turn the text content of the nativeNightlyRate and displayNightlyRate elements into an attribute, so you can get an array of hashes, then I would erase the DisplayNightlyRates/NativeNightlyRates layer, as it just makes it harder to access the data without helping much (the size attribute can be discarded, it's just the size of the array):

        #!/usr/bin/perl -w
        use strict;
        use Data::Dumper;
        use XML::Twig;

        my $twig= XML::Twig->new(
            twig_handlers => {
                # move the text content into a 'rate' attribute
                nativeNightlyRate   => sub { content_to_att($_, 'rate'); },
                displayNightlyRate  => sub { content_to_att($_, 'rate'); },
                # drop the wrapper layer, keeping its children in place
                DisplayNightlyRates => sub { $_->erase },
                NativeNightlyRates  => sub { $_->erase },
            }
        );

        my $xml=qq~<RateInfo>
          <displayCurrencyCode>USD</displayCurrencyCode>
          <DisplayNightlyRates size=\'3\'>
            <displayNightlyRate>298.95</displayNightlyRate>
            <displayNightlyRate>298.95</displayNightlyRate>
            <displayNightlyRate>348.95</displayNightlyRate>
          </DisplayNightlyRates>
          <displayRoomRate>1106.62</displayRoomRate>
          <chargeableRoomRateTotal>1106.62</chargeableRoomRateTotal>
          <chargeableRoomRateTaxesAndFees>159.77</chargeableRoomRateTaxesAndFees>
          <nativeCurrencyCode>USD</nativeCurrencyCode>
          <NativeNightlyRates size=\'3\'>
            <nativeNightlyRate>298.95</nativeNightlyRate>
            <nativeNightlyRate>298.95</nativeNightlyRate>
            <nativeNightlyRate>348.95</nativeNightlyRate>
          </NativeNightlyRates>
          <nativeRoomRate>1106.62</nativeRoomRate>
          <rateFrequency>B</rateFrequency>
        </RateInfo>~;

        $twig->parse( $xml); # build the twig

        # element names are case-sensitive, so list them exactly as they appear in the XML
        my $struct = $twig->simplify( forcearray => [ qw(nativeNightlyRate displayNightlyRate) ], );
        print Dumper $struct;

        sub content_to_att {
            my( $elt, $att)= @_;
            $elt->set_att( $att => $elt->text)->cut_children;
        }

    This outputs:

        $VAR1 = {
          'nativeCurrencyCode' => 'USD',
          'displayCurrencyCode' => 'USD',
          'chargeableRoomRateTaxesAndFees' => '159.77',
          'rateFrequency' => 'B',
          'chargeableRoomRateTotal' => '1106.62',
          'displayNightlyRate' => [
            { 'rate' => '298.95' },
            { 'rate' => '298.95' },
            { 'rate' => '348.95' }
          ],
          'nativeNightlyRate' => [
            { 'rate' => '298.95' },
            { 'rate' => '298.95' },
            { 'rate' => '348.95' }
          ],
          'displayRoomRate' => '1106.62',
          'nativeRoomRate' => '1106.62'
        };

    Is this what you were looking for?
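
    For what it's worth, a structure shaped like the last dump above could be handed to HTML::Template roughly as follows. This is only a sketch: the template text is invented for illustration, and the structure is trimmed to two keys. Each array of hashes drives a TMPL_LOOP, and the keys of the inner hashes become TMPL_VAR names inside the loop.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Template;

# Trimmed copy of the array-of-hashes structure shown above.
my $struct = {
    displayCurrencyCode => 'USD',
    displayNightlyRate  => [
        { rate => '298.95' },
        { rate => '298.95' },
        { rate => '348.95' },
    ],
};

# Hypothetical template; loop and variable names mirror the hash keys.
my $tmpl = <<'END';
Currency: <TMPL_VAR NAME="displayCurrencyCode">
<TMPL_LOOP NAME="displayNightlyRate">Rate: <TMPL_VAR NAME="rate">
</TMPL_LOOP>
END

my $template = HTML::Template->new(
    scalarref         => \$tmpl,
    die_on_bad_params => 0,    # tolerate keys the template does not use
);

# Pass every top-level key as a template parameter.
$template->param(%$struct);
print $template->output;
```

    Setting die_on_bad_params to 0 matters if the whole structure is passed in, because by default HTML::Template dies on parameters that the template never mentions.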

Re: XML::Twig handling arrays
by poolpi (Hermit) on Dec 02, 2008 at 09:09 UTC

    I'm not really sure I understand what you need,
    but if you want to indent your output,
    you can use this:

        my $t = XML::Twig
            ->new( pretty_print => 'indented' )
            ->parse($xml)
            ->print;

    Output:

        <RateInfo>
          <displayCurrencyCode>USD</displayCurrencyCode>
          <DisplayNightlyRates size="3">
            <displayNightlyRate>298.95</displayNightlyRate>
            <displayNightlyRate>298.95</displayNightlyRate>
            <displayNightlyRate>348.95</displayNightlyRate>
          </DisplayNightlyRates>
          <displayRoomRate>1106.62</displayRoomRate>
          <chargeableRoomRateTotal>1106.62</chargeableRoomRateTotal>
          <chargeableRoomRateTaxesAndFees>159.77</chargeableRoomRateTaxesAndFees>
          <nativeCurrencyCode>USD</nativeCurrencyCode>
          <NativeNightlyRates size="3">
            <nativeNightlyRate>298.95</nativeNightlyRate>
            <nativeNightlyRate>298.95</nativeNightlyRate>
            <nativeNightlyRate>348.95</nativeNightlyRate>
          </NativeNightlyRates>
          <nativeRoomRate>1106.62</nativeRoomRate>
          <rateFrequency>B</rateFrequency>
        </RateInfo>

    hth,
    PooLpi

    'Ebry haffa hoe hab im tik a bush'. Jamaican proverb
Re: XML::Twig handling arrays
by Jenda (Abbot) on Dec 02, 2008 at 14:50 UTC

    I'm not sure what you want the resulting data structure to look like, but you might find XML::Rules handy. If you need to extract data from XML and tweak the data structure you get, XML::Rules is the tool. Unlike XML::Simple or XML::Twig::simplify, it gives you much more detailed control.