in reply to Re^4: Sort xml based on attribute
in thread Sort xml based on attribute

Re 1) and 2): This particular script reads its input from the special filehandle DATA, which gives you access to the text that follows the __DATA__ marker in the script. To process a file instead, either open the file and pass the filehandle:

    open IN, '<', $filename or die "...";
    $parser->filter(\*IN);
or
    open my $IN, '<', $filename or die "...";
    $parser->filter($IN);
or use the filterfile() method:
    $parser->filterfile($filename);
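
As a minimal sketch of the two filehandle styles above, using only core Perl: count_lines() is a hypothetical stand-in for any method that accepts a filehandle (it is not part of XML::Rules), and the sample file is a temporary one created just for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for a method that reads from a filehandle.
sub count_lines {
    my ($fh) = @_;
    my @lines = <$fh>;      # read everything that is left on the handle
    return scalar @lines;
}

# Write a small sample file so there is something to read back.
my ($tmp, $filename) = tempfile(UNLINK => 1);
print {$tmp} "<a/>\n<b/>\n<c/>\n";
close $tmp or die "Cannot close $filename: $!";

# Bareword filehandle, passed as a glob reference.
open IN, '<', $filename or die "Cannot open $filename: $!";
my $via_glob = count_lines(\*IN);
close IN;

# Lexical filehandle, passed directly.
open my $IN, '<', $filename or die "Cannot open $filename: $!";
my $via_lexical = count_lines($IN);
close $IN;

print "glob: $via_glob, lexical: $via_lexical\n";
```

Both calls see the same three lines; the lexical form is usually preferred in new code because the handle is scoped to the enclosing block.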

Re 3): I did not include the whole XML at the end of the script, so that may be where the problem lies. Drop the __DATA__ marker and everything after it, and use the filterfile() method.

There are a few posts related to XML::Rules on PerlMonks; try to find them and see if they help. I tried to explain the design of the module in some of them, for example in (RFC) XML::TransformRules, (RFC) XML::Rules - yet another XML parser and Simpler than XML::Simple.

Jenda
Enoch was right!
Enjoy the last years of Rome.

Re^6: Sort xml based on attribute
by Anonymous Monk on Aug 12, 2010 at 18:07 UTC
    Thanks for the information. Unfortunately, it doesn't seem to work. Perl doesn't give me an error; the file just seems to stay exactly the same.

    ORIGINAL:
    <?xml version="1.0"?>
    <ResultDetail>
      <results>
        <ResultItem>
          <category>AGM</category>
          <subCategory>VAL</subCategory>
          <code1>010300</code1>
          <name>client.NotEntered</name>
          <type>ERR</type>
          <flags>320</flags>
          <language>EN</language>
          <description>Client not entered</description>
          <cause>Invalid data field values</cause>
          <action>Correct the problem and send the request again</action>
        </ResultItem>
        <ResultItem>
          <category>AGM</category>
          <subCategory>VAL</subCategory>
          <code1>010400</code1>
          <name>client.notFound</name>
          <type>ERR</type>
          <flags>320</flags>
          <language>EN</language>
          <description>Client not found</description>
          <cause>Invalid data field values</cause>
          <action>Correct the problem and send the request again</action>
        </ResultItem>
        <ResultItem>
          <category>AGM</category>
          <subCategory>VAL</subCategory>
          <code1>010000</code1>
          <name>parse</name>
          <type>ERR</type>
          <flags>320</flags>
          <language>EN</language>
          <description>Parse error</description>
          <cause>Parse error in the input XML</cause>
          <action>Correct the error and send your request again</action>
        </ResultItem>
      </results>
    </ResultDetail>
    RESULT:
    <?xml version="1.0"?>
    <ResultDetail>
      <results><ResultItem>
          <category>AGM</category>
          <subCategory>VAL</subCategory>
          <code1>010300</code1>
          <name>client.NotEntered</name>
          <type>ERR</type>
          <flags>320</flags>
          <language>EN</language>
          <description>Client not entered</description>
          <cause>Invalid data field values</cause>
          <action>Correct the problem and send the request again</action>
        </ResultItem><ResultItem>
          <category>AGM</category>
          <subCategory>VAL</subCategory>
          <code1>010400</code1>
          <name>client.notFound</name>
          <type>ERR</type>
          <flags>320</flags>
          <language>EN</language>
          <description>Client not found</description>
          <cause>Invalid data field values</cause>
          <action>Correct the problem and send the request again</action>
        </ResultItem><ResultItem>
          <category>AGM</category>
          <subCategory>VAL</subCategory>
          <code1>010000</code1>
          <name>parse</name>
          <type>ERR</type>
          <flags>320</flags>
          <language>EN</language>
          <description>Parse error</description>
          <cause>Parse error in the input XML</cause>
          <action>Correct the error and send your request again</action>
        </ResultItem></results>
    </ResultDetail>
    The Perl script (only slightly modified) is here:
    #!/usr/bin/perl
    use strict;
    use warnings;
    no warnings 'uninitialized';
    use XML::Rules;

    my $parser = XML::Rules->new(
        style => 'filter', # we want to filter (modify) the XML, not extract data
        rules => {
            _default => 'raw', # we want to copy most tags intact, including the whitespace in and around them
                # the data of the tags will end up in the _content pseudoattribute of the parent tag

            'category,subCategory,code' => 'raw extended',
                # these three we need not only to copy, but also made easier to access.
                # The "raw extended" rule causes the data of that tag to be available in the hash of the parent tag
                # also as ":category", ":subCategory" and ":code" so you do not have to search through the _content array

            'ResultItem' => 'as array',
                # we expect several <ResultItem> tags and want to store the data of each in an array.
                # the array will be accessible using the 'ResultItem' key in the hash containing the data of the parent tag

            'results' => sub {
                my ($tag,$attrs) = @_; # this is the Perl way to assign names to subroutine/function parameters
                # this subroutine is called whenever the <results>...</results> is fully parsed and the rules
                # specified for the child tags evaluated.
                if ($attrs->{ResultItem} and @{$attrs->{ResultItem}} > 1) {
                    # if there are any <ResultItem> tags and there's more than one
                    @{$attrs->{ResultItem}} = sort {
                        # sort allows you to specify the code to be used to compare the items to sort
                        # the items are made available as $a and $b to the code.
                        # in this case the $a and $b are hashes created by processing the child tags of the <ResultItem> tags.
                        $a->{':category'} cmp $b->{':category'} or
                        $a->{':subCategory'} cmp $b->{':subCategory'} or
                        $a->{':code'} cmp $b->{':code'}
                    } @{$attrs->{ResultItem}};
                }
                $attrs->{_content} =~ s/^\s+// if (!ref $attrs->{_content});
                    # remove the accumulated whitespace that was present between the <ResultItem> tags
                return [$tag => $attrs]
            }
        }
    );

    $parser->filterfile("test.msg", "test-result.msg");
    I imagine that I'm being quite the pest, but I do appreciate any and all help you can give me.

      The third tag to sort on is <code1>, not <code>, according to what you wrote originally and what the XML contains. But your version of the script references <code>! See lines 13 and 34 of the script.
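
To illustrate why such a key mismatch would fail silently, here is a minimal sketch in core Perl. It assumes the tag really is <code1>; the records and the comparator_for() helper are hypothetical and only mimic the shape of the hashes built for each <ResultItem>. With the 'uninitialized' warnings switched off, as in the script, comparing on a key that does not exist gives no hint that anything is wrong.

```perl
use strict;
use warnings;
no warnings 'uninitialized';   # same setting as the script in this thread

# Hypothetical records shaped like the per-<ResultItem> hashes;
# the key is ':code1' because the tag in the posted XML is <code1>.
my @items = (
    { ':category' => 'AGM', ':subCategory' => 'VAL', ':code1' => '010300' },
    { ':category' => 'AGM', ':subCategory' => 'VAL', ':code1' => '010400' },
    { ':category' => 'AGM', ':subCategory' => 'VAL', ':code1' => '010000' },
);

# Build a three-level comparator whose last key name is configurable.
sub comparator_for {
    my ($code_key) = @_;
    return sub {
        my ($x, $y) = @_;
        return $x->{':category'}    cmp $y->{':category'}
            || $x->{':subCategory'} cmp $y->{':subCategory'}
            || $x->{$code_key}      cmp $y->{$code_key};
    };
}

my $wrong = comparator_for(':code');    # key that does not exist in the data
my $right = comparator_for(':code1');   # key that matches the data

# With the wrong key every pair compares as equal (undef cmp undef is 0),
# so sort has nothing to order by.
my $ties = grep { $wrong->($items[0], $_) == 0 } @items;

my @sorted = sort { $right->($a, $b) } @items;
my @codes  = map { $_->{':code1'} } @sorted;

print "pairs tied with wrong key: $ties of ", scalar(@items), "\n";
print "sorted with right key: @codes\n";
```

The rule name and the hash keys in the sort block have to match the tag name exactly, so both places need the same change.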

      Jenda
      Enoch was right!
      Enjoy the last years of Rome.

        The XML actually uses <code>, not <code1>. I had to change it because the website uses <code> tags to format the script as code. Any other suggestions?