Re^6: Sort xml based on attribute

Thanks for the information. Unfortunately it doesn't seem to work. Perl doesn't give me an error, the file just seems to stay exactly the same.

ORIGINAL:

<?xml version="1.0"?>
<ResultDetail>
<results>
    <ResultItem>
        <category>AGM</category>
        <subCategory>VAL</subCategory>
        <code1>010300</code1>
        <name>client.NotEntered</name>
        <type>ERR</type>
        <flags>320</flags>
        <language>EN</language>
        <description>Client not entered</description>
        <cause>Invalid data field values</cause>
        <action>Correct the problem and send the request again</action
+>
    </ResultItem>
    <ResultItem>
        <category>AGM</category>
        <subCategory>VAL</subCategory>
        <code1>010400</code1>
        <name>client.notFound</name>
        <type>ERR</type>
        <flags>320</flags>
        <language>EN</language>
        <description>Client not found</description>
        <cause>Invalid data field values</cause>
        <action>Correct the problem and send the request again</action
+>
    </ResultItem>
    <ResultItem>
        <category>AGM</category>
        <subCategory>VAL</subCategory>
        <code1>010000</code1>
        <name>parse</name>
        <type>ERR</type>
        <flags>320</flags>
        <language>EN</language>
        <description>Parse error</description>
        <cause>Parse error in the input XML</cause>
        <action>Correct the error and send your request again</action>
    </ResultItem>
</results>
</ResultDetail>
[download]

RESULT:

<?xml version="1.0"?>
<ResultDetail>
<results><ResultItem>
        <category>AGM</category>
        <subCategory>VAL</subCategory>
        <code1>010300</code1>
        <name>client.NotEntered</name>
        <type>ERR</type>
        <flags>320</flags>
        <language>EN</language>
        <description>Client not entered</description>
        <cause>Invalid data field values</cause>
        <action>Correct the problem and send the request again</action
+>
    </ResultItem><ResultItem>
        <category>AGM</category>
        <subCategory>VAL</subCategory>
        <code1>010400</code1>
        <name>client.notFound</name>
        <type>ERR</type>
        <flags>320</flags>
        <language>EN</language>
        <description>Client not found</description>
        <cause>Invalid data field values</cause>
        <action>Correct the problem and send the request again</action
+>
    </ResultItem><ResultItem>
        <category>AGM</category>
        <subCategory>VAL</subCategory>
        <code1>010000</code1>
        <name>parse</name>
        <type>ERR</type>
        <flags>320</flags>
        <language>EN</language>
        <description>Parse error</description>
        <cause>Parse error in the input XML</cause>
        <action>Correct the error and send your request again</action>
    </ResultItem></results>
</ResultDetail>
[download]

The perl script (modified only slightly) is here:

#!/usr/bin/perl
use strict;
use warnings;
no warnings 'uninitialized';

use XML::Rules;

my $parser = XML::Rules->new(
    style => 'filter', # we want to filter (modify) the XML, not extra
+ct data
    rules => {
        _default => 'raw', # we want to copy most tags intact, includi
+ng the whitespace in and around them
            # the data of the tags will end up in the _content pseudoa
+ttribute of the parent tag
        'category,subCategory,code' => 'raw extended',
            # these three we need not only to copy, but also made easi
+er to access.
            # The "raw extended" rule causes the data of that tag to b
+e available in the hash of the parent tag
            # also as ":category", ":subCategory" and ":code" so you d
+o not have to search through the _content array
        'ResultItem' => 'as array',
            # we expect several <ResultItem> tags and want to store th
+e data of each in an array .
            # the array will be accessible using the 'ResultItem' key 
+in the hash containing the data of the parent tag
        'results' => sub {
            my ($tag,$attrs) = @_; # this is the Perl way to assign na
+mes to subroutine/function parameters
                # this subroutine is called whenever the <results>...<
+/results> is fully parsed and the rules
                # specified for the child tags evaluated.
            if ($attrs->{ResultItem} and @{$attrs->{ResultItem}} > 1) 
+{
                # if there are any <ResultItem> tags and there's more 
+than one
                @{$attrs->{ResultItem}} = sort {
                        # sort allows you to specify the code to be us
+ed to compare the items to sort
                        # the items are made available as $a and $b to
+ the code.
                        # in this case the $a and $b are hashes create
+d by processing the child tags of the <ResultItem> tags.
                        $a->{':category'} cmp $b->{':category'}
                        or
                        $a->{':subCategory'} cmp $b->{':subCategory'}
                        or
                        $a->{':code'} cmp $b->{':code'}
                    }
                    @{$attrs->{ResultItem}};
            }
            $attrs->{_content} =~ s/^\s+// if (!ref $attrs->{_content}
+);
                # remove the accumulated whitespace that was present b
+etween the <ResultItem> tags
            return [$tag => $attrs]
        }
    }
);

$parser->filterfile("test.msg", "test-result.msg");
[download]

I imagine that I'm being quite the pest, but I do appreciate any and all help you can give me.

Comment on Re^6: Sort xml based on attribute Select or Download Code

Replies are listed 'Best First'.
Re^7: Sort xml based on attribute by Jenda (Abbot) on Aug 12, 2010 at 19:52 UTC
The third tag to sort on is <code1> not <code> according to what you wrote originally and what the XML contains. But your version of the script references <code>! See line 13 and 34 of the script. Jenda Enoch was right! Enjoy the last years of Rome.	[reply]
Re^8: Sort xml based on attribute by mdangelo (Initiate) on Aug 12, 2010 at 23:16 UTC
The XML actually uses <code> not <code1> I had to change it because to format the script as code for the website they use <code> tags. Any other suggestions?	[reply]
Re^9: Sort xml based on attribute by Jenda (Abbot) on Aug 12, 2010 at 23:26 UTC
Ah, right. I did not think of that. I am an idiot, it seemed to work when I tested, but it was wrong. I forgot that the 'raw extended' doesn't put just the textual content of the tag, but the whole hash (of attributes and _content) into the parent tag's hash. The code should look like this: #!/usr/bin/perl use strict; use warnings; no warnings 'uninitialized'; use XML::Rules; my $parser = XML::Rules->new( style => 'filter', # we want to filter (modify) the XML, not extra +ct data rules => { _default => 'raw', # we want to copy most tags intact, includi +ng the whitespace in and around them # the data of the tags will end up in the _content pseudoa +ttribute of the parent tag 'category,subCategory,code' => 'raw extended', # these three we need not only to copy, but also made easi +er to access. # The "raw extended" rule causes the data of that tag to b +e available in the hash of the parent tag # also as ":category", ":subCategory" and ":code" so you d +o not have to search through the _content array 'ResultItem' => 'as array', # we expect several <ResultItem> tags and want to store th +e data of each in an array . # the array will be accessible using the 'ResultItem' key +in the hash containing the data of the parent tag 'results' => sub { my ($tag,$attrs) = @_; # this is the Perl way to assign na +mes to subroutine/function parameters # this subroutine is called whenever the <results>...< +/results> is fully parsed and the rules # specified for the child tags evaluated. if ($attrs->{ResultItem} and @{$attrs->{ResultItem}} > 1) +{ # if there are any <ResultItem> tags and there's more +than one @{$attrs->{ResultItem}} = sort { # sort allows you to specify the code to be us +ed to compare the items to sort # the items are made available as $a and $b to + the code. # in this case the $a and $b are hashes create +d by processing the child tags of the <ResultItem> tags. $a->{':category'}{_content} cmp $b->{':categor +y'}{_content} or $a->{':subCategory'}{_content} cmp $b->{':subC +ategory'}{_content} or $a->{':code'}{_content} cmp $b->{':code'}{_con +tent} } @{$attrs->{ResultItem}}; } $attrs->{_content} =~ s/^\s+// if (!ref $attrs->{_content} +); # remove the accumulated whitespace that was present b +etween the <ResultItem> tags return [$tag => $attrs] } } ); $parser->filterfile("test.msg", "test-result.msg"); [download] It was comparing the references to hashes instead of the values. Jenda Enoch was right! Enjoy the last years of Rome.	[reply] [d/l]
Re^10: Sort xml based on attribute by mdangelo (Initiate) on Aug 12, 2010 at 23:47 UTC