Ah, right. I did not think of that.
I am an idiot, it seemed to work when I tested, but it was wrong. I forgot that the 'raw extended' doesn't put just the textual content of the tag, but the whole hash (of attributes and _content) into the parent tag's hash. The code should look like this:
#!/usr/bin/perl
use strict;
use warnings;
no warnings 'uninitialized';
use XML::Rules;
my $parser = XML::Rules->new(
style => 'filter', # we want to filter (modify) the XML, not extra
+ct data
rules => {
_default => 'raw', # we want to copy most tags intact, includi
+ng the whitespace in and around them
# the data of the tags will end up in the _content pseudoa
+ttribute of the parent tag
'category,subCategory,code' => 'raw extended',
# these three we need not only to copy, but also made easi
+er to access.
# The "raw extended" rule causes the data of that tag to b
+e available in the hash of the parent tag
# also as ":category", ":subCategory" and ":code" so you d
+o not have to search through the _content array
'ResultItem' => 'as array',
# we expect several <ResultItem> tags and want to store th
+e data of each in an array .
# the array will be accessible using the 'ResultItem' key
+in the hash containing the data of the parent tag
'results' => sub {
my ($tag,$attrs) = @_; # this is the Perl way to assign na
+mes to subroutine/function parameters
# this subroutine is called whenever the <results>...<
+/results> is fully parsed and the rules
# specified for the child tags evaluated.
if ($attrs->{ResultItem} and @{$attrs->{ResultItem}} > 1)
+{
# if there are any <ResultItem> tags and there's more
+than one
@{$attrs->{ResultItem}} = sort {
# sort allows you to specify the code to be us
+ed to compare the items to sort
# the items are made available as $a and $b to
+ the code.
# in this case the $a and $b are hashes create
+d by processing the child tags of the <ResultItem> tags.
$a->{':category'}{_content} cmp $b->{':categor
+y'}{_content}
or
$a->{':subCategory'}{_content} cmp $b->{':subC
+ategory'}{_content}
or
$a->{':code'}{_content} cmp $b->{':code'}{_con
+tent}
}
@{$attrs->{ResultItem}};
}
$attrs->{_content} =~ s/^\s+// if (!ref $attrs->{_content}
+);
# remove the accumulated whitespace that was present b
+etween the <ResultItem> tags
return [$tag => $attrs]
}
}
);
$parser->filterfile("test.msg", "test-result.msg");
It was comparing the references to hashes instead of the values.
Jenda
Enoch was right!
Enjoy the last years of Rome.
|