in reply to Sort xml based on attribute
    use strict;
    use warnings;
    no warnings 'uninitialized';
    use XML::Rules;

    my $parser = XML::Rules->new(
        style => 'filter',
        rules => {
            _default => 'raw',
            itemid => sub {
                my ($tag,$attrs,$context,$parents) = @_;
                $parents->[-4]{':PUI'} = $attrs->{_content} if $attrs->{idtype} eq "PUI";
                return [$tag => $attrs]; # same thing the 'raw' built-in does
            },
            item => 'as array',
            bibdataset => sub {
                my ($tag,$attrs) = @_;
                @{$attrs->{item}} = sort {$a->{':PUI'} <=> $b->{':PUI'}} @{$attrs->{item}};
                $attrs->{_content} = [
                    (map( ( "\n\t", [item => $_]), @{$attrs->{item}})),
                    "\n",
                ];
                delete $attrs->{item};
                return $tag => $attrs;
            },
        }
    );
    $parser->filter(\*DATA);
    __DATA__
    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <bibdataset ...
Basically ... whenever an <itemid> tag is fully parsed (including its content and end tag), the code checks whether the idtype eq "PUI". If it is, it remembers the content in the attribute hash of the tag's parent's parent's parent's parent (i.e. the <item> tag ... attributes starting with a colon are never exported to the resulting XML), and then it adds the tag's data to the parent's content. As each <item> tag is completed, it is removed from the parent tag's content and stored in an array kept in the parent tag's hash of attributes under the key "item".
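To make the $parents->[-4] step a bit more concrete, here is a small core-Perl sketch with no XML::Rules involved. It only mimics the stack of parent attribute hashes the handler gets. The intermediate tag names (level1 .. level3) and the id value are made up for the demo; only bibdataset, item and itemid come from the code above.

```perl
use strict;
use warnings;

# Hypothetical stack of open tags at the moment </itemid> is seen.
# level1..level3 are placeholders, not names from the real data.
my @context = qw(bibdataset item level1 level2 level3);
my @parents = map { +{} } @context;   # one attribute hash per open tag

# What the itemid handler would receive (values invented for the demo).
my $attrs = { idtype => 'PUI', _content => '12345' };

# -1 is the direct parent, -4 is four levels up: here the <item> hash.
$parents[-4]{':PUI'} = $attrs->{_content} if $attrs->{idtype} eq 'PUI';

print "$context[-4] now carries :PUI = $parents[-4]{':PUI'}\n";
```

If the nesting depth in your data differs, the -4 has to change with it.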
Then, once the XML is fully parsed (that is, when the root <bibdataset> tag closes), the array of items is sorted numerically by the remembered PUI, some whitespace gets inserted between the items, and the resulting array becomes the content of the root tag. Finally the tag, with its attributes and content (including child tags), gets printed.
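The sort-and-rebuild step can be tried on its own with plain data structures. This is only a sketch: the three items and their ids are invented, and the hash stands in for what the bibdataset handler would see in $attrs after the 'as array' rule has collected the items.

```perl
use strict;
use warnings;

# Stand-in for $attrs in the bibdataset handler: one hash per <item>,
# each tagged with the private ':PUI' key. All values are invented.
my $attrs = {
    item => [
        { ':PUI' => 30, _content => 'c' },
        { ':PUI' => 4,  _content => 'a' },
        { ':PUI' => 12, _content => 'b' },
    ],
};

# Numeric sort on the remembered id (<=>, not cmp, so 4 sorts before 12).
my @sorted = sort { $a->{':PUI'} <=> $b->{':PUI'} } @{ $attrs->{item} };

# Rebuild the content: newline+tab before each item, one newline at the end.
$attrs->{_content} = [ ( map( ( "\n\t", [ item => $_ ] ), @sorted ) ), "\n" ];
delete $attrs->{item};

print join( ',', map { $_->{':PUI'} } @sorted ), "\n";
```

With a string comparison (cmp) the order would be 12, 30, 4, which is why the numeric operator matters here.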
The code assumes the <itemid> will always be at the same level below <item> and that there will only be <item> tags in <bibdataset>!
Jenda
Enoch was right!
Enjoy the last years of Rome.