
use strict;
use warnings;
no warnings 'uninitialized';

use XML::Rules;

my $parser = XML::Rules->new(
    style => 'filter',
    rules => {
        _default => 'raw',
        itemid => sub {
            my ($tag,$attrs,$context,$parents) = @_;
            $parents->[-4]{':PUI'} = $attrs->{_content} if $attrs->{idtype} eq "PUI";
            return [$tag => $attrs]; # same thing the 'raw' built-in does
        },
        item => 'as array',
        bibdataset => sub {
            my ($tag,$attrs) = @_;
            @{$attrs->{item}} = sort {$a->{':PUI'} <=> $b->{':PUI'}} @{$attrs->{item}};
            $attrs->{_content} = [
                (map( ( "\n\t", [item => $_]), @{$attrs->{item}})),
                "\n",
            ];
            delete $attrs->{item};
            return $tag => $attrs;
        },
    }
);

$parser->filter(\*DATA);

__DATA__
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<bibdataset ...

Basically, whenever an <itemid> tag is fully parsed (including its content and end tag), the code checks whether its idtype attribute is "PUI". If it is, the content is remembered in the tag's parent's parent's parent's parent, i.e. the <item> tag, under the key ":PUI" (attributes whose names start with a colon are never exported to the resulting XML). The <itemid> tag's data is then added to its parent's content as usual. Because of the 'as array' rule, the <item> tags are not added to the parent tag's content; they are stored in an array kept in the parent tag's hash of attributes under the key "item".

Then, once the XML is fully parsed, the array of items is sorted numerically by the remembered ":PUI" value, some whitespace is inserted between the items, and the resulting array becomes the content of the root tag. The root tag is then printed with its attributes and content (including child tags).
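The sort-and-interleave step can be tried on its own, without any XML parsing. The following is only a minimal sketch: the three hashes and their title key are made-up stand-ins for the parsed <item> data, and only the ":PUI" key comes from the code above.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the parsed <item> hashes; only the
# ':PUI' key matters for the ordering.
my @items = (
    { ':PUI' => 30, title => 'third'  },
    { ':PUI' => 4,  title => 'first'  },
    { ':PUI' => 12, title => 'second' },
);

# Numeric sort on the remembered id (<=> rather than cmp, so 4 sorts before 12).
my @sorted = sort { $a->{':PUI'} <=> $b->{':PUI'} } @items;

# Interleave whitespace strings and [item => $hash] pairs, then add a
# trailing newline, mirroring what the bibdataset rule builds.
my @content = ( ( map { ( "\n\t", [ item => $_ ] ) } @sorted ), "\n" );

print join( ',', map { $_->{title} } @sorted ), "\n";   # first,second,third
print scalar(@content), "\n";                            # 7
```

Note that a string comparison (cmp) would put "12" before "4", which is why the numeric operator is used.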

The code assumes that <itemid> will always be at the same depth below <item> and that there will only be <item> tags in <bibdataset>!
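If the depth assumption is a concern, one possible way around it is to look the enclosing <item> up by name instead of using a fixed index. This is only a sketch under assumptions: nearest_ancestor is a hypothetical helper, not part of XML::Rules, and it relies on $context (tag names) and $parents (data hashes) being parallel arrays, which is how I read the module's documentation. The intermediate tag names below are invented, since the sample XML is truncated.

```perl
use strict;
use warnings;

# Hypothetical helper: return the data hash of the nearest enclosing tag
# with the given name, or undef if there is none.
sub nearest_ancestor {
    my ( $name, $context, $parents ) = @_;
    for my $i ( reverse 0 .. $#$context ) {
        return $parents->[$i] if $context->[$i] eq $name;
    }
    return;
}

# Simulated arguments, as a rule for <itemid> might see them.
# All tag names except bibdataset and item are made up for illustration.
my @context = qw(bibdataset item bibrecord item-info itemidlist);
my @parents = map { +{} } @context;

# Inside the itemid rule this would replace the fixed $parents->[-4].
my $item = nearest_ancestor( 'item', \@context, \@parents );
$item->{':PUI'} = '12345' if $item;
```

The second assumption (only <item> children in <bibdataset>) is not addressed by this sketch.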

Jenda
Enoch was right!
Enjoy the last years of Rome.

In reply to Re: Sort xml based on attribute by Jenda
in thread Sort xml based on attribute by bharathinc
