in reply to XML::Twig replace method behaving counter-intuitively

I know it's not really an answer to your question, but what about:

use XML::Rules;

my $parser = XML::Rules->new(
    style => "filter",
    rules => [
        _default => 'raw',
        tweak => sub {
            my ($tag, $attr, undef, undef, $parser) = @_;
            if (exists $parser->{pad}{$attr->{name}}) {
                # Seen this name before: overwrite the earlier tag's data in place.
                %{$parser->{pad}{$attr->{name}}} = %$attr;
                return;
            } else {
                # First occurrence: remember a reference to it and emit it.
                $parser->{pad}{$attr->{name}} = $attr;
                return [$tag => $attr];
            }
        },
        # $_[4] is the parser object; the outer $parser is not yet in scope here.
        tcf => sub { delete $_[4]->{pad}; return $_[0] => $_[1] },
    ]
);

$parser->filter(\*DATA);

__DATA__
<?xml version="1.0" ?>
<tcf>
  <tweak name = "T1">
    <description>D1</description>
  </tweak>
  <tweak name = "T2">
    <description>D2</description>
  </tweak>
  <tweak name = "T3">
    <description>D3</description>
  </tweak>
  <tweak name = "T2">
    <description>This should overwrite the old T2.</description>
  </tweak>
</tcf>

What the code does is build a data structure from the contents of the <tcf> tag (the outermost tag that has a subroutine rule), storing most tags "literally" (whatever that means) and doing something special for the <tweak> tags. For each such tag, after it is fully parsed and its inner tags have been processed according to the rules, it checks whether the $parser->{pad} hash holds a backreference to a previous <tweak> with the same name. If there is one, it replaces the contents and attributes of that earlier tag and returns nothing. If there is none, it creates the backreference and returns an arrayref containing the tag name and the hash with the attributes and content. That return value is what gets added to the _content of the parent tag and later written into the resulting XML.
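
If the backreference trick is easier to follow without the parser around it, here is a rough sketch of the same idea in plain Perl. It does not use XML::Rules at all: @parsed, %pad and @content are names I made up for illustration, with @parsed standing in for the <tweak> tags in document order and @content for the parent's _content.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the parsed <tweak> tags, in document order.
my @parsed = (
    { name => 'T1', description => 'D1' },
    { name => 'T2', description => 'D2' },
    { name => 'T3', description => 'D3' },
    { name => 'T2', description => 'This should overwrite the old T2.' },
);

my %pad;      # name => reference to the hash already stored in @content
my @content;  # output list; keeps the order of first appearance

for my $attr (@parsed) {
    my $name = $attr->{name};
    if (exists $pad{$name}) {
        # Overwrite the earlier hash in place. @content holds the same
        # reference, so the entry changes without moving.
        %{ $pad{$name} } = %$attr;
    }
    else {
        # First occurrence: remember the reference and emit the entry.
        $pad{$name} = $attr;
        push @content, $attr;
    }
}

print "$_->{name}: $_->{description}\n" for @content;
```

The point of the sketch is that the hash is only an index into the array; the ordering lives in the array, so the second T2 lands in the slot of the first one instead of being appended at the end.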

I hope the explanation makes some sense :-) The filter mode of XML::Rules and the way the built data structures are serialized to XML are a bit hard to explain.

Re^2: XML::Twig replace method behaving counter-intuitively
by Human (Initiate) on Dec 01, 2007 at 04:55 UTC
    Thanks for replying, Jenda! I haven't tried XML::Rules before, but the technical reason why I switched to XML::Twig from XML::Simple was that XML::Twig would preserve the order of entries. (I later learned it'd also do cool stuff like DTD processing and a few other things I don't need right away.) Since XML::Simple was hash-based, there's no guarantee that ordering is preserved. From a glance at the XML::Rules page on CPAN, it looks like it may use hashes, too. Is that the case? If so, then I may be unable to use such a solution. It's also possible that I'm missing some way to solve the ordering problem.

      If you try that code you will see that it does preserve the ordering. While XML::Rules does use hashes most of the time, it's really up to you what data you need to preserve from which tag, and how. With that code, the <tweak> tags' data end up in the array referenced by $_[1]->{_content} within the rule specified for the <tcf> tag. How the data from a tag are made available within the $attr hashref of the parent tag depends on the tag's rule.

        Ok, so it looks like ordering would not be a problem in some cases. What about DTD support and ENTITY references? Do they work? And does the toXML method preserve ordering? I appreciate your replies, but I was also hoping to find out why my usage of XML::Twig was not working. Any ideas about that?