in reply to XML::Twig Text replacement

XML::Twig:

    use strict;
    use warnings;

    use XML::Twig qw( );

    binmode STDOUT;

    my $t = XML::Twig->new(
        twig_handlers => {
            '/Profile/Application' => sub {
                my $Id   = $_->att('Id');
                my $CsID = (split(/\//, $Id))[-1];
                $_->set_att(Id => $CsID);
            },
        },
    );

    $t->parsefile($ARGV[0]);
    $t->flush();

XML::LibXML:

    use strict;
    use warnings;

    use XML::LibXML               qw( );
    use XML::LibXML::XPathContext qw( );

    my $doc  = XML::LibXML->new()->parse_file($ARGV[0]);
    my $root = $doc->documentElement();

    my $xpc = XML::LibXML::XPathContext->new();
    $xpc->registerNs(x => 'xxxxxxxxx');

    for ($xpc->findnodes('/x:Profile/x:Application', $root)) {
        my $Id   = $_->getAttribute('Id');
        my $CsID = (split(/\//, $Id))[-1];
        $_->setAttribute(Id => $CsID);
    }

    binmode STDOUT;
    print $doc->toString();

XML::LibXML is a bit wordier than XML::Twig (the 2 xpc lines) in order to handle namespaces correctly. (XML::Twig doesn't.)

Re^2: XML::Twig Text replacement
by mirod (Canon) on May 01, 2010 at 21:13 UTC

    Actually you can handle namespaces in XML::Twig, using the map_xmlns option. I am not sure it's worth doing in this case though (and it might be a good example of why I dislike seemingly gratuitous default namespaces, they just make processing harder while providing exactly 0 added value).
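
A minimal sketch of what map_xmlns could look like for this kind of document. The namespace URI (http://example.com/profile), the prefix p and the sample XML are made up for illustration, since the real URI is not shown in the thread; only the simple p:Application handler form is used here.

```perl
use strict;
use warnings;

use XML::Twig qw( );

# Hypothetical input: the namespace URI is invented for this sketch.
my $xml = <<'__EOI__';
<Profile xmlns="http://example.com/profile">
  <Application Id="a/b/c"/>
</Profile>
__EOI__

my @ids;

my $t = XML::Twig->new(
    # Map the (assumed) namespace URI to a prefix of our own choosing.
    map_xmlns     => { 'http://example.com/profile' => 'p' },
    twig_handlers => {
        # Elements in the mapped namespace are matched as p:name.
        'p:Application' => sub {
            my $CsID = (split(/\//, $_->att('Id')))[-1];
            $_->set_att(Id => $CsID);
            push @ids, $CsID;
        },
    },
);

$t->parse($xml);
```

The prefix used in the handlers is the one given in map_xmlns, not whatever prefix (or default namespace) the document itself happens to use.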

    Also, if you pass the id => 'Id' option to new, you can then write $Id = $_->id and $_->set_id($CsID), which I think is slightly clearer, and has the added benefit, if need be, of letting you access an element directly through its id, using the elt_id method.
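
A rough sketch of the id => 'Id' variant, reusing the handler from the parent post. The one-line sample document and the variable names $new_id and $by_id are assumptions made for the example.

```perl
use strict;
use warnings;

use XML::Twig qw( );

# Hypothetical sample document, without a namespace for simplicity.
my $xml = '<Profile><Application Id="a/b/c"/></Profile>';

my $t = XML::Twig->new(
    id            => 'Id',    # treat the Id attribute as the element id
    twig_handlers => {
        '/Profile/Application' => sub {
            my $Id   = $_->id;
            my $CsID = (split(/\//, $Id))[-1];
            $_->set_id($CsID);
        },
    },
);

$t->parse($xml);

# Read the id back from the element itself.
my $new_id = $t->root->first_child('Application')->id;

# Look the element up by its new id; may be undef if no such id is known.
my $by_id = $t->elt_id($new_id);
print defined $by_id ? "found by id: $new_id\n" : "not indexed: $new_id\n";
```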

      Ah good. I don't use XML::Twig, so I don't have a deep knowledge of it.

      If it consistently ignored namespaces when map_xmlns isn't used, it would be a great shortcut despite being non-standard, since there is rarely a need to deal with namespace conflicts. (The module never claimed its paths to be real XPaths.) Unfortunately, it doesn't consistently ignore namespaces.

      On the plus side, it works according to standard when map_xmlns is used. (Well, I'm not sure how namespaces interact with attributes, so I'm simply commenting on elements.)

      use strict;
      use warnings;

      use XML::Twig qw( );

      my $xml = <<'__EOI__';
      <?xml version="1.0" encoding="UTF-8" standalone="no" ?>
      <root xmlns:foo="uri:foo">
         <ele id="a" />
         <ele id="b" xmlns="uri:foo"/>
         <foo:ele id="c" />
      </root>
      __EOI__

      {
         my $seen = '';

         my $t = XML::Twig->new(
            twig_handlers => {
               'ele' => sub { $seen .= $_->att('id') },
            },
         );

         $t->parsestring($xml);

         print("$seen\n");
         print($seen eq 'a'                   ? "Standard\n"   : "Not standard\n");
         print($seen eq 'a' || $seen eq 'abc' ? "Consistent\n" : "Not consistent\n");
      }

      print("\n");

      {
         my $seen_null = '';
         my $seen_foo  = '';

         my $t = XML::Twig->new(
            map_xmlns => {
               'uri:foo' => 'f',
            },
            twig_handlers => {
               'ele'   => sub { $seen_null .= $_->att('id') || $_->att('f:id') },
               'f:ele' => sub { $seen_foo  .= $_->att('id') || $_->att('f:id') },
            },
         );

         $t->parsestring($xml);

         print("$seen_null:$seen_foo\n");
         print($seen_null eq 'a'                        ? "Standard\n"   : "Not standard\n");
         print($seen_null eq 'a' || $seen_null eq 'abc' ? "Consistent\n" : "Not consistent\n");
         print($seen_foo eq 'bc'                        ? "NS working\n" : "NS broken\n");
      }

      ab
      Not standard
      Not consistent

      a:bc
      Standard
      Consistent
      NS working

        There are actually 3 ways of dealing with namespaces, 2 of which are supported by XML::Twig:

        • proper support, as in your second example, which is correct but verbose, as you have to associate namespace URIs with prefixes; I also suspect that it is not as robust as one would expect: I don't quite trust those URIs not to change sneakily, especially for the default namespace,
        • ignoring namespace declarations, which is XML::Twig's default mode: there foo:ele is the element by that name, and ele is just ele, whether it is assigned a default namespace or not; you think of this as inconsistent,
        • dropping all namespaces, which is your first example: there foo:ele is seen as ele; I had never thought of that option. I could add it, but I strongly suspect that it would be favored only by users who care about namespaces and already use map_xmlns (or XML::LibXML!).

        For XML::Twig, consistency is not as important as convenience, and the current behaviour seems to be convenient for most users. I'll look into adding an option to just drop namespaces; it should not be too difficult, but I am not sure how useful it would be.

Re^2: XML::Twig Text replacement
by Gizmo (Novice) on May 01, 2010 at 12:16 UTC
    Thanks a lot, makes sense now. I'll try out the libXML too.