John M. Dlugosz has asked for the wisdom of the Perl Monks concerning the following question:

When I've looked into this kind of thing before, the answer was "don't even try to format your XML. Use a specialized editor to view it."

But now I have a good reason. I want to change one value in an XML file, but not confuse the version control system. When I look at the deltas in git, I want to clearly see that one value changed, not have it tell me that the whole file is totally different, or that the file only has one (enormously long) line.

I'm only changing one value in an XML file that I normally just read, and that's created by the user in-person. So is there some kind of filter approach that will let me update one value while preserving everything, even whitespace, in the file otherwise?

—John

  • Comment on Updating XML while preserving formatting

Re: Updating XML while preserving formatting
by Your Mother (Archbishop) on May 07, 2009 at 17:37 UTC

    This works (mostly); I found indentation on the root node breaks it but it might be right for what you're doing.

    use strict;
    use warnings;
    use XML::LibXML;
    use Test::More tests => 2;

    my $parser = XML::LibXML->new();
    $parser->keep_blanks(1);

    my $dom_one = $parser->parse_string(xml_one());
    is( $dom_one->serialize(), xml_one(),
        "Serialized xml_one is the same as original" );

    my $dom_two = $parser->parse_string(xml_two());
    is( $dom_two->serialize(), xml_two(),
        "Serialized xml_two is the same as original" );

    sub xml_one {
    <<"";
    <?xml version="1.0"?>
    <stuff>
      <andjunk>
        muhuminah
      </andjunk>
    </stuff>

    }

    sub xml_two {
    <<"";
    <?xml version="1.0"?>
    <stuff>
      <andjunk>
     muhuminah</andjunk>
    </stuff>

    }

    __END__
    1..2
    ok 1 - Serialized xml_one is the same as original
    ok 2 - Serialized xml_two is the same as original
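    The test above only checks the round trip. As a minimal sketch of the actual update, assuming the same parser settings, something like the following might work. The sample document, its element names, and the XPath expression are all made up for illustration.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample document; element names are invented for this sketch.
my $xml = <<'XML';
<?xml version="1.0"?>
<config>
  <name>demo</name>
  <port>8080</port>
</config>
XML

my $parser = XML::LibXML->new();
$parser->keep_blanks(1);    # same setting as in the test above
my $dom = $parser->parse_string($xml);

# Locate the single element to change and replace its text content.
my ($port) = $dom->findnodes('/config/port');
$port->removeChildNodes();
$port->appendText('9090');

# Serialize the document; text nodes that were not touched are written back unchanged.
my $updated = $dom->toString();
print $updated;
```

    Diffing the result against the original should show only the `<port>` line. Details that libxml2 normalizes on output, such as the quote style of attributes, would still show up as changes, so it is worth checking a real file before relying on this.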
      my $parser = XML::LibXML->new(); $parser->keep_blanks(1);
      Awesome. Another reason to wish XML::LibXML was available on this platform.
Re: Updating XML while preserving formatting
by mirod (Canon) on May 08, 2009 at 04:34 UTC

    You can use XML::Twig with the twig_roots and twig_print_outside_roots options:

    XML::Twig->new(
        twig_roots => {
            q{elt[@id="id1"]} => sub { $_->set_text("new_val")->print },
        },
        twig_print_outside_roots => 1,
    )->parsefile("my_xml");

    If you have control over the original XML, you may also want to have a look at pretty_print with the cvs option, which is especially designed to make XML friendly to line-oriented tools like source control systems.
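    A minimal sketch of that second suggestion, assuming a made-up one-line document (the element names are hypothetical). It re-serializes the XML with the cvs style once, so that later edits produce small line-based diffs:

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical one-line document, as a tool might write it.
my $xml = '<config><name>demo</name><port>8080</port></config>';

# Parse with the 'cvs' pretty_print style selected for output.
my $twig = XML::Twig->new( pretty_print => 'cvs' );
$twig->parse($xml);

# sprint returns the formatted document as a string instead of printing it.
my $formatted = $twig->sprint;
print $formatted;
```

    This reformats the whole file, so it would go into version control as one separate "reformat only" commit, before any change to the values.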