in reply to XML::Simple bug? aka I want the whitespace dude!

What you're seeing is definitely by design.

I originally wrote XML::Simple specifically for reading (and later writing) config files in XML format. It proved to be useful for other simple XML tasks too. However, it was never intended to be 'the one and only Perl module you'll ever need for working with XML'.

Personally, for most tasks that involve reading XML, I tend to use XML::LibXML, and often it requires less code than XML::Simple would have, even for simple things (yay XPath!). For writing XML, I tend to use the Template Toolkit or HTML::Mason. There is more advice in the Perl XML FAQ.
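
As a minimal sketch of the XPath approach (assuming XML::LibXML is installed; the element names `config`, `name` and `greeting` are made up for illustration), reading values from a small document, including a whitespace-only one, might look something like this:

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical config document; the element names are illustrative only.
my $xml = '<config><name>demo</name><greeting> </greeting></config>';

my $doc = XML::LibXML->load_xml(string => $xml);

# findvalue() returns the string value of whatever the XPath expression
# matches; the text content is not trimmed along the way.
my $name     = $doc->findvalue('/config/name');
my $greeting = $doc->findvalue('/config/greeting');

printf "name=[%s] greeting=[%s]\n", $name, $greeting;
```

With the parser's default settings, whitespace-only text nodes are kept, so no extra option should be needed for the `greeting` case.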

Re^2: XML::Simple bug? aka I want the whitespace dude!
by Jenda (Abbot) on Aug 23, 2005 at 17:20 UTC

    OK, and would it hurt anything to preserve the whitespace in the case of tags with no children? Of course you would not want to keep the whitespace for

    <foo>
      <bar>x</bar>
      <baz>y</baz>
    </foo>

    because that would make a fairly big difference in the results, but for <foo> </foo>? The only difference is that you get  ..., foo => ' ', ... instead of  ..., foo => '', ... which would actually make it consistent with the handling of <foo> whitespace preserved  </foo>. What I have IS basically a config file, but I need to preserve the whitespace, even if it's the only content of an option.

    The whole change necessary in the module would be

    line 925
    << next if($val =~ m{^\s*$}s); # Skip all whitespace content
    >> next if (($self->{opt}->{suppressempty} or %$attr) and $val =~ m{^\s*$}s); # Skip all whitespace content
    line 956
    >> if (!$self->{opt}->{suppressempty} and scalar(keys %$attr) > 1 and $attr->{$self->{opt}->{contentkey}} =~ m{^\s*$}s) {
    >>     delete $attr->{$self->{opt}->{contentkey}};
    >> }

    Jenda
    XML sucks. Badly. SOAP on the other hand is the most powerful vacuum pump ever invented.

      would it hurt anything to preserve the whitespace in case of tags with no children?

      This is going to sound rude and uncaring (which is unfortunate because I try not to be either) but ... Yes, it would hurt.

      You're talking about changing the default behaviour. In a module that's been around as long as this one has, that is certain to break a lot of scripts. Just by way of example, it broke 21 tests in the test suite that ships with the module. It also introduced almost 2000 warning messages during make test.

      Now obviously it would be possible to clean up the warnings and add another option so the default behaviour was not affected, but that's not really going to fly either. XML::Simple already has far too many options. The claim to the name 'Simple' was lost years ago. I regularly reject requests to add 'one simple option' because I don't want to make matters worse.

      The reality is that XML::LibXML is a powerful and flexible module that can do what you want. You might want to put your own thin wrapper around it to simplify the things that you want to do regularly. In the end though, it will be a better solution because it will work the way you expect it to work and it won't be bogged down with options to make it work in the weird and wonderful ways other people expect.
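
      To give a rough idea of what such a thin wrapper could look like, here is a sketch only: `read_flat_config` is a hypothetical helper name, and it assumes a flat config where every child of the root element is a simple option.

      ```perl
      use strict;
      use warnings;
      use XML::LibXML;

      # Hypothetical helper: read a flat config document into a hash,
      # keeping each option's content exactly as written.
      sub read_flat_config {
          my ($xml) = @_;
          my $doc = XML::LibXML->load_xml(string => $xml);
          my %config;
          # Each element child of the root becomes one key/value pair.
          for my $node ($doc->documentElement->findnodes('./*')) {
              $config{ $node->nodeName } = $node->textContent;
          }
          return \%config;
      }

      my $config = read_flat_config('<opt><sep> </sep><name>x</name></opt>');
      ```

      Because the wrapper only does what you need, a whitespace-only option such as `sep` comes through untouched without any extra switches.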

      Sorry if I sound grumpy.

        Thanks for your comments. I should have tried the tests, sorry. If by the 2000 warning messages you mean

        Use of uninitialized value in concatenation (.) or string at D:/Perl/site/lib/XML/SAX/Expat.pm line 198.
        then I do get those with the original version as well! XML::SAX::Expat ver. 0.35.

        In this case I don't think there is a need for a new option, at most new values for NormaliseSpace. Let's say -1 = keep the whitespace if there are no subtags and -2 = keep all whitespace.

        # against original version
        line 925
        << next if($val =~ m{^\s*$}s); # Skip all whitespace content
        >> next if ($self->{opt}->{normalisespace} >= 0 and $val =~ m{^\s*$}s); # Skip all whitespace content
        line 992
        >> if ($self->{opt}->{normalisespace} == -1 and ref($attr->{$self->{opt}->{contentkey}}) eq 'ARRAY'
        >>     and !grep( !m{^\s*$}s, @{$attr->{$self->{opt}->{contentkey}}})) {
        >>     delete $attr->{$self->{opt}->{contentkey}};
        >> }
        All tests pass this time.

        It would be great if this patch was accepted, otherwise I'll just have to use the tweaked XML::Simple as the "thin wrapper", instead of wasting time writing my own :-)

        Jenda
        XML sucks. Badly. SOAP on the other hand is the most powerful vacuum pump ever invented.