in reply to XML::Twig and whitespaces

In attributes, the behaviour shown is normal. In fact, it is required by the XML spec: see Attribute Value Normalization.

I don't think the module does that in elements, except that it discards line returns followed by spaces between two tags (getting rid of non-significant whitespace, as far as it can tell). You can turn this off using the keep_spaces option when you create the twig.
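A minimal sketch of the difference, assuming a made-up sample document (the <doc> and <item> names are only for illustration). It parses the same string twice, once with the default settings and once with keep_spaces, and prints both results in brackets so the whitespace is visible:

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample: whitespace-only text between tags, plus padded element text
my $xml = "<doc>\n  <item>  padded value  </item>\n  <item>second</item>\n</doc>";

# Default settings: whitespace that looks non-significant may be discarded
my $default = XML::Twig->new;
$default->parse($xml);

# keep_spaces => 1: ask the twig to keep whitespace as found in the source
my $kept = XML::Twig->new( keep_spaces => 1 );
$kept->parse($xml);

# Serialize both trees so the two results can be compared side by side
print "default:     [", $default->root->sprint, "]\n";
print "keep_spaces: [", $kept->root->sprint, "]\n";

# Text of the first item, as read from the keep_spaces twig
print "item text:   [", $kept->root->first_child('item')->text, "]\n";
```

If whitespace only matters inside a few elements, the keep_spaces_in option, which takes a list of element names, may be a narrower alternative.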

You could avoid normalizing attribute values by using the keep_encoding option and writing your own start tag parser (based on XML::Twig's own parser in _parse_start_tag), then passing it in through the parse_start_tag option. It's not really simple, but you are trying to do non-XML processing with an XML processor here.

Re^2: XML::Twig and whitespaces
by DJpumps (Novice) on Aug 25, 2007 at 03:38 UTC
    Hello, midod, and thanks for your quick reply. You are right (and I was wrong) about Attribute Value Normalization (see the updated reference in XML 1.0, 4th edition, at http://www.w3.org/TR/REC-xml/#AVNormalize), but this behavior should not apply to element values. With XML::Twig, however, it does. I want to be able to read element values "as is", without any manipulation applied to them, at least whitespace-wise. Is there a way to preserve whitespace, and if so, what is it? Thanks.
    -- DJpumps

      Did you try "using the keep_spaces option when you create the twig" as indicated in my previous answer? Did it not do what you want?