Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I am using XML::Parser with the Tree style. However, this input:

<tag1> <tag2 attr1 = "attrval" attr2 = "attrval"> </tag2> </tag1>

produces entries of 0 followed by "\n", because the end tags are on the next line. These show up alongside the regular entries, such as "tag" followed by \@array, 0 followed by "text", or \%hash. Is there any way (an option to XML::Parser) to avoid the 0, "\n" entries?

Thanks

Saumil

Edited by mirod: added tags

Re: XML::Parser and white spaces
by mirod (Canon) on Aug 01, 2002 at 23:43 UTC

    There is no way to do this with XML::Parser. Most XML modules include whitespace in the data they create, because the parser cannot know whether it is significant or not (whitespace inside pre tags is significant in XHTML, for example).

    You can deal with this by using XPath to navigate the XML structure (with XML::XPath or XML::LibXML). Some "Perlish" modules, like XML::Simple or XML::Twig, will also remove the whitespace.
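
If switching modules is not an option, the whitespace-only pairs can also be dropped after parsing. The following is only a minimal sketch, assuming the Tree layout described in the XML::Parser documentation (an attribute hash first, then pairs of either tag and content array, or 0 and text). The name `strip_blank_text` is hypothetical, and the tree literal is hand-written to mimic what the parser might return for the XML in the question; it is not actual parser output.

```perl
use strict;
use warnings;

# Hypothetical helper: recursively remove (0, $text) pairs whose text is
# whitespace-only from a Tree-style content array, modifying it in place.
# Assumed layout: [ \%attrs, $tag => \@content, 0 => $text, ... ]
sub strip_blank_text {
    my ($content) = @_;
    my @kept = ( $content->[0] );    # the attribute hash always comes first

    for ( my $i = 1 ; $i < $#$content ; $i += 2 ) {
        my ( $tag, $value ) = @{$content}[ $i, $i + 1 ];
        if ( $tag eq '0' ) {
            next unless $value =~ /\S/;    # skip blank text nodes
        }
        else {
            strip_blank_text($value);      # descend into the child element
        }
        push @kept, $tag, $value;
    }

    @$content = @kept;
    return $content;
}

# Hand-written stand-in for the tree of the question's XML (an assumption).
my $tree = [
    'tag1',
    [
        {},
        0, "\n",
        'tag2', [ { attr1 => 'attrval', attr2 => 'attrval' }, 0, "\n" ],
        0, "\n",
    ],
];

strip_blank_text( $tree->[1] );
```

With a real parse, the same call would be made on the second element of whatever `parsefile` returns. Note that this also discards whitespace that might be significant, so it only suits documents where blank text between tags carries no meaning.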

Re: XML::Parser and white spaces
by Anonymous Monk on Aug 02, 2002 at 06:43 UTC
    #!/usr/bin/perl -w
    use strict;
    use XML::Parser;
    use Hook::LexWrap;

    wrap XML::Parser::Tree::Char, pre => sub {
        # pre-empt Tree's Char handler unless
        # string is nonblank
        $_[-1] = 1 unless ($_[1] =~ /\S/);
    };

    my $xml = $ARGV[0] || die "usage: mytree.pl file.xml";
    my $parser = XML::Parser->new(Style => 'Tree');
    my $tree = $parser->parsefile($xml);

    chocolateboy