in reply to XML::Twig parse html trouble

remiah:

I tried your code with XML::Twig v3.39 on perl v5.14.2, and it appears to work properly. For the data section, I used:

<html><body> <div>M&M</div> <div>M&amp;M</div> <div>M&amp;amp;M</div> </body></html>

and the output is:

<html> <head></head> <body> <div>M&amp;M</div> <div>M&amp;M</div> <div>M&amp;amp;M</div> </body> </html>
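
For anyone who wants to repeat the test, here is a minimal sketch of this kind of script. The original code is not quoted in this reply, so this assumes it calls XML::Twig's parse_html; the helper name html_to_xml and the pretty_print setting are my own illustrative choices. The parse is wrapped in eval so that a "not well-formed" failure is reported rather than killing the script.

```perl
use strict;
use warnings;
use XML::Twig;    # parse_html also needs HTML::TreeBuilder to be installed

# Hypothetical helper (not from the original post): convert an HTML string
# to XML text via XML::Twig. Returns (xml, undef) on success or
# (undef, error message) if the parse dies.
sub html_to_xml {
    my ($html) = @_;
    my $twig = XML::Twig->new( pretty_print => 'indented' );
    my $ok = eval { $twig->parse_html($html); 1 };
    return ( undef, $@ || 'unknown error' ) unless $ok;
    return ( $twig->sprint, undef );
}

# Same test data as above: a bare ampersand, an escaped one, a double-escaped one.
my $html = '<html><body> <div>M&M</div> <div>M&amp;M</div> <div>M&amp;amp;M</div> </body></html>';

my ( $xml, $err ) = html_to_xml($html);
print defined $xml ? $xml : "parse failed: $err";
```

Whether this succeeds or fails may depend on the installed module versions, so it helps to report $XML::Twig::VERSION and $HTML::TreeBuilder::VERSION along with the result.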

Update: In light of the previous reply, I'm using HTML::TreeBuilder v4.2

...roboticus

When your only tool is a hammer, all problems look like your thumb.

Re^2: XML::Twig parse html trouble
by remiah (Hermit) on Feb 20, 2014 at 20:36 UTC

    Thanks for the reply, roboticus!

    I forgot to report the module versions and the error message.

    >perl twigtest1.pl
    
    not well-formed (invalid token) at line 2, column 8, byte 34 at C:/stra
    rl/vendor/lib/XML/Parser.pm line 187.
     at twigtest1.pl line 6.
    
    >perl -MXML::Twig -e "print $XML::Twig::VERSION;"
    3.44
    >perl -MHTML::TreeBuilder -e "print $HTML::TreeBuilder::VERSION;"
    5.03
    
    I got the same error with your test data.
    This is very strange to me, because I have used XML::Twig as an HTML parser several times and never got an error like this. I find it hard to believe that the HTML I parsed back then contained no "character entity references"...

    I would like to try whatever older versions of the modules are available.
    Regards.