Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I have a file that contains a single line (the original file actually has multiple lines), like this:
Thomas Ack\x26#39\x3Bhing
The file was generated by WWW::Mechanize, and I want to decode the string above, so I wrote this script:

#!/usr/bin/perl
use strict;
use warnings;
use HTML::Entities;

open my $fh, '<', 'source.html' or die $!;
my $string = do { local $/; <$fh> };
print decode_entities($string);

It failed, and I'm not sure why.

But when I tried this code (with the string placed inside the script), it worked:

#!/usr/bin/perl
use strict;
use warnings;
use HTML::Entities;

my $string = "Thomas Ack\x26#39\x3Bhing";
print decode_entities($string);
# Output: Thomas Ack'hing

Does anyone have a solution for this problem?

Replies are listed 'Best First'.
Re: decode characters from file using HTML::Entities
by ikegami (Patriarch) on Feb 18, 2011 at 16:18 UTC

    There are no HTML entities in that file. HTML entities start with "&", but there is no "&" in that file. Where one would expect a "&", there are the four characters "\","x","2","6".

    The second snippet decodes an entirely different string. Whereas you decode «Thomas Ack\x26#39\x3Bhing» in the first snippet, you construct and decode «Thomas Ack&#39;hing» in the second snippet.
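
The difference between the two strings can be reproduced without a file. The following is a minimal sketch, assuming that a single-quoted literal is a fair stand-in for what gets read from source.html; the variable names $from_file and $literal are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Entities;

# Single quotes: \x26 and \x3B stay as literal backslash sequences,
# which mirrors the bytes stored in the file.
my $from_file = 'Thomas Ack\x26#39\x3Bhing';

# Double quotes: Perl turns \x26 into "&" and \x3B into ";"
# before decode_entities ever sees the string.
my $literal = "Thomas Ack\x26#39\x3Bhing";

print length($from_file), "\n";           # 25
print length($literal), "\n";             # 19
print decode_entities($from_file), "\n";  # unchanged, there is no "&" to decode
print decode_entities($literal), "\n";    # Thomas Ack'hing
```

The two strings differ in length because each four-character escape in the first one corresponds to a single character in the second.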

      Ah, I understand (maybe :)).

      I think I forgot about double quotes (so \x26 is actually "&"), correct?

      So I guess the right tool to use is Encode...

        Yes, the Perl string literal «"\x26"» evaluates to the string «&».

        No idea what Encode has to do with this.
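
One possible way to handle the file contents, following from the explanation above, is to convert the literal escapes into characters first and only then decode the entities. This is a sketch under assumptions: it assumes every escape in the file has the two-digit form \xHH seen in the sample, and unescape_hex is a hypothetical helper name, not part of HTML::Entities. Other escape styles, if the file has any, are not handled.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Entities;

# Hypothetical helper: replace each literal \xHH (backslash, "x",
# two hex digits) with the character that has that code.
sub unescape_hex {
    my ($text) = @_;
    $text =~ s/\\x([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

# Stand-in for a line read from source.html.
my $raw = 'Thomas Ack\x26#39\x3Bhing';

# After unescaping, the string holds a real "&#39;" entity to decode.
my $decoded = decode_entities(unescape_hex($raw));
print "$decoded\n";   # Thomas Ack'hing
```

To apply this to the real file, the same two calls could be run on the string read from the filehandle in the original script.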