in reply to Text:Unidecode question

#!/usr/bin/perl -wl
use strict;
use HTML::Entities;
use Text::Unidecode;

my $text = "advice to Gwenda to &#8220;let sleeping murder lie.&#8221;";

print unidecode( decode_entities($text) );
# advice to Gwenda to "let sleeping murder lie."

(i.e., decode_entities() returns a Unicode string, which you then pass to unidecode() to transliterate to ASCII)

Alternatively, with home-brewed decoding (chr is the function you were looking for):

#!/usr/bin/perl -wl
use strict;
use Text::Unidecode;

my $text = "advice to Gwenda to &#8220;let sleeping murder lie.&#8221;";

# Replace each decimal numeric entity with the character it names
$text =~ s/&#(\d+);/chr $1/ge;

print unidecode($text);
# advice to Gwenda to "let sleeping murder lie."
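
Note that the home-brewed substitution only matches decimal references such as &#8220;. If the input might also contain hexadecimal references such as &#x201C;, something along these lines may help. This is only a sketch: decode_numeric_entities is a hypothetical helper name, not part of either module, and it deliberately leaves named entities like &amp; alone (decode_entities() from HTML::Entities covers those).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not from HTML::Entities or
# Text::Unidecode): decode decimal (&#8220;) and hexadecimal (&#x201C;)
# numeric character references. Named entities are left untouched.
sub decode_numeric_entities {
    my ($text) = @_;
    # $1 is set for the hex form, $2 for the decimal form
    $text =~ s/&#(?:[xX]([0-9a-fA-F]+)|(\d+));/chr(defined $1 ? hex($1) : $2)/ge;
    return $text;
}

my $decoded = decode_numeric_entities('&#8220;let sleeping murder lie.&#x201D;');

# Show the code points of the first and last characters instead of printing
# the wide characters themselves
printf "first: U+%04X, last: U+%04X\n",
    ord( substr $decoded, 0, 1 ),
    ord( substr $decoded, -1 );
```

The result is an ordinary Perl character string, so it can be passed to unidecode() in the same way as in the scripts above.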

Re^2: Text:Unidecode question
by Nodonomy (Novice) on Jan 28, 2011 at 14:44 UTC
    Yes! The first solution is all that I have tried at this point; i.e., unidecode(decode_entities($text)). It works like a charm indeed. Thank you "Anonymous Monk" specifically and Perl Monks in general. What a nice solution, thanks to those who wrote the two modules HTML::Entities and Text::Unidecode. -Node from Nodonomy