http://qs1969.pair.com?node_id=951131


in reply to Character encoding woes - unicode or not?

    use Encode qw(decode);

    # Try strict UTF-8 first; if the bytes are not valid UTF-8,
    # fall back to treating them as Latin-1.
    sub decode_it {
        my $s = shift;
        eval {
            # CHECK = 1 is Encode::FB_CROAK: die on malformed input
            $s = decode('UTF-8', $s, 1);
            1;
        } or do {
            $s = decode('latin1', $s, 1);
        };
        return $s;
    }

    use Devel::Peek qw(Dump);
    Dump decode_it($_) for "\xE2\x84\xA2", "\xE7\xE1";

    __END__
    SV = PV(0x100820708) at 0x10081c248
      REFCNT = 1
      FLAGS = (TEMP,POK,pPOK,UTF8)
      PV = 0x100202140 "\342\204\242"\0 [UTF8 "\x{2122}"]
      CUR = 3
      LEN = 8
    SV = PV(0x100820748) at 0x100860ee0
      REFCNT = 1
      FLAGS = (TEMP,POK,pPOK,UTF8)
      PV = 0x10025dec0 "\303\247\303\241"\0 [UTF8 "\x{e7}\x{e1}"]
      CUR = 4
      LEN = 8

Once decoded by decode_it, your strings are character strings regardless of whether the input bytes were UTF-8 or Latin-1. They are then ready to be UTF-8 encoded right before you write them out to your web page.
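For the output side, here is a minimal sketch of that last step, assuming the page is served as UTF-8. It repeats the decode_it fallback so it runs on its own (with Encode::LEAVE_SRC added so the input is never modified in place). The names page_bytes and @inputs are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Same idea as decode_it above: strict UTF-8 first, Latin-1 as fallback.
# LEAVE_SRC keeps decode() from altering $s when a CHECK value is given.
sub decode_it {
    my $s = shift;
    my $chars = eval {
        decode('UTF-8', $s, Encode::FB_CROAK | Encode::LEAVE_SRC);
    };
    $chars = decode('latin1', $s) unless defined $chars;
    return $chars;
}

# Hypothetical helper: turn a decoded character string back into
# UTF-8 bytes just before it is written out.
sub page_bytes {
    my $chars = shift;
    return encode('UTF-8', $chars);
}

my @inputs = ("\xE2\x84\xA2", "\xE7\xE1");   # UTF-8 bytes, Latin-1 bytes

binmode(STDOUT);                             # raw bytes; we encode explicitly
print page_bytes(decode_it($_)), "\n" for @inputs;
```

The first input is already UTF-8, so it comes out byte-for-byte unchanged (E2 84 A2). The second is Latin-1, so it comes out as C3 A7 C3 A1. Instead of calling encode yourself, you could put an :encoding(UTF-8) layer on the filehandle with binmode and print the character strings directly. Do one or the other, not both, or the output will be double-encoded.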