Re^3: Reading in utf-8 txt file gives garbled data when printed as part of utf-8 html...

f\303\266\303\266\n is UTF-8 encoded.
If it's a string of chars (the UTF-8 flag is set), you'll get UTF-8 when you print to a UTF-8 filehandle.
If it's a string of octets (the UTF-8 flag is clear), you'll get UTF-8 when you print to a raw filehandle.

f\x{f6}\x{f6}\n is iso-latin-1 encoded.
When you print to a UTF-8 filehandle, Perl will assume it's iso-latin-1 and convert it to UTF-8.
When you print to a raw filehandle, you'll get those exact octets.
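
A minimal sketch of the four cases above, not taken from the original post. It assumes that "string of chars" means decoded text (obtained here with Encode's decode) and it writes to in-memory filehandles so the resulting octets can be inspected. The helper name octets_written is hypothetical.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: print $str to an in-memory handle opened with the
# given layer and return the octets that ended up in the "file".
sub octets_written {
    my ($layer, $str) = @_;
    my $buf = '';
    open(my $fh, ">$layer", \$buf) or die "open: $!";
    print {$fh} $str;
    close($fh) or die "close: $!";
    return $buf;
}

my $octets = "f\303\266\303\266\n";       # UTF-8 encoded octets
my $chars  = decode('UTF-8', $octets);    # decoded text: f, U+00F6, U+00F6, newline
my $latin1 = "f\x{f6}\x{f6}\n";           # iso-latin-1 octets

my $from_chars       = octets_written(':utf8', $chars);
my $from_octets      = octets_written(':raw',  $octets);
my $from_latin1_utf8 = octets_written(':utf8', $latin1);
my $from_latin1_raw  = octets_written(':raw',  $latin1);

# Hex dump of what each handle received.
print unpack('H*', $from_chars),       "\n";   # 66c3b6c3b60a
print unpack('H*', $from_octets),      "\n";   # 66c3b6c3b60a
print unpack('H*', $from_latin1_utf8), "\n";   # 66c3b6c3b60a
print unpack('H*', $from_latin1_raw),  "\n";   # 66f6f60a
```

The first three lines of output are identical UTF-8. Only the last case, iso-latin-1 data sent to a raw handle, leaves the 0xF6 octets untouched.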

Re^4: Reading in utf-8 txt file gives garbled data when printed as part of utf-8 html... by isync (Hermit) on Aug 28, 2007 at 09:54 UTC
That made everything a lot clearer and the $Data::Dumper::Useqq switch is EXTREMELY helpful! Thanks!
Re^5: Reading in utf-8 txt file gives garbled data when printed as part of utf-8 html... by Anonymous Monk on Apr 21, 2009 at 23:49 UTC
You can also use `use utf8;`. You don't have to binmode the handle, as all strings, input and output, will be treated according to Perl's lax utf8 interpretation.
Re^6: Reading in utf-8 txt file gives garbled data when printed as part of utf-8 html... by ikegami (Patriarch) on Apr 22, 2009 at 00:19 UTC
`use utf8;` doesn't remove the need to binmode the handles.

```
use utf8;

print($fh chr(0x40));    # Happens to work
print($fh chr(0xC9));    # Generates broken output
print($fh chr(0x2660));  # Warns

binmode($fh, ':utf8');

print($fh chr(0x40));    # Ok
print($fh chr(0xC9));    # Ok
print($fh chr(0x2660));  # Ok
```

All it does is let Perl know the source is encoded using UTF-8.
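
A minimal sketch of what "the source is encoded using UTF-8" means in practice. It is not from the original reply. It assumes the script file itself is saved as UTF-8, and it writes to in-memory filehandles so the output octets can be checked. The variable names are illustrative only.

```perl
use strict;
use warnings;

my ($with_pragma, $without_pragma);
{
    use utf8;                          # literals in this block are decoded from UTF-8
    $with_pragma = length("ö");        # one character, U+00F6
}
{
    no utf8;                           # literals in this block are raw octets
    $without_pragma = length("ö");     # two octets, 0xC3 0xB6
}
print "$with_pragma $without_pragma\n";    # 1 2

# The pragma only affects how the source is read. The handle still
# needs an encoding layer to produce UTF-8 on output.
my $char = do { use utf8; "ö" };

my $buf_plain = '';
open(my $plain, '>', \$buf_plain) or die "open: $!";
print {$plain} $char;                  # no layer: written as the single octet 0xF6
close($plain) or die "close: $!";

my $buf_utf8 = '';
open(my $layered, '>', \$buf_utf8) or die "open: $!";
binmode($layered, ':utf8');
print {$layered} $char;                # with the layer: written as 0xC3 0xB6
close($layered) or die "close: $!";

print unpack('H*', $buf_plain), "\n";  # f6
print unpack('H*', $buf_utf8),  "\n";  # c3b6
```

The first handle receives a lone 0xF6 octet, which is not valid UTF-8. That matches the "generates broken output" case for chr(0xC9) above.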