in reply to Re^3: Character encoding of microns
in thread Character encoding of microns
If I run your code with the micron encoded as \x{C2}\x{B5}, then just using decode('utf8',$clob) seems to work, as you can see from the first set of clob/conv strings below, after the bytes stuff.
However, if I actually type a micron into the string using Alt-0181, then I get the second set of output below. Note that I turned use diagnostics on.

clob: 74:68:69:73:20:69:73:20:73:74:72:69:6E:67:20:77:69:74:68:20:C2:B5:20:69:6E:20:69:74 -- byte
conv: 74:68:69:73:20:69:73:20:73:74:72:69:6E:67:20:77:69:74:68:20:C2:B5:20:69:6E:20:69:74 -- utf8
unix perlio
clob: 'this is string with µ in it'
conv: 'this is string with µ in it'
unix perlio encoding(utf8) utf8
clob: 'this is string with õ in it'
conv: 'this is string with µ in it'

clob: 74:68:69:73:20:69:73:20:73:74:72:69:6E:67:20:77:69:74:68:20:B5:20:69:6E:20:69:74 -- byte
conv: 74:68:69:73:20:69:73:20:73:74:72:69:6E:67:20:77:69:74:68:20:EF:BF:BD:20:69:6E:20:69:74 -- utf8
unix perlio
clob: 'this is string with µ in it'
Wide character in print at 742047.pl line 19 (#1)
(W utf8) Perl met a wide character (>255) when it wasn't expecting one. This warning is by default on for I/O (like print). The easiest way to quiet this warning is simply to add the :utf8 layer to the output, e.g. binmode STDOUT, ':utf8'. Another way to turn off the warning is to add no warnings 'utf8'; but that is often closer to cheating. In general, you are supposed to explicitly mark the filehandle with an encoding, see open and perlfunc/binmode.
conv: 'this is string with � in it'
unix perlio encoding(utf8) utf8
clob: 'this is string with µ in it'
conv: 'this is string with � in it'
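The two hex dumps differ only in the micron bytes: C2:B5 in the first set, a lone B5 in the second, which comes back from decode as EF:BF:BD. A minimal sketch of how that can happen, assuming the Alt-0181 character was saved as a single Latin-1/cp1252 byte (the hexdump helper and the variable names are made up for illustration and are not from the code in this thread):

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: colon-separated hex dump of a byte string.
sub hexdump { join ':', map { sprintf '%02X', ord($_) } split //, $_[0] }

my $utf8_bytes   = "\xC2\xB5";   # micron stored as two UTF-8 bytes
my $latin1_bytes = "\xB5";       # micron stored as one byte (assumed Latin-1/cp1252)

my $ok  = decode('utf8', $utf8_bytes);     # valid UTF-8, gives U+00B5
my $bad = decode('utf8', $latin1_bytes);   # not valid UTF-8, gives U+FFFD

printf "U+%04X\n", ord $ok;                  # U+00B5
printf "U+%04X\n", ord $bad;                 # U+FFFD
print hexdump(encode('UTF-8', $bad)), "\n";  # EF:BF:BD

# Decoding the single byte with the encoding it was (presumably) written in
# recovers the micron.
my $fixed = decode('iso-8859-1', $latin1_bytes);
printf "U+%04X\n", ord $fixed;               # U+00B5
```

If that assumption holds, the replacement character in the second conv string is what decode('utf8', ...) substitutes for a byte sequence that is not valid UTF-8.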
That last conv string is, I assume, your splodge? Perhaps then, as no question marks are being output, this is not an encoding problem at all?
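One way to test that guess is to decode strictly and see whether the bytes are valid UTF-8 at all. This is only a sketch: decode_guess is a hypothetical helper, and falling back to cp1252 is an assumption about where the single B5 byte came from, not something confirmed here.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper: decode as UTF-8 when the bytes are valid UTF-8,
# otherwise fall back to cp1252 (an assumed source encoding).
sub decode_guess {
    my ($bytes) = @_;
    # FB_CROAK makes decode die on malformed input instead of inserting U+FFFD;
    # LEAVE_SRC keeps $bytes untouched so it can be decoded again.
    my $text = eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC) };
    return defined $text ? $text : decode('cp1252', $bytes);
}

# Mark the output handle with an encoding, as the diagnostics text suggests,
# so printing decoded text does not warn about wide characters.
binmode STDOUT, ':encoding(UTF-8)';

for my $clob ("this is string with \xC2\xB5 in it",
              "this is string with \xB5 in it") {
    my $conv = decode_guess($clob);
    print "conv: '$conv'\n";
}
```

With both inputs the micron should come out as U+00B5, which would point to the stored byte encoding rather than the output layer as the thing to check.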
I honestly do appreciate all your time.
Joe.
Re^5: Character encoding of microns
by almut (Canon) on Feb 10, 2009 at 14:56 UTC | |
by joec_ (Scribe) on Feb 12, 2009 at 09:27 UTC | |
by graff (Chancellor) on Feb 13, 2009 at 06:10 UTC |