in reply to Re^2: Never touch (or look at) the UTF8 flag!!
in thread Interventionist Unicode Behaviors
> If this piece of code was in a module, and the main script disagreed about the encoding, things would have broken. For example, if "use encoding 'iso-8859-1';" was used: you'd get a UTF8 string of three characters, but six bytes.
>
> That's why you should never use \x for literal bytes. Instead, use pack with a "C*" template. Only if nothing in your script uses the new stuff, you can be sure you get the old stuff.

This is really a side issue, because as I've stressed, the hex notation was a means to an end. All I wanted was a scalar with a particular sequence of bytes in the PV, and I'd have been just as happy to have gotten it with pack, as you advocate.
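As a minimal sketch of the pack route mentioned above: the snippet builds the same three octets (E2 98 BA) with a "C*" template and then turns them into a character string with Encode's decode. The variable names are only illustrative, and decoding is shown here as the conventional way to go from octets to characters, not as what the original code did.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Build the three octets E2 98 BA with pack instead of "\x" escapes.
my $octets = pack("C*", 0xE2, 0x98, 0xBA);

# Still a byte string at this point: one element per octet.
printf "octets: length=%d\n", length($octets);

# Interpret the octets as UTF-8 to obtain a one-character string.
my $smiley = decode("UTF-8", $octets);
printf "chars:  length=%d, code point=U+%04X\n",
    length($smiley), ord($smiley);
```

The byte string has length 3 and the decoded string has length 1, with code point U+263A.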
Nevertheless, I have not yet found a way to make the interpolated backslash-x notation misbehave as you suggest it should. Can you please indicate how to modify this code sample so that it illustrates your assertion?
    slothbear:~/perltest marvin$ cat BackslashX.pm
    package BackslashX;
    use strict;
    use warnings;

    use Encode '_utf8_on';

    our $smiley = "\xE2\x98\xBA";
    _utf8_on($smiley);

    1;

    slothbear:~/perltest marvin$ cat backslash_x.plx
    #!/usr/bin/perl
    use strict;
    use warnings;

    use encoding 'iso-8859-1';

    use BackslashX;
    use Devel::Peek;

    Dump($BackslashX::smiley);
    print $BackslashX::smiley;
    print "\n";

    slothbear:~/perltest marvin$ perl backslash_x.plx
    SV = PV(0x1834224) at 0x181ed98
      REFCNT = 1
      FLAGS = (POK,pPOK,UTF8)
      PV = 0x372010 "\342\230\272"\0 [UTF8 "\x{263a}"]
      CUR = 3
      LEN = 4
    "\x{263a}" does not map to iso-8859-1 at backslash_x.plx line 11.
    \x{263a}

That's clearly broken, but only because Unicode code point 0x263a doesn't map to Latin-1. How do I get the 6-byte combo?
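For comparison, a "three characters, six bytes" scalar can be produced on purpose by treating each octet as a Latin-1 character and upgrading the string. The sketch below does that explicitly with utf8::upgrade. It is an assumption that this is the state the quoted text has in mind, and it does not show that `use encoding` produces it for the module above.

```perl
use strict;
use warnings;

# The same three octets as in the transcript, built with pack.
my $s = pack("C*", 0xE2, 0x98, 0xBA);

# Switch the internal representation to UTF-8. Each octet is treated as
# a Latin-1 character, so the string still holds three characters.
utf8::upgrade($s);

my $chars = length $s;                     # character count
my $bytes = do { use bytes; length $s };   # size of the internal buffer

printf "%d characters, %d bytes\n", $chars, $bytes;
```

This prints "3 characters, 6 bytes", since each of the three characters above 0x7F takes two bytes in the internal UTF-8 form. The bytes pragma is used here only to inspect the buffer size.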
> For someone who's sufficiently skilled in Perl, unicode, and the combination of both, you managed to appear quite clueless in the OP. But now I wonder if you were actually serious (if so, please rephrase your question, this time based on the way you SHOULD use things), or just trolling.
Trolling? On the contrary: I'm doing my best to keep this discussion low-key despite some rather provocative remarks about my competence that have gone by. :)
Re^4: Never touch (or look at) the UTF8 flag!!
by Juerd (Abbot) on Sep 11, 2006 at 22:11 UTC