in reply to Re: Interventionist Unicode Behaviors
in thread Interventionist Unicode Behaviors
Huh? In your snippet, perl is "just printing whatever gets thrown at it", without doing any sort of "translation" on it.
It's trying and failing to convert Unicode code point U+263A to Latin-1. We see the warning because a code point that high cannot be represented in Latin-1, which only covers 0x00 through 0xFF.
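To make the "too high for Latin-1" point concrete, here is a minimal sketch that encodes the same code point explicitly with Encode instead of leaving it to print. The variable names are mine, purely for illustration, and this only shows what explicit encoding yields, not what print does internally.

```perl
use strict;
use warnings;
use Encode qw( encode );

# One character, code point U+263A (WHITE SMILING FACE).
my $smiley = "\x{263a}";

# As UTF-8 it takes three bytes.
my $utf8_hex = unpack "H*", encode('UTF-8', $smiley);

# Latin-1 has no slot for it, so with the default CHECK value
# Encode falls back to the substitution character "?".
my $latin1_hex = unpack "H*", encode('iso-8859-1', $smiley);

print "UTF-8:   $utf8_hex\n";    # e298ba
print "Latin-1: $latin1_hex\n";  # 3f
```

Encoding explicitly, or putting an `:encoding(UTF-8)` layer on the handle, is the usual way to avoid the warning altogether.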
I thought the example I gave was the easiest to grok, but this is probably better, because the output is actually different.
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Encode qw( _utf8_on );

    my $resume = "r\xc3\xa9sum\xc3\xa9";
    print $resume, "\n";
    _utf8_on($resume);
    print $resume, "\n";
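For anyone who wants to see why the two lines of output differ without squinting at a terminal, the following sketch inspects the scalar before and after the flag is flipped. It assumes that print on a handle with no encoding layer writes characters below 0x100 as single Latin-1 bytes, and it approximates that step with an explicit encode; the extra variable names are illustrative only.

```perl
use strict;
use warnings;
use Encode qw( _utf8_on encode );

my $resume = "r\xc3\xa9sum\xc3\xa9";

# Flag off: Perl sees eight one-byte characters.
my $len_before = length $resume;
my $hex_before = unpack "H*", $resume;

# Flag on: the same buffer is now read as six characters.
_utf8_on($resume);
my $len_after = length $resume;

# Approximation of what an unlayered print emits for the flagged string.
my $hex_after = unpack "H*", encode('iso-8859-1', $resume);

print "before: $len_before chars, bytes $hex_before\n";  # 8, 72c3a973756dc3a9
print "after:  $len_after chars, bytes $hex_after\n";    # 6, 72e973756de9
```

The buffer never changes; only Perl's interpretation of it does, which is why the second print comes out as Latin-1 rather than UTF-8.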
Conceptually, appending a non-UTF8 string to a UTF8 string is a really bad idea, bordering on stupid. Don't do that. (Why would you want to? What would you hope to accomplish as a result?)
I'd like to spit out scalars flagged as UTF8 by default from KinoSearch. But if I do that, anybody who gets that output is going to have to know how to deal with them. I don't want to spend all my time explaining the bottomless intricacies of Unicode handling in Perl to people. It's not that I want to be doing a lot of this concatenation; it's that I know it's going to happen some of the time, and I don't want the support burden.
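Here is a small sketch of the kind of accident I mean. `$flagged` stands in for a scalar handed back with the UTF8 flag on, and `$raw` for UTF-8 bytes the user read from somewhere without decoding; both names are hypothetical and nothing here is actual KinoSearch API.

```perl
use strict;
use warnings;
use Encode qw( decode encode );

my $raw     = "r\xc3\xa9sum\xc3\xa9";    # UTF-8 bytes, flag off
my $flagged = decode('UTF-8', $raw);     # six characters, flag on

# Naive concatenation: Perl upgrades $raw as if it were Latin-1,
# so every byte of it becomes a character of its own.
my $mixed     = $flagged . $raw;
my $mixed_len = length $mixed;                      # 6 + 8
my $mixed_out = length encode('UTF-8', $mixed);     # second half double-encoded

# Decoding first keeps both halves as characters.
my $fixed     = $flagged . decode('UTF-8', $raw);
my $fixed_len = length $fixed;                      # 6 + 6

print "mixed: $mixed_len chars, $mixed_out bytes when encoded\n";  # 14, 20
print "fixed: $fixed_len chars\n";                                 # 12
```

Nothing warns in the naive case, which is exactly why I expect it to turn into support questions.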