Jim has asked for the wisdom of the Perl Monks concerning the following question:
First, I want to convert this hexadecimal string representation of a Unicode codepoint to an integer. Easy enough:
    my $unicode_character_hexadecimal_string = '0x20ac';
    my $unicode_codepoint_integer = eval $unicode_character_hexadecimal_string;
Right?
From here, I want to get to some hexadecimal string representation of the UTF-8 encoding of this Unicode codepoint:
    '0xE2 0x82 0xAC'
or perhaps simply
    'E2 82 AC'
Then, as strange as it seems, I want to get to an analogous hexadecimal string representation of the same sequence of bytes with the most significant bit turned off (i.e., with 0x80 subtracted):
    '0x62 0x02 0x2C'
or
    '62 02 2C'
Oh, and along the way, I also want to get binary string representations of the same values:
    '11100010 10000010 10101100'
and
    '01100010 00000010 00101100'
Finally, I want to print the Unicode (UTF-8) characters alongside these various string representations.
I'm trying to generate a kind of metavalue table. You have to trust me: I really do want to do exactly what I've outlined above. I'm using Perl 5.8.8 (ActivePerl build 822).
Thanks.
Jim
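The steps outlined in the question can be sketched roughly as follows. This is one possible approach, not a definitive answer: it assumes the core Encode module is available (it ships with Perl 5.8), it swaps the string eval for the built-in hex(), which accepts an optional leading "0x", and the helper name codepoint_row and its hash keys are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: build one table row for a codepoint given as a hex string.
sub codepoint_row {
    my ($hex_string) = @_;

    # hex() tolerates a leading "0x" and avoids evaluating the string as code.
    my $codepoint = hex $hex_string;

    # Encode the character as UTF-8 and split the result into byte values.
    my $char  = chr $codepoint;
    my @bytes = unpack 'C*', encode('UTF-8', $char);

    # The same bytes with the most significant bit cleared.
    my @low7 = map { $_ & 0x7F } @bytes;

    return {
        char     => $char,
        utf8_hex => join(' ', map { sprintf '%02X', $_ } @bytes),
        low7_hex => join(' ', map { sprintf '%02X', $_ } @low7),
        utf8_bin => join(' ', map { sprintf '%08b', $_ } @bytes),
        low7_bin => join(' ', map { sprintf '%08b', $_ } @low7),
    };
}

my $row = codepoint_row('0x20ac');

# Write the character itself as UTF-8 next to the ASCII-only columns.
binmode STDOUT, ':encoding(UTF-8)';
print join("\t", @{$row}{qw(char utf8_hex low7_hex utf8_bin low7_bin)}), "\n";
```

Masking with `& 0x7F` rather than subtracting 0x80 leaves any byte below 0x80 unchanged, which matters for ASCII-range codepoints whose single UTF-8 byte already has the high bit clear. For the '0xE2 0x82 0xAC' style, the format '0x%02X' could be used in place of '%02X'.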
Re: Need Help With Seemingly Bizarre Unicode Task
by graff (Chancellor) on Dec 30, 2007 at 06:21 UTC
Re: Need Help With Seemingly Bizarre Unicode Task
by ikegami (Patriarch) on Dec 30, 2007 at 07:23 UTC
by Jim (Curate) on Dec 31, 2007 at 23:33 UTC