in reply to Unicode Pack/Unpack Woes

A piece of code to demonstrate:
  1. the right way to use pack and unpack with Unicode
  2. how to form a Unicode string with \x{} (I wonder whether you are dealing with XML, since you used the &#; syntax. Just in case: you are not expecting Perl itself to understand the &#; syntax, are you?)
use strict;

# Print a string twice: once with character semantics, once with byte semantics.
sub display {
    my $string = shift;

    # As the output shows, whether utf8 or bytes is in effect is irrelevant
    # in this demo, because "U*" forces Unicode anyway.
    use utf8;
    print "\nchar semantics: ";
    print "$string ";
    printf "Length = %d, ", length($string);
    printf "Content = %vd\n", $string;

    use bytes;    # lexically scoped: affects only the statements below
    print "byte semantics: ";
    print "$string ";
    printf "Length = %d, ", length($string);
    printf "Content = %vd\n", $string;
}

my $encoded_string;
my @decoded_list;

{
    use bytes;
    print "=========================\n";
    print "Case 1: create string from pack, with use bytes\n";
    $encoded_string = pack("U*", 400, 306);
    display $encoded_string;
    @decoded_list = unpack("U*", $encoded_string);
    print join(".", @decoded_list), "\n";
}

{
    use utf8;    # not necessary in this case
    print "=========================\n";
    print "Case 2: create string from pack, with use utf8\n";
    $encoded_string = pack("U*", 400, 306);
    display $encoded_string;
    @decoded_list = unpack("U*", $encoded_string);
    print join(".", @decoded_list), "\n";
}

{
    print "=========================\n";
    print "Case 3: create string from \\x{}\n";
    $encoded_string = "\x{190}\x{132}";    # hex values of 400 and 306
    display $encoded_string;
    @decoded_list = unpack("U*", $encoded_string);
    print join(".", @decoded_list), "\n";
}
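
Since Perl itself does not interpret the &#; syntax, converting between characters and &#nnn; references has to be done by hand. Below is a minimal sketch of one way to do it, not necessarily how any particular site does it. The helper names to_numeric_entities and from_numeric_entities are made up for this example, and the sketch assumes the string already holds characters (as in Case 3). Raw UTF-8 bytes read from a file or form would need to be decoded first, for example with decode('UTF-8', $bytes) from the core Encode module.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every character above U+007F with a
# decimal numeric character reference such as &#400;.
sub to_numeric_entities {
    my ($text) = @_;
    $text =~ s/([^\x00-\x7F])/'&#' . ord($1) . ';'/ge;
    return $text;
}

# Hypothetical helper: turn decimal references back into characters.
sub from_numeric_entities {
    my ($text) = @_;
    $text =~ s/&#(\d+);/chr($1)/ge;
    return $text;
}

my $string  = "\x{190}\x{132}";    # same characters as in Case 3
my $escaped = to_numeric_entities($string);
print "$escaped\n";                # prints &#400;&#306;

# Round trip: the code points come back unchanged.
print join(".", unpack("U*", from_numeric_entities($escaped))), "\n";    # prints 400.306
```

Plain ASCII text passes through to_numeric_entities untouched, so the result can be printed without any wide-character output layer.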

Re: Re: (Decimal Char Values) Unicode Pack/Unpack Woes
by The Ninja K (Novice) on Jan 12, 2003 at 08:51 UTC
    A quick response to part of your reply: thanks for the demo code.
    The &#####; comes from perlmonks.org substituting an HTML presentation for the Japanese text I had thrown in as an example :)
      What I want to know is how PerlMonks did exactly that, i.e. converted UTF-8 to the &#nnn; format (in case the site mangles that: it's ampersand, pound sign, digits, semicolon), since that's exactly what I need to do!

      Sad thing is, I know I did it once! And if you can't help me directly, perhaps you can remember how I did it last time? ;) Thanks.