String to UTF-8

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Re: String to UTF-8 by Juerd (Abbot) on Sep 25, 2004 at 13:57 UTC
Use unpack with a template of `U*` (a bare `U` would return only the first character), or split $str and use map with ord: `local $, = ' '; local $\ = "\n"; print unpack 'U*', $str; print map ord, split //, $str;` Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }
Re^2: String to UTF-8 by Baz (Friar) on Sep 25, 2004 at 14:03 UTC
Unpack the Unicode characters and format each one as \x{...}: `PrintUnicodeString($str); sub PrintUnicodeString { print join("", map { sprintf("\\x{%04X}", $_) } unpack("U*", $_[0])); }`
Re: String to UTF-8 by ambrus (Abbot) on Sep 25, 2004 at 13:55 UTC
`printf "%*vd\n", " ", $str;` Update: you might want to choose a better title for your question, as the current one (String to UTF-8) has nothing to do with the question.
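For readers unfamiliar with the `v` flag used above, here is a minimal sketch of what it does. The sample string is an assumption chosen for illustration, not the original poster's data. The `v` flag makes sprintf/printf treat the string as a vector of character ordinals, and the `*` before it takes the join string from the argument list instead of the default ".".

```perl
use strict;
use warnings;

# Hypothetical sample string: "a", e-acute (U+00E9), euro sign (U+20AC).
my $str = "a\x{E9}\x{20AC}";

# "%vd" formats the ordinal of every character, joined with "." by default.
my $dotted = sprintf "%vd", $str;

# "%*vd" reads the join string from the argument list (a space here).
my $spaced = sprintf "%*vd", " ", $str;

print "$dotted\n";   # 97.233.8364
print "$spaced\n";   # 97 233 8364
```

Because the result is plain ASCII digits, printing it does not need an output encoding layer.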
Re: String to UTF-8 by Anonymous Monk on Sep 25, 2004 at 13:43 UTC
perluniintro - Perl Unicode introduction
Re: String to UTF-8 by bart (Canon) on Sep 26, 2004 at 08:16 UTC
Because at least one of your character codes is above 255, your string will internally be encoded in UTF-8. However, Perl's pure string manipulation routines, such as split, chr, ord, and length, work transparently whether a string is in UTF-8 or in a single-byte encoding. So you can achieve what you want by using rather classic code, meaning it doesn't look like anything special: `for my $i (0 .. length($str)-1) { print " " if $i; print ord substr $str, $i, 1; } print "\n";` or, like Juerd mentioned: `print join " ", map ord, split //, $str; print "\n";` If, OTOH, you choose to use pack/unpack with a byte-oriented template such as `C*`, it'll work on the raw bytes, so it will make a difference whether the string is in a single-byte encoding (each character is a byte) or in UTF-8. `print join " ", unpack "C*", $str; print "\n";`
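To make the character/byte distinction concrete, here is a small sketch that prints both views of the same string. The sample string is an assumption for illustration. Note that this swaps in a different technique from the one above: how `unpack "C*"` treats a UTF-8 string has differed between Perl releases, so the sketch first encodes the string explicitly with the core Encode module and only then unpacks the resulting octets.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical sample string: "a", e-acute (U+00E9), euro sign (U+20AC).
my $str = "a\x{E9}\x{20AC}";

# Character view: one ordinal per character, independent of internal storage.
my @code_points = map { ord } split //, $str;

# Byte view: encode to UTF-8 explicitly, then read the octets one by one.
my $bytes  = encode('UTF-8', $str);
my @octets = unpack 'C*', $bytes;

print "code points:  @code_points\n";   # 97 233 8364
print "UTF-8 octets: @octets\n";        # 97 195 169 226 130 172
```

Three characters come out as six octets here, because U+00E9 takes two bytes in UTF-8 and U+20AC takes three.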