mitbeaver has asked for the wisdom of the Perl Monks concerning the following question:

Greetings Monks, the following is confusing me greatly in a project where I am writing UTF-8 encoded files (Japanese).

I've narrowed the problem down to this:
print unpack( "H*", "\x{2020}");
I would expect this to print 2020.

Is there some issue with wide hex characters and unpack?

Thoughts?

Re: Unpacking Wide Hex Characters
by ikegami (Patriarch) on Mar 19, 2009 at 04:36 UTC

    unpack primarily deals with bytes, and H is no exception. "\x{2020}" is a single character whose code point (0x2020) is too large to fit in one byte, so asking for the hex representation of it as a byte makes no sense.

    ord gets the Unicode code point, and sprintf can convert it to hex.

    sprintf('%x', ord("\x{2020}"))
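
    Since the question mentions writing UTF-8 files, it may help to see the code point next to the bytes that actually end up in the file. The following is a minimal sketch, not part of the original reply: the variable names are made up for illustration, and it assumes the Encode module (core since Perl 5.8) is available. Encoding the character to UTF-8 first gives unpack real bytes to work on.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $char = "\x{2020}";    # one character: U+2020 DAGGER

# Hex of the code point, using ord and sprintf as suggested above.
my $codepoint_hex = sprintf('%x', ord($char));

# Hex of the UTF-8 byte sequence: encode to bytes first,
# then unpack 'H*' has bytes to convert.
my $utf8_hex = unpack('H*', encode('UTF-8', $char));

print "$codepoint_hex\n";    # 2020
print "$utf8_hex\n";         # e280a0
```

    The two values differ because 2020 is the number assigned to the character, while e280a0 is the three-byte UTF-8 encoding of that number.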

    You can do some nifty stuff with Encode, too.

    encode('US-ASCII', "ab\x{2020}cd", Encode::FB_PERLQQ) # ab\x{2020}cd

    See the Encode documentation for the other values accepted as the third parameter (the CHECK argument).
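
    As an illustration of that third parameter, here is a small sketch comparing three of the fallback constants that Encode documents for characters the target encoding cannot represent. The variable names are hypothetical; only encode and the FB_* constants come from Encode itself.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $text = "ab\x{2020}cd";    # U+2020 has no US-ASCII representation

# Each fallback replaces the unencodable character with a different escape.
my $perlqq  = encode('US-ASCII', $text, Encode::FB_PERLQQ);      # Perl-style escape
my $htmlref = encode('US-ASCII', $text, Encode::FB_HTMLCREF);    # decimal character reference
my $xmlref  = encode('US-ASCII', $text, Encode::FB_XMLCREF);     # hexadecimal character reference

print "$perlqq\n";     # ab\x{2020}cd
print "$htmlref\n";    # ab&#8224;cd
print "$xmlref\n";     # ab&#x2020;cd
```

    All three results are plain ASCII byte strings, so they are safe to write to a file handle that has no encoding layer.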