rasher has asked for the wisdom of the Perl Monks concerning the following question:

Basically, I don't understand in which way the encoding pragma modifies pack's behavior. In short, why do the following two commands not output the same value:

$ perl -Mencoding=utf8 -e 'print(pack('n', 204))'
$ perl -e 'print(pack('n', 204))'
The former outputs 0x00 0xC3 0x8C, the other outputs 0x00 0xCC.

I would have expected the behavior of n to be unchanged, and the value 204 to be unchanged as well, regardless of encoding. However, one (or both) is modified by the encoding, and I don't understand how.

For bonus points: How can I use encoding 'utf8', and still get 0x00 0xCC?

Re: use encoding affects pack()
by betterworld (Curate) on Aug 10, 2008 at 00:09 UTC

    I don't think the behaviour of pack (here) is affected by the encoding pragma. "n" means an unsigned short in network order, which is unrelated to strings (except that pack returns a string).

    However, the encoding pragma does affect how the returned string is printed. From perldoc encoding we know:

    The encoding pragma also modifies the filehandle layers of STDIN and STDOUT to the specified encoding.

    So, if you want to stick to the utf8 encoding pragma but don't want to output this string in utf8:

    perl -Mencoding=utf8 -e 'binmode STDOUT, ":bytes" or die $!; print(pack("n", 204))'

    or

    perl -e 'use encoding "utf8", STDOUT => undef; print(pack("n", 204))'
      That did it! Thanks a lot. It's obvious, now that I think about it, but that's usually the case, isn't it?

      My full problem got slightly more complicated by the fact that I also had to output some actual UTF-8 at the same time, but using encode("utf8", $string) gave me what I needed.

Re: use encoding affects pack()
by ikegami (Patriarch) on Aug 10, 2008 at 00:27 UTC

    encoding's stated purpose is to handle scripts written in encodings other than iso-latin-1, but as explained already, it affects other things as well.

    Because of other problems with the module (and possibly this one too), Perl's authors consider encoding buggy.

    Script's encoding:

    • If you wish to execute scripts encoded as UTF-8, use utf8. (e.g. use utf8;)
    • Your only other option is really iso-latin-1, the default.

    Output encoding:

    • If you wish to output characters appropriate to your locale, use open. (e.g. use open qw( :std :locale );)
    • If you wish to output binary data, use open. (e.g. use open qw( :std :bytes );)

    Sounds like you need use open qw( :std :bytes );.
    If your script is UTF-8 encoded, you'll need use utf8; as well.

Re: use encoding affects pack()
by tinita (Parson) on Aug 10, 2008 at 17:31 UTC
    Try switching to use utf8.
    encoding.pm can influence other modules and break their code, for example Image::Size, which is loaded by AutoLoader (see use encoding 'utf8' and AutoLoader for an example).