in reply to Re: simple file question, extra spaces, win32

In the perl 5.6 series, you had to be more explicit about your use of Unicode. In the 5.8 series, perl did a better job of detecting Unicode automagically, so a use utf8; should only be necessary in very specific circumstances.

----
I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
-- Schemer

: () { :|:& };:

Note: All code is untested, unless otherwise stated

Re: Re: Re: simple file question, extra spaces, win32
by bart (Canon) on Dec 29, 2003 at 22:08 UTC
    Except this kind of file is not in UTF-8... Instead it's just two bytes per character, most likely in Little Endian form. I think the official name of this encoding is UCS-2.

    At worst, this can be converted to UTF-8 by doing:

    $utf8 = pack 'U*', unpack 'v*', $unicode;
    Not fast, but it'll do the trick. If all you want is ISO-Latin-1, try
    $latin1 = pack 'C*', unpack 'v*', $unicode;
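
For comparison, here is a minimal sketch of the same conversions using the core Encode module, assuming the data really is little-endian with two bytes per character. The sample string and the variable names other than $unicode are made up for illustration, and a real file may also start with a byte order mark that you would want to strip.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical sample: "été" stored as two bytes per character, little-endian.
my $unicode = "\xE9\0t\0\xE9\0";

# The pack/unpack approach from the post: 'v*' reads unsigned 16-bit
# little-endian values, 'U*' turns them into Perl characters.
my $chars_packed = pack 'U*', unpack 'v*', $unicode;

# The same step via Encode; UTF-16LE covers UCS-2 little-endian text.
my $chars = decode('UTF-16LE', $unicode);

# Encode explicitly when you need bytes for output.
my $utf8_bytes   = encode('UTF-8', $chars);        # 5 bytes
my $latin1_bytes = encode('ISO-8859-1', $chars);   # 3 bytes

printf "%d chars, %d UTF-8 bytes, %d Latin-1 bytes\n",
    length $chars, length $utf8_bytes, length $latin1_bytes;
```

Both routes should yield the same three-character string here; Encode is probably the better fit if the input might contain characters outside ISO-Latin-1.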

    p.s. Those aren't spaces; most of those extra bytes will be chr(0).
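
One way to check this on your own data is to dump the bytes in hex and count the NULs. This is only a sketch with a made-up sample string standing in for the file contents.

```perl
use strict;
use warnings;

# Hypothetical sample: "Hi" as two bytes per character, little-endian.
my $unicode = "H\0i\0";

# Show each byte as two hex digits; the "extra" bytes show up as 00, not 20.
my $hex = join ' ', unpack '(H2)*', $unicode;

# tr/// with an empty replacement list just counts matching bytes.
my $nul_count    = ($unicode =~ tr/\0//);
my $space_count  = ($unicode =~ tr/ //);

print "$hex\n";
print "NUL bytes: $nul_count, spaces: $space_count\n";
```

A real space would show up as 20 in the dump, whereas the padding bytes in this kind of file show up as 00.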