in reply to Re: Re: simple file question, extra spaces, win32
in thread simple file question, extra spaces, win32

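Except this kind of file is not in UTF-8. Instead it's just two bytes per character, most likely in little-endian form. I think the official name of this encoding is UCS-2 (UCS-2LE when the low byte comes first); Perl's Encode module also accepts it under the name UTF-16LE.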

At worst, this can be converted to UTF-8 by doing:

$utf8 = pack 'U*', unpack 'v*', $unicode;

Not fast, but it'll do the trick. If all you want is ISO-Latin-1, try:

$latin1 = pack 'C*', unpack 'v*', $unicode;
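
For comparison, here is a minimal sketch of the same conversions done with the core Encode module instead of pack/unpack. The sample string is made up for illustration, and it assumes the data really is little-endian with no byte-order mark at the front; a real file written on Windows may start with one, which would have to be stripped first.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical sample data: "Hi!" stored as two bytes per character,
# low byte first, so every second byte is chr(0).
my $unicode = "H\0i\0!\0";

# decode() turns the raw bytes into a Perl character string.
my $text = decode('UTF-16LE', $unicode);

# encode() then produces whichever byte encoding is wanted.
my $utf8   = encode('UTF-8',      $text);
my $latin1 = encode('ISO-8859-1', $text);

print "$latin1\n";
```

Going through a decoded character string first means the same $text can be written out in either encoding, which is the main reason to prefer this over two separate pack/unpack calls.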

P.S. Those aren't spaces; most of those extra bytes will be chr(0).