That's a nice link. Could you please tell me: if I save a file in Notepad with the encoding "Unicode", which encoding is that exactly? I ask because there is also an encoding called "UTF-8", and there is a big difference between the two. The files that I would like to open and convert back to ANSI are all saved as "Unicode". I hope you can help.

Re^3: Unicode2ascii
by shmem (Chancellor) on Nov 28, 2006 at 14:33 UTC
    jbert has provided a good link. ;-) From a quick glance, I guess Notepad's "Unicode" means UTF-16LE.

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
      It's UCS-2LE, the fixed-width variant of UTF-16LE.

      use strict;
      use warnings;

      my $file_in  = '...';
      my $file_out = '...';

      open(my $fh_in, '<:raw:encoding(UCS-2LE)', $file_in)
          or die("Unable to open \"$file_in\": $!\n");
      open(my $fh_out, '>:raw:encoding(UCS-2LE)', $file_out)
          or die("Unable to create file \"$file_out\": $!\n");

      while (<$fh_in>) {
          ...
          print $fh_out $_;
      }

      Update: Oops, I had originally confirmed that it was UTF-16LE.
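
For the conversion back to ANSI that the original question asks about, a minimal sketch might look like the following. It rests on assumptions that are not stated in the thread: that "ANSI" means Windows-1252 (cp1252), and that the input is little-endian UTF-16, possibly starting with a byte order mark. The helper name `unicode_to_ansi` is made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: convert a Notepad "Unicode" file to an "ANSI" file.
# Assumption: "ANSI" means Windows-1252 (cp1252); adjust for other locales.
sub unicode_to_ansi {
    my ($file_in, $file_out) = @_;

    # Read the whole file as raw bytes (no layer-level newline translation).
    open(my $fh_in, '<:raw', $file_in)
        or die("Unable to open \"$file_in\": $!\n");
    my $bytes = do { local $/; <$fh_in> };
    close($fh_in);

    # Decode as little-endian UTF-16 and drop a leading BOM (U+FEFF), if any.
    my $text = decode('UTF-16LE', $bytes);
    $text =~ s/\A\x{FEFF}//;

    # Write the text back out as cp1252 bytes. With Encode's default
    # settings, characters that have no cp1252 equivalent are replaced
    # by a substitution character rather than causing an error.
    open(my $fh_out, '>:raw', $file_out)
        or die("Unable to create file \"$file_out\": $!\n");
    print {$fh_out} encode('cp1252', $text);
    close($fh_out)
        or die("Unable to write file \"$file_out\": $!\n");

    return;
}
```

Calling `unicode_to_ansi($file_in, $file_out)` slurps the whole file, which keeps the sketch short but is only reasonable for files that fit comfortably in memory. Line endings are passed through unchanged because both handles are opened with `:raw`.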

        Glad you corrected this - I'm not that proficient on Windows ;-)

        I wonder about the leading byte sequence 0xff 0xfe in text files saved by Notepad - is that some marker indicating the encoding type?

        --shmem
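
On the 0xff 0xfe question: those two bytes are what the byte order mark (U+FEFF) looks like when it is encoded as little-endian UTF-16, so they do act as an encoding marker. A rough sketch of checking for it, with a made-up helper name `sniff_bom` and covering only the three most common marks, could be:

```perl
use strict;
use warnings;

# Hypothetical helper: guess an encoding from a file's leading bytes.
# Returns undef when no known byte order mark is found.
# Caveat: a UTF-32LE BOM (FF FE 00 00) also starts with FF FE and is
# reported as UTF-16LE by this simple check.
sub sniff_bom {
    my ($file) = @_;

    open(my $fh, '<:raw', $file)
        or die("Unable to open \"$file\": $!\n");
    my $count = read($fh, my $head, 3);
    defined($count)
        or die("Unable to read \"$file\": $!\n");
    close($fh);

    return 'UTF-8'    if $head =~ /\A\xEF\xBB\xBF/;
    return 'UTF-16LE' if $head =~ /\A\xFF\xFE/;
    return 'UTF-16BE' if $head =~ /\A\xFE\xFF/;
    return;
}
```

A file without a mark may still be in any of these encodings, so a missing BOM says nothing definite.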
