in reply to CR-LF on UTF-16LE files on Windows

A :crlf layer is automatically added on Windows. :crlf converts 0D 0A into 0A on read, and it converts 0A into 0D 0A on write.

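:crlf is unfortunately added before the explicitly-specified layers, so it performs the conversion on the encoded bytes when it should be performed on the decoded text. For ASCII-based encodings (e.g. UTF-8), this isn't a problem. But for UTF-16LE, it does the wrong thing: a line feed is encoded as 0A 00, so on write the layer inserts a lone 0D byte (giving 0D 0A 00) and misaligns every code unit after it, and on read it never sees 0D 0A in the properly encoded sequence 0D 00 0A 00, so the CR is left in place.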
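
To make the byte-level effect concrete, here is a minimal sketch that emulates both orderings with plain substitutions instead of real PerlIO layers, so it behaves the same on any platform. The variable names and the hex_bytes helper are illustrative only; they are not from the original question.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Illustrative helper: render a byte string as space-separated hex.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', $_ } unpack 'C*', $bytes;
}

# Decoded text as the program sees it: one character and a line feed.
my $text = "A\n";

# Wrong order: encode first, then do the LF -> CR LF conversion on the bytes.
# This emulates :crlf sitting below :encoding(UTF-16LE) on write.
my $wrong = encode('UTF-16LE', $text);
$wrong =~ s/\x0A/\x0D\x0A/g;

# Right order: convert LF -> CR LF on the text, then encode.
# This emulates :crlf sitting above :encoding(UTF-16LE) on write.
(my $crlf_text = $text) =~ s/\n/\r\n/g;
my $right = encode('UTF-16LE', $crlf_text);

print 'wrong: ', hex_bytes($wrong), "\n";
print 'right: ', hex_bytes($right), "\n";
```

The wrong ordering produces an odd number of bytes, which cannot be valid UTF-16LE, while the right ordering encodes CR and LF as two complete 16-bit code units.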

You can address the problem by using the following:

open my $IN, "<:raw:encoding(UTF-16LE):crlf", "in.xml";
open my $OUT, ">:raw:encoding(UTF-16LE):crlf", "out.xml";

:raw prevents the :crlf layer from being added in the first place, and then we add it explicitly on the right side* of the :encoding layer.
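
A quick way to check this layer order is a round trip through a scratch file. The following is only a sketch: the temporary file and the sample text are assumptions made for illustration (the original post uses in.xml and out.xml), and it should run on Windows and non-Windows systems alike because the :crlf layer is requested explicitly.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file, removed automatically when the program exits.
my (undef, $path) = tempfile(UNLINK => 1);

# Write: :crlf sees the decoded text first, then :encoding produces the bytes.
open my $OUT, '>:raw:encoding(UTF-16LE):crlf', $path
    or die "Can't open $path for writing: $!";
print {$OUT} "one\ntwo\n";
close $OUT or die "Can't close $path: $!";

# Look at what actually landed on disk, with no layers in the way.
open my $RAW, '<:raw', $path or die "Can't open $path: $!";
my $bytes = do { local $/; <$RAW> };
close $RAW;

# Read back through the same layer order: decode first, then CR LF -> LF.
open my $IN, '<:raw:encoding(UTF-16LE):crlf', $path
    or die "Can't open $path for reading: $!";
my @lines = <$IN>;
close $IN;

printf "%d bytes on disk, %d lines read back\n", length($bytes), scalar @lines;
```

If the layers are ordered correctly, each line ending on disk should be the four bytes 0D 00 0A 00, and the lines read back should end in a plain "\n".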

(Note that I also replaced the needless use of global variables with the use of lexically-scoped variables.)


* — Pun intended.

Re^2: CR-LF on UTF-16LE files on Windows
by vitoco (Hermit) on Nov 07, 2018 at 18:43 UTC

    Ok, that worked as expected. I need to read more about layers to fully understand this.

    Thanks!

      :crlf converts 0D 0A into 0A on read, and it converts 0A into 0D 0A on write. This was being done to the encoded strings when it should have been done to the decoded strings.

      (My earlier post has been edited to integrate this.)

        Would binmode() work? I don't have any files like that at my disposal to test.