desemondo has asked for the wisdom of the Perl Monks concerning the following question:
I am having trouble getting text (CRLF specifically) to encode correctly into UTF-16 little endian. The actual and the expected output are shown below:
    ~~~ Human readable output of what is being generated ~~~~~~~~~~~~
    Line1
    Line2

    Line4

    ~~~~~ Actual Results ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    4C 00 69 00 6E 00 65 00 31 00 0D 0A 00
    4C 00 69 00 6E 00 65 00 32 00 0D 0A 00 0D 0A 00
    4C 00 69 00 6E 00 65 00 34 00 0D 0A 00

    ~~ What was expected and is required for valid UTF-16LE encoding ~~~
    4C 00 69 00 6E 00 65 00 31 00 0D 00 0A 00
                                     ^ byte missing from actual results
    4C 00 69 00 6E 00 65 00 32 00 0D 00 0A 00 0D 00 0A 00
                                     ^ byte missing from actual results
                                                 ^ byte missing from actual results
    4C 00 69 00 6E 00 65 00 34 00 0D 00 0A 00
                                     ^ byte missing from actual results
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
I suspect this issue (or bug in Encode.pm?) may be due to \n being mapped to CRLF on Windows, whereas on *nix it is just LF, and Encode.pm and its dependencies aren't handling that correctly.
I have tried numerous things, e.g. using BE, UCS-2LE/BE, and using \015\012 instead of \n. All seem to have the same issue.
    use strict;
    use warnings;
    use Encode qw(encode decode);

    ### Actual Results
    my $string = "Line1\nLine2\n\nLine4\n";
    open (my $output_fh, ">:encoding(utf-16le)", 'Test_reg.reg')
        || die "Unable to create reg output file. $!";
    print {$output_fh} $string ;

    ### something else I tried, also doesn't work correctly.
    my $string2 = "Line1\015\012Line2\015\012\015\012Line4\015\012";
    open (my $output_fh2, ">:encoding(utf-16le)", 'Test_reg2.reg')
        || die "Unable to create reg output file. $!";
    print {$output_fh2} $string2 ;
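One likely explanation, consistent with the byte dump above but not confirmed anywhere in this thread, is that on Windows the default `:crlf` layer sits below `:encoding(utf-16le)`, so the LF-to-CRLF translation runs on the already-encoded bytes and inserts a lone `0D` in front of each `0A 00`. Under that assumption, a minimal sketch of a workaround is to do the newline conversion on the character string, encode explicitly, and write the bytes through `:raw`. The helper name `to_utf16le_crlf` is hypothetical; the file name is taken from the question.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: convert line endings on the *character* string
# first, then encode, so CR and LF each become a full 16-bit code unit.
sub to_utf16le_crlf {
    my ($text) = @_;
    (my $crlf = $text) =~ s/\r?\n/\r\n/g;   # normalise LF or CRLF to CRLF
    return encode('UTF-16LE', $crlf);       # returns a byte string
}

my $bytes = to_utf16le_crlf("Line1\nLine2\n\nLine4\n");

# Write the already-encoded bytes with :raw so that no :crlf layer
# rewrites them afterwards.
open(my $out, '>:raw', 'Test_reg.reg')
    or die "Unable to create reg output file: $!";
print {$out} $bytes;
close($out) or die "Unable to close reg output file: $!";
```

A layer-only variant that is often suggested is `'>:raw:encoding(UTF-16LE):crlf'`, which places `:crlf` above the encoding layer. It has not been checked here against ActivePerl 5.8.8, so treat it as something to try rather than a confirmed fix.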
Re: CRLF not encoding into UTF-16LE correctly on ActivePerl 5.8.8
by ikegami (Patriarch) on Feb 15, 2010 at 06:41 UTC

Re: CRLF not encoding into UTF-16LE correctly on ActivePerl 5.8.8
by 7stud (Deacon) on Feb 15, 2010 at 05:32 UTC

Re: CRLF not encoding into UTF-16LE correctly on ActivePerl 5.8.8
by Anonymous Monk on Feb 15, 2010 at 03:02 UTC
by desemondo (Hermit) on Feb 15, 2010 at 03:29 UTC