in reply to Re^2: Read and write UTF-8
in thread Read and write UTF-8
Note that the hexdump of your input data shows a Byte Order Mark (BOM) at the front, which Perl counts as at least one character, so it uses up part of the 20 characters that substr returns for the first line.
Discounting the BOM, I get the expected output with the following program:
    #!/usr/bin/perl -w
    use strict;
    use Encode qw/encode decode/;

    # Decode on input and encode on output, so Perl works with characters, not bytes
    open (INFILE, "<:encoding(UTF-8)", "utf8.txt") || die "blah blah blah";
    open (OUTFILE, ">:encoding(UTF-8)", "oututf8.txt") || die "blah blah";
    binmode STDOUT, ':encoding(UTF-8)';

    print "Ruler : [12345678901234567890]\n";
    while (my $line = <INFILE>) {
        chomp ($line);
        print "Input : [$line]\n";
        # First 20 characters (not bytes) of the line
        my $linestart = substr($line,0,20);
        my $outline = $linestart;
        print "20 : [$outline]\n";
        print "---\n";
        print OUTFILE "$outline\n";
    }
    close (INFILE);
    close (OUTFILE);
To remove the BOM at the start of your file, you could simply use something like this on the first line you read:
$line =~ s!^\N{BYTE ORDER MARK}!!;
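As a minimal sketch of what that substitution does to the character count, the snippet below works on an in-memory string instead of a file. The helper name `strip_bom` and the sample text are made up for illustration, and the BOM is written as `\x{FEFF}`, which is the same code point that `\N{BYTE ORDER MARK}` names.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): removes a single
# leading BOM (U+FEFF) from an already-decoded string, returns the rest.
sub strip_bom {
    my ($text) = @_;
    $text =~ s/^\x{FEFF}//;    # only matches at the very start of the string
    return $text;
}

# Stand-in for the first line of a decoded UTF-8 file that starts with a BOM.
my $first_line = "\x{FEFF}Hello, world";

# Only lengths are printed, so no output encoding layer is needed here.
printf "with BOM:    %d characters\n", length $first_line;
printf "without BOM: %d characters\n", length strip_bom($first_line);
```

In the read loop above, the substitution would only need to run for the first line of the file (for example when `$.` is 1), since a BOM is only expected at the very start of the data.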