Hi wizards, I seek your wisdom.
I am running Active Perl 5.14 on Windows 7.
I am trying to write a program that will read-in a conversion table, then work on a file and replace certain patterns by other patterns - all of the above in Unicode (UTF-8).
Here is the beginning of the program:
#!/usr/local/bin/perl
# Load a conversion table from CONVTABLE to %ConvTable.
# Then find matches in a file and convert them.
use strict;
use warnings;
use Encode;
use 5.014;
use utf8;
use autodie;
use warnings qw< FATAL utf8 >;
use open qw< :std :utf8 >;
use charnames qw< :full >;
use feature qw< unicode_strings >;
my ($i,$j,$InputFile, $OutputFile,$word,$from,$to,$linetoprint);
my (@line, @lineout);
my %ConvTable; # Conversion hash
print 'Conversion table: opening file: E:\My Documents\Perl\Conversio
+n table.txt'."\n";
my $sta= open (CONVTABLE, "<:encoding(utf8)", 'E:\My Documents\Perl\C
+onversion table.txt');
binmode STDOUT, ':utf8'; # output should be in UTF-8
# Load conversion hash
while (<CONVTABLE>) {
chomp;
print "$_\n"; # etc ...
# etc ...
It turns out that at this point, it says:
wide character in print at (eval 155)E:/Active Perl/lib/Perl5DB.pl:640
+]line 2, <CONVTABLE> line 1, etc...
Why is that? I think I've gone through all the necessary prescriptions for correct handling of Unicode, decoding and encoding into UTF-8?
And how to fix it?
TIA
Helen
Note: I may cross-post on StackOverflow
Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
Read Where should I post X? if you're not absolutely sure you're posting in the right place.
Please read these before you post! —
Posts may use any of the Perl Monks Approved HTML tags:
- a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
| |
For: |
|
Use: |
| & | | & |
| < | | < |
| > | | > |
| [ | | [ |
| ] | | ] |
Link using PerlMonks shortcuts! What shortcuts can I use for linking?
See Writeup Formatting Tips and other pages linked from there for more info.