G'day tel2,
"I'm guessing I might need to "use utf8", ..."
Sorry, but that would be a bad guess.
The documentation for the utf8 pragma states, in emboldened text:
"Do not use this pragma for anything else than telling Perl that your script is written in UTF-8."
Your basic problem here is that the filehandle, FILE, doesn't know that its output should be encoded as UTF-8.
Example of what's happening:
$ perl -Mutf8 -wE 'say "e-acute: é; u-acute: ú"'
e-acute: ?; u-acute: ?
Here are three ways to address this problem:
-
Use the binmode function, e.g.
$ perl -Mutf8 -wE 'binmode STDOUT => ":utf8"; say "e-acute: é; u-acute: ú"'
e-acute: é; u-acute: ú
-
Use the open pragma, e.g.
$ perl -Mutf8 -wE 'use open OUT => qw{:utf8 :std}; say "e-acute: é; u-acute: ú"'
e-acute: é; u-acute: ú
-
Use the 3-argument form of the open function
and specify the encoding in the mode. Something like this:
open my $fh, '>:encoding(UTF-8)', $filename
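To make the third option a bit more concrete, here's a minimal sketch that writes a string through an encoding layer and reads it back. The temporary file (via File::Temp) and the variable names are my own assumptions for illustration; substitute whatever your script actually writes to.

```perl
use strict;
use warnings;
use utf8;                      # only says that this source file is UTF-8
use File::Temp qw(tempfile);

# Hypothetical scratch file standing in for your real output file.
my (undef, $filename) = tempfile(UNLINK => 1);

my $text = "e-acute: é; u-acute: ú";

# Characters are encoded to UTF-8 bytes on the way out.
open my $out_fh, '>:encoding(UTF-8)', $filename
    or die "Can't open '$filename' for writing: $!";
print {$out_fh} $text, "\n";
close $out_fh or die "Can't close '$filename': $!";

# UTF-8 bytes are decoded back to characters on the way in.
open my $in_fh, '<:encoding(UTF-8)', $filename
    or die "Can't open '$filename' for reading: $!";
chomp(my $round_trip = <$in_fh>);
close $in_fh;

# -s reports bytes on disk; length() counts characters.
my $bytes = -s $filename;
my $chars = length $text;
```

The two accented characters take two bytes each on disk, so $bytes comes out larger than $chars even after allowing for the newline. That difference is the encoding layer doing its job.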
Here are some recommendations for your code.
These are unrelated to the UTF-8 issue.
-
Let Perl tell you about problems.
Start using the strict pragma
and the warnings pragma.
-
Your code is littered with package variables: $cgi, an object reference; $f1, a string; FILE, a filehandle; and so on.
These are all global and suffer from the same problems as all global variables.
Start using lexical variables, and control their scope, for far less error-prone code.
There's a lot of information about this in perlsub;
the "Private Variables via my()"
section would be a good place to start.
-
Don't use indirect object syntax, e.g. code like new CGI.
Here's what perlobj: Invoking Class Methods says,
in emboldened text, at the start of the Indirect Object Syntax section:
"Outside of the file handle case, use of this syntax is discouraged as it can confuse the Perl interpreter. See below for more details."
-
Start using lexical filehandles with the 3-argument form of the open function. See that document for more about this.
-
Hand-crafting I/O die messages is time-consuming and error-prone.
Let Perl do this task for you with the autodie pragma.
You can then write code like this:
use autodie;
...
open my $in_fh, '<', $infile;
open my $out_fh, '>', $outfile;
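Pulling those recommendations together, a rough sketch might look like the following. Note that Greeter is a hypothetical class I've made up to stand in for CGI (or any other module), and the scratch file comes from File::Temp; neither is from your code.

```perl
use strict;
use warnings;
use autodie;                   # open, close, etc. die with a useful message
use File::Temp qw(tempfile);

# Hypothetical class standing in for CGI or any other module.
package Greeter {
    sub new {
        my ($class, %args) = @_;
        return bless { name => $args{name} }, $class;
    }
    sub greeting {
        my ($self) = @_;
        return "Hello, $self->{name}!";
    }
}

# Direct method call, rather than the indirect "new Greeter(...)".
my $greeter = Greeter->new(name => 'tel2');

# Hypothetical scratch file; your script would use its own path.
my (undef, $outfile) = tempfile(UNLINK => 1);

{
    # $out_fh is lexical: it is closed when this block ends.
    open my $out_fh, '>', $outfile;
    print {$out_fh} $greeter->greeting, "\n";
}

open my $in_fh, '<', $outfile;
chomp(my $line = <$in_fh>);
close $in_fh;

# autodie in action: opening a missing file throws, with no "or die" needed.
my $ok    = eval { open my $fh, '<', "$outfile.missing"; 1 };
my $error = $ok ? '' : "$@";

print "$line\n";
```

Every variable here is declared with my and lives only as long as its enclosing scope, so a typo in a variable name becomes a compile-time error under strict instead of a silent new global.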