Melly has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks

This is going to be a somewhat vague question I'm afraid...

I have a script that reads some CSV and outputs it in a different format. Locally, I'm getting an error when it outputs non-ASCII UTF-8 ("Wide character in print"), but not on site, even with the same data. Both locations are using ActiveState Perl - 5.24 locally, and 5.16.2 on site.

In addition, to get a simple case, I wrote a short script that just reads the data and outputs it to a file, with no other processing. However, this doesn't throw an error either locally or on site.

use strict;
use warnings;
open(IN, '<', $ARGV[0]);
open(OUT, '>', 'temp.txt');
while(<IN>){
    print OUT;
}

So, I guess my questions are:

1. What should I look for/check in terms of the local/site difference?
2. Why does the above script not give me the same error?

Apologies for the somewhat vague question - happy to supply more details if required - oh, and in the example I've been looking at, it was 'proper' single-quotes that were giving the error.

map{$a=1-$_/10;map{$d=$a;$e=$b=$_/20-2;map{($d,$e)=(2*$d*$e+$a,$e**2 -$d**2+$b);$c=$d**2+$e**2>4?$d=8:_}1..50;print$c}0..59;print$/}0..20
Tom Melly, pm (at) cursingmaggot (stop) co (stop) uk

Re: UTF8 - Same script, different behaviour
by Corion (Patriarch) on Apr 28, 2020 at 12:21 UTC

    Your example script will never give such an error since you never tell Perl that any of your data might be anything other than Latin-1 bytes. So Perl reads and writes the data as Latin-1 bytes and sees no reason to warn.

    You can only get this warning when you use Encode::decode explicitly or when you (or a module you use) decode your data from a character set encoding to Unicode implicitly, for example by

    open my $fh, '<:encoding(UTF-8)', $filename or die "Couldn't open '$filename': $!";
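
    To see the difference for yourself, here is a minimal sketch (not taken from your script). It assumes an in-memory filehandle opened with ':raw' behaves like your plain output file, and the helper name warnings_when_printed is made up for this illustration. It prints the same typographic quote once as raw bytes and once as a decoded character, and counts the warnings each print produces:

    ```perl
    use strict;
    use warnings;
    use Encode qw(decode);

    # Hypothetical helper: print $string to an in-memory filehandle that has
    # no encoding layer and return any warnings Perl emitted while doing so.
    sub warnings_when_printed {
        my ($string) = @_;
        my @warnings;
        local $SIG{__WARN__} = sub { push @warnings, @_ };
        open my $out, '>:raw', \my $buffer
            or die "Couldn't open in-memory file: $!";
        print {$out} $string;
        close $out;
        return @warnings;
    }

    # The UTF-8 encoding of U+2019 (right single quotation mark), as raw bytes.
    my $bytes = "\xE2\x80\x99";
    # The same data decoded into a single character.
    my $chars = decode('UTF-8', $bytes);

    my @raw_warnings     = warnings_when_printed($bytes);
    my @decoded_warnings = warnings_when_printed($chars);

    print "raw bytes: ",     scalar(@raw_warnings),     " warning(s)\n";
    print "decoded chars: ", scalar(@decoded_warnings), " warning(s)\n";
    ```

    Only the decoded string should trigger "Wide character in print", which is why a script that never decodes anything stays silent.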

    Maybe Perl 5.24 warns in more situations than Perl 5.16 did when writing Unicode strings to a file (or somewhere else).

    To correct the situation Perl warns about, be explicit in your output encoding and open your output files as

    open my $fh, '>:encoding(UTF-8)', $filename or die "Couldn't open '$filename': $!";

    ... or encode your strings to bytes before writing them:

    use charnames ':full';
    use Encode 'encode';
    open my $fh, '>', $filename or die "Couldn't create '$filename': $!";
    # ':full' needs the complete Unicode name, including the LATIN prefix
    my $output = "MOT\N{LATIN CAPITAL LETTER O WITH DIAERESIS}RHEAD";
    my $output_bytes = encode( 'UTF-8', $output );
    print $fh $output_bytes;
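
    As a rough check that the two approaches are equivalent, the sketch below writes the same string both ways and compares the resulting bytes. It uses in-memory filehandles instead of real files so it leaves nothing on disk; the sample text and the variable names are only for illustration.

    ```perl
    use strict;
    use warnings;
    use Encode qw(encode);

    # Collect any warnings so we can confirm that neither approach emits one.
    my @warnings;
    $SIG{__WARN__} = sub { push @warnings, @_ };

    # Sample text containing U+00D6 and a typographic quote (U+2019).
    my $text = "MOT\N{U+00D6}RHEAD \N{U+2019}";

    # Approach 1: let an encoding layer on the handle do the conversion.
    open my $layer_fh, '>:encoding(UTF-8)', \my $via_layer
        or die "Couldn't open in-memory file: $!";
    print {$layer_fh} $text;
    close $layer_fh;

    # Approach 2: encode to bytes yourself and write to a raw handle.
    open my $raw_fh, '>:raw', \my $via_encode
        or die "Couldn't open in-memory file: $!";
    print {$raw_fh} encode('UTF-8', $text);
    close $raw_fh;

    print "same bytes: ", ($via_layer eq $via_encode ? "yes" : "no"), "\n";
    print "warnings: ", scalar(@warnings), "\n";
    ```

    Whichever you choose, do only one of them for a given handle: encoding by hand and then printing through an ':encoding(UTF-8)' layer would encode the data twice.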

      Ah - got it (I think)

      My actual script reads the CSV via Text::CSV_XS, with 'binary' on, so I'm guessing that's what alerts Perl.

      Many thanks.

      UPDATE!

      Okay, so I altered my test-script to explicitly open utf-8, and now the test-script throws an error at both locations, although my main script continues to only throw an error locally - go figure.

      In any event, opening my output file with ':encoding(UTF-8)' has fixed the issue.
