in reply to Re: CGI.pm: "Malformed UTF-8 character" in apache's error.log
in thread CGI.pm: "Malformed UTF-8 character" in apache's error.log

    #!/usr/bin/perl -CS
    use CGI;
    use CGI::Carp qw(fatalsToBrowser);

    my $cgi = new CGI;
    my $filename = $cgi->param('file');
    my $fh = $cgi->upload('file');
    binmode $fh;
    open(OUTPUT, ">/var/www/mypath/$filename") or die $!;
    while (<$fh>) {
        print OUTPUT $_;
    }
    print $cgi->header();
    print "$filename has been successfully uploaded... thank you.\n";

This does not work (same error), so switching the filehandle/STDIN to binary mode (I tried both) just before storing the file does not seem to be possible once perl has been given the -CS switch.

Update: changing binmode $fh; to binmode $fh, ":utf8"; does the trick! (Is this what you meant? Does it mean I am still on the right track, or does it hint at a problem?)

Re^3: CGI.pm: "Malformed UTF-8 character" in apache's error.log
by ikegami (Patriarch) on Feb 26, 2008 at 19:17 UTC
    • Don't use :utf8 on untrusted data. Use :encoding(UTF-8). For that matter, don't use :utf8.

    • You decode bytes to chars, but you never encode the chars back to bytes when writing them. You'll get wide character warnings for non-ASCII chars, and you could get a mix of iso-latin-1 and UTF-8 characters in more complex programs.

    • What awful inconsistency in your file handle names: $fh and OUTPUT? And using a global variable too?
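To make the first two points concrete, here is a minimal sketch using the core Encode module rather than PerlIO layers. The helper name decode_strict and the sample strings are made up for illustration; the idea is only that strict decoding rejects malformed input, and that decoded characters have to be encoded again before they are written out as bytes.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical sample data: "café" as UTF-8 bytes, and a Latin-1 byte
# sequence that is not valid UTF-8.
my $good = "caf\xC3\xA9";
my $bad  = "caf\xE9 au lait";

# Hypothetical helper: strictly decode UTF-8 bytes to characters.
# Returns undef instead of silently accepting malformed input.
sub decode_strict {
    my ($bytes) = @_;    # work on a copy; decode() may modify its source
    return eval { decode('UTF-8', $bytes, Encode::FB_CROAK) };
}

my $chars = decode_strict($good);
printf "%d bytes -> %d chars\n", length $good, length $chars;  # 5 bytes -> 4 chars

# Characters must be encoded back to bytes before being written.
my $bytes_again = encode('UTF-8', $chars);
print "round trip ok\n" if $bytes_again eq $good;

print "malformed input rejected\n" unless defined decode_strict($bad);
```

The :encoding(UTF-8) layer does the same kind of decoding and encoding for you on a file handle, which is why both handles need a layer if either one has it.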

    Fixed:

        ...
        my $fh_in = $cgi->upload('file');
        binmode $fh_in, ':encoding(UTF-8)';
        open(my $fh_out, '>:encoding(UTF-8)', "/var/www/mypath/$filename") or die $!;
        while (<$fh_in>) {
            print $fh_out $_;
        }

    But there's absolutely no reason to convert to chars in the above code, so you'd be better off as

        ...
        my $fh_in = $cgi->upload('file');
        open(my $fh_out, '>', "/var/www/mypath/$filename") or die $!;
        while (<$fh_in>) {
            print $fh_out $_;
        }

    You also mentioned binary uploads (like MP3s). For that, you'd use

        ...
        my $fh_in = $cgi->upload('file');
        open(my $fh_out, '>', "/var/www/mypath/$filename") or die $!;
        binmode $fh_in;
        binmode $fh_out;
        local $/ = \4096;  # Don't wait to find "\n"
        while (<$fh_in>) {
            print $fh_out $_;
        }

    The code for binary files also works with text files.
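The binary copy loop above can be tried outside a CGI environment with in-memory file handles. This is only a sketch: the sub name copy_binary and the test data are hypothetical, and the 4096-byte record size is the one used in the reply.

```perl
use strict;
use warnings;

# Hypothetical helper: copy everything from one handle to another in
# fixed-size records, without any decoding or newline translation.
sub copy_binary {
    my ($fh_in, $fh_out) = @_;
    binmode $fh_in;
    binmode $fh_out;
    local $/ = \4096;    # read 4096-byte records instead of lines
    my $total = 0;
    while (defined(my $chunk = <$fh_in>)) {
        print {$fh_out} $chunk or die "write failed: $!";
        $total += length $chunk;
    }
    return $total;
}

# Made-up payload containing every byte value, including "\n" and
# bytes that are not valid UTF-8.
my $data = join '', map { chr($_ % 256) } 0 .. 9999;
my $copy = '';

open(my $in,  '<', \$data) or die $!;
open(my $out, '>', \$copy) or die $!;
my $n = copy_binary($in, $out);
close $out;

print "copied $n bytes\n";
```

Because nothing is decoded, the same loop handles text uploads as well: the bytes are written exactly as they were received.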