This prints

```perl
use utf8;
sub fileread; # use do { local $/; <$fh> }
my $file = 'test_utf8';

# Test 1
binmode STDOUT, ':encoding(UTF-8)';
my $line = fileread $file,':raw';
utf8::decode($line);
if ($line =~ /(❇)/) {
    print "found '$1'\n";
}
print $line;

sub fileread {
    my ($file,$enc) = @_;
    my $string;
    my $stref = \$string;
    open(my $fh, "< $enc", $file) || die "Can't open $file: $!";
    ${$stref} = do { local $/; <$fh> };
    return $string;
}
```
Whereas this

```perl
my $line = fileread $file,':encoding(UTF-8)';
$line = Encode::decode('UTF-8', $line, 'Encode::FB_CROAK');
```

was (I'm sure) producing "this has wide utf8 chars like ‡ (snowflake)" a few hours ago, but is now crashing the script, giving "Undefined subroutine &Encode::decode called at - line 18." if binmode is commented out and "Wide character at - line 18." if it isn't. Maybe it was utf8::encode giving me the first line; things are getting kinda hazy at this point. It does produce the correct result when used with fileread $file,':raw' or fileread $file,':encoding(ISO-8859-1)'.

Interestingly, unicode_strings made no difference to the regex succeeding or failing in any of my tests, and utf8::upgrade/downgrade don't appear to do anything at all in this SSCCE.

It would be nice to conclude that, when in doubt, just use utf8::decode, but I've also been testing with Net::Async::FastCGI, which also gives me a tied STDOUT, only it does its own UTF-8 encoding on it, which I need to turn off with set_encoding( undef ) if I do that.
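For what it's worth, the pattern in those results looks like "decode exactly once". Here is a minimal sketch of that idea, not a definitive fix. Everything in it is an assumption for illustration: slurp() is a hypothetical stand-in for fileread, the test file is a temp file holding the UTF-8 bytes of U+2747, and Encode is loaded explicitly (calling Encode::decode without loading the module is what yields "Undefined subroutine"). As far as I know, CHECK wants the constant's value rather than its name in quotes, so the sketch calls the constants. The second decode of an already-decoded string is the step that dies, which is presumably the "Wide character" crash.

```perl
use strict;
use warnings;
use Encode ();                 # without this, Encode::decode is an undefined subroutine
use File::Temp qw(tempfile);

# Hypothetical test data: one line containing U+2747, written as raw UTF-8 bytes.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp, ':raw';
print {$tmp} "wide char: \xE2\x9D\x87\n";
close $tmp or die "Can't close $path: $!";

# slurp() is a hypothetical stand-in for fileread(): read a whole file through one layer.
sub slurp {
    my ($file, $layer) = @_;
    open(my $fh, "<$layer", $file) or die "Can't open $file: $!";
    local $/;                  # undef record separator: read everything at once
    return scalar <$fh>;
}

# CHECK is a bitmask, so the constants are called, not quoted.
my $check = Encode::FB_CROAK() | Encode::LEAVE_SRC();

# Route A: read octets, then decode exactly once.
my $from_bytes = Encode::decode('UTF-8', slurp($path, ':raw'), $check);

# Route B: let the :encoding layer decode on read; nothing more to do.
my $from_layer = slurp($path, ':encoding(UTF-8)');

# Decoding the already-decoded string a second time is the step that dies.
my $second_decode_ok = eval { Encode::decode('UTF-8', $from_layer, $check); 1 } ? 1 : 0;

binmode STDOUT, ':encoding(UTF-8)';
printf "routes agree: %s\n", $from_bytes eq $from_layer ? 'yes' : 'no';
printf "regex matched U+%04X\n", ord($1) if $from_bytes =~ /(\x{2747})/;
printf "second decode survived: %s\n", $second_decode_ok ? 'yes' : 'no';
```

Both routes end up with the same character string, so the regex behaves the same either way. The only choice is whether the layer or an explicit decode does the work, never both.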
PS: I notice all the occurrences of ❇ in my code blocks have been turned into &#10055;, so it's some small comfort that perlmonks.org can't quite get a grip on this either. 😜
In reply to Re^4: FCGI, tied handles and wide characters
by Maelstrom
in thread FCGI, tied handles and wide characters
by Maelstrom