in reply to Unicode surrogate is illegal in UTF-8
#!/usr/bin/perl
use strict;
use warnings;
use open OUT => ':encoding(UTF-8)', ':std';
use warnings FATAL => 'utf8';

my $text = { string => "t\x{daed}\x{ffff}\x{daee}\x{c8}\n" };

1 until eval {
    print $text->{string};
    1;
} or do {
    my ($charcode) = $@ =~ /U\+(\S+)/ or die $@;
    print STDERR "Removing $charcode because of $@";
    $text->{string} =~ s/\x{$charcode}//g;
    0; # Try again!
};
Update: handles both "non-character" and "surrogate" cases. I wasn't able to trigger the "non_unicode" warnings.
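If retrying on every failure is not wanted, the offending characters could instead be removed up front. The following is only a sketch of that alternative, not the approach in the code above: the helper name strip_unencodable is made up for illustration, and it assumes that silently dropping surrogates and non-characters is acceptable for your data. It relies on the Unicode properties \p{Cs} (surrogates) and \p{Noncharacter_Code_Point}, and does not try to cover the "non_unicode" case.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): remove the code points
# that trigger the "surrogate" and "non-character" UTF-8 warnings.
sub strip_unencodable {
    my ($string) = @_;
    # \p{Cs} matches surrogates (U+D800..U+DFFF);
    # \p{Noncharacter_Code_Point} matches U+FDD0..U+FDEF and U+xFFFE/U+xFFFF.
    $string =~ s/[\p{Cs}\p{Noncharacter_Code_Point}]//g;
    return $string;
}

my $clean = strip_unencodable("t\x{daed}\x{ffff}\x{daee}\x{c8}\n");

# Show the surviving code points without printing the raw string.
print join(' ', map { sprintf 'U+%04X', ord } split //, $clean), "\n";
```

The trade-off is that this version never tells you which characters were dropped, whereas the retry loop reports each one on STDERR.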