in reply to Unicode surrogate is illegal in UTF-8

You can use eval with FATAL warnings:
#!/usr/bin/perl
use strict;
use warnings;
use open OUT => ':encoding(UTF-8)', ':std';

use warnings FATAL => 'utf8';

my $text = { string => "t\x{daed}\x{ffff}\x{daee}\x{c8}\n" };

1 until eval {
    print $text->{string};
    1;
} or do {
    my ($charcode) = $@ =~ /U\+(\S+)/ or die $@;
    print STDERR "Removing $charcode because of $@";
    $text->{string} =~ s/\x{$charcode}//g;
    0; # Try again!
};

Update: handles both "non-character" and "surrogate" cases. I wasn't able to trigger the "non_unicode" warnings.

لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

Re^2: Unicode surrogate is illegal in UTF-8
by Rodster001 (Pilgrim) on Aug 03, 2015 at 19:11 UTC
    This doesn't seem to work for me. It reports the warning, but $@ does not get set.
      Never mind, this worked for me (I had an unrelated typo). Thanks!
Re^2: Unicode surrogate is illegal in UTF-8
by Rodster001 (Pilgrim) on Aug 03, 2015 at 20:33 UTC
    Could I generate/detect this warning without using "print" (i.e. so I could fix/replace silently)?
      You can print to a filehandle that doesn't lead anywhere:
      open my $VOID, '>', \ my $void;
      1 until eval {
          print {$VOID} $text->{string};
          # ...
      لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
      Could I generate/detect this warning without using "print" (i.e. so I could fix/replace silently)?

      Have a look at Handling Malformed Data.
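
Assuming this points at the "Handling Malformed Data" section of the Encode documentation, a check without any print could look roughly like the sketch below. The helper name encodes_cleanly is made up for illustration and is not from this thread. It relies on Encode's strict 'UTF-8' encoding dying under FB_CROAK when it meets a surrogate.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not from the thread): returns 1 if $string
# encodes as strict UTF-8 without errors, 0 otherwise. Prints nothing.
sub encodes_cleanly {
    my ($string) = @_;

    # FB_CROAK makes encode() die on a character it cannot encode;
    # LEAVE_SRC keeps encode() from modifying $string in place.
    my $check = Encode::FB_CROAK() | Encode::LEAVE_SRC();

    return eval { Encode::encode('UTF-8', $string, $check); 1 } ? 1 : 0;
}

my $text = { string => "t\x{daed}\x{c8}\n" };
my $ok   = encodes_cleanly($text->{string});
# When $ok is false, $@ still holds Encode's error message.
```

When the check fails, the message left in $@ names the offending character, so it could drive the same remove-and-retry loop as above without writing to any filehandle.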

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)