Re^2: Unicode to HTML code &#....;

Thanks to both of you.
The thing is, $string must not contain certain characters such as comma and slash, since I use those as separators in my text files. The regex takes care of that as well, so I think it's still the best choice here given the circumstances.
So I now go with

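use Encode qw(decode);

sub unicode_decode
{
  # Decode the UTF-8 bytes into characters; CHECK = 0 turns malformed input into U+FFFD
  my $string = decode('utf8', shift, 0);
  # Swap the replacement character for a plain space
  $string =~ tr/\x{FFFD}/\x20/;
  # Everything outside the whitelist becomes a numeric entity; ord() gives the code point
  $string =~ s/([^a-zA-Z0-9_+\-.])/'&#'.ord($1).';'/eg;
  return($string);
}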

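As you can see, this also replaces the replacement character (U+FFFD) with a space, should there be one. Since a space is not in the whitelist either, the substitution then turns it into &#32;.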
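For anyone who wants to try it, here is a small self-contained sketch of how the sub behaves on a few inputs. The sample strings are made up for illustration, and this version uses ord($1) for the code point, which for a single captured character should give the same number as the unpack call.

```perl
use strict;
use warnings;
use Encode qw(decode);

sub unicode_decode {
    # Bytes in, characters out; malformed sequences become U+FFFD (CHECK = 0)
    my $string = decode('utf8', shift, 0);
    # Replacement character becomes a space
    $string =~ tr/\x{FFFD}/\x20/;
    # Anything outside the whitelist becomes a numeric entity
    $string =~ s/([^a-zA-Z0-9_+\-.])/'&#' . ord($1) . ';'/eg;
    return $string;
}

# Hypothetical sample inputs, given as raw UTF-8 bytes
print unicode_decode("caf\xC3\xA9,a/b"), "\n";   # caf&#233;&#44;a&#47;b
print unicode_decode("a\x80b"), "\n";            # a&#32;b  (lone \x80 is malformed)
print unicode_decode("A-z_0+9."), "\n";          # A-z_0+9.  (all whitelisted)
```

Note that comma and slash come out as &#44; and &#47;, so the result can safely sit between those separators in the text file.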
