in reply to Re: Unicode to HTML code &#....;
in thread Unicode to HTML code &#....;

Thanks to both of you.
The thing is, $string must not contain certain characters such as commas and slashes, since I use those as separators in my text files. That's what the regex also ensures, so I think it's still the best choice here given the circumstances.
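So I now go with:

    use Encode qw(decode);

    sub unicode_decode {
        my $string = decode('utf8', shift, 0);
        # swap U+FFFD (the replacement character) for a space
        $string =~ tr/\x{FFFD}/\x20/;
        # anything outside the whitelist becomes a numeric entity
        $string =~ s/([^a-zA-Z0-9\_\+\-\.])/'&#'.unpack('U0U*',$1).';'/eg;
        return($string);
    }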
As you can see, this also replaces the replacement character (U+FFFD) with a space, should there be one.
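
For anyone who wants to try this on sample input, here is a minimal sketch. It is not the exact sub from this post: the name encode_entities_sketch is made up for the example, and it uses ord($1) in place of unpack('U0U*',$1), on the assumption that both give the same code point for a single decoded character. Since the space is not in the whitelist either, a swapped-in space should come out as &#32; rather than as a literal space.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical variant of the sub above: ord() stands in for unpack('U0U*', ...)
sub encode_entities_sketch {
    my $string = decode('utf8', shift, 0);   # invalid bytes become U+FFFD
    $string =~ tr/\x{FFFD}/\x20/;            # U+FFFD -> space
    # everything outside a-z, A-Z, 0-9, _ + - . becomes &#NNN;
    $string =~ s/([^a-zA-Z0-9_+\-.])/'&#' . ord($1) . ';'/eg;
    return $string;
}

# Input is raw UTF-8 bytes: "a,b/" followed by e-acute (0xC3 0xA9)
print encode_entities_sketch("a,b/\xC3\xA9"), "\n";   # a&#44;b&#47;&#233;

# 0xFF is not valid UTF-8, so it ends up as an encoded space
print encode_entities_sketch("x\xFFy"), "\n";         # x&#32;y
```

The comma and the slash both end up as entities, so the result should be safe to store in a file that uses them as separators.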