in reply to Re: Safely removing Unicode zero-width spaces and other non-printing characters

in HTML, it is possible to insert codes that produce UTF characters on the screen

That's a possibility. However, Perl also natively supports escape sequences for representing arbitrary Unicode characters by code point, such as "\N{U+NNNN}", where NNNN is the hexadecimal code point.
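As a rough sketch of how that escape can be used for the problem in this thread: the sample string and the particular set of code points below are my own picks for illustration, not a complete list of non-printing characters.

```perl
use strict;
use warnings;

# Hypothetical sample: "foo" and "bar" separated by ZERO WIDTH SPACE (U+200B).
my $text = "foo\N{U+200B}bar";

# Strip a few zero-width characters named by code point.
# The set chosen here (U+200B, U+200C, U+200D, U+FEFF) is only an example.
(my $clean = $text) =~ s/[\N{U+200B}\N{U+200C}\N{U+200D}\N{U+FEFF}]//g;

printf "%d -> %d characters\n", length($text), length($clean);
```

The same "\N{U+NNNN}" form works both in double-quoted strings and inside regex character classes, so no HTML round trip is needed just to name the characters.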

I would write a perl sub that replaces all these specific characters with the HTML equivalent first

No need to write a function yourself: HTML::Entities already provides encode_entities for this.
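If you do want to go the HTML route, something along these lines should do, assuming HTML::Entities is installed from CPAN (the sample string is again hypothetical):

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities);

# Hypothetical sample containing an invisible ZERO WIDTH SPACE (U+200B).
my $text = "foo\N{U+200B}bar";

# With no second argument, encode_entities replaces control characters,
# non-ASCII characters and the HTML-special characters < & > " '
# with named or numeric entities.
my $html = encode_entities($text);

print "$html\n";
```

The invisible character then shows up as a visible entity in the output, which makes it easy to spot before deciding what to remove.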
