in reply to Safely removing Unicode zero-width spaces and other non-printing characters
Desktop That’s More Elegant
You know, in HTML it is possible to insert character references that produce non-ASCII characters on the screen. They exist in case you want the source to be plain ASCII only, with no raw Unicode characters. I prefer that because, as you said, those characters can mess up the code. For example, the above text should be:
Desktop That&#8217;s More Elegant
How to encode UTF characters in HTML
If I had the same problem, I would write a Perl sub that first replaces all these specific characters with their HTML equivalents, then removes all 00 characters from the entire text, and deals with the spaces and line breaks last.
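Something along these lines is what I have in mind. It is only a rough sketch, not tested against your data. The sub names `to_html_entities` and `strip_nuls` are made up for illustration, I am assuming the text has already been decoded into Perl characters (not raw bytes), and I am reading "00 characters" as NUL characters (U+0000):

```perl
use strict;
use warnings;

# Hypothetical helper: replace every non-ASCII character in a decoded
# string with a numeric HTML character reference such as &#8217;.
sub to_html_entities {
    my ($text) = @_;
    $text =~ s/([^\x00-\x7F])/sprintf('&#%d;', ord($1))/ge;
    return $text;
}

# Hypothetical helper: delete NUL characters (my reading of "00 characters").
sub strip_nuls {
    my ($text) = @_;
    $text =~ tr/\x00//d;
    return $text;
}

my $in  = "Desktop That\x{2019}s More Elegant";   # U+2019 is the curly apostrophe
my $out = strip_nuls(to_html_entities($in));
print "$out\n";   # prints: Desktop That&#8217;s More Elegant
```

Note that this turns every non-ASCII character into a reference, including invisible ones, so any characters you want gone entirely would have to be removed before this step.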
Re^2: Safely removing Unicode zero-width spaces and other non-printing characters
by haukex (Archbishop) on Dec 04, 2019 at 19:11 UTC