in reply to Re^3: Safely removing Unicode zero-width spaces and other non-printing characters
in thread Safely removing Unicode zero-width spaces and other non-printing characters
Yes, the RSS reads fine of course.
The problem is with the pages the RSS points to. HTML and XHTML are a hot mess. Even when a respectable CMS is used, the authors can still paste in something weird. It looks like I may have to treat each site individually, and writing individual filters might not be worth the effort. However, I am still hoping for an automated way to normalize incoming text.
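To make concrete what I mean by "normalize", here is a rough sketch, in Python only because it is quick to test. The function name `normalize_text` and the `keep` parameter are made up for illustration. It assumes that everything in Unicode general category Cf (format characters such as U+200B ZERO WIDTH SPACE, U+FEFF and U+00AD SOFT HYPHEN) and most of Cc (control characters) can be dropped, which will not hold for every site or every script.

```python
import unicodedata

def normalize_text(text, keep=("\u200c", "\u200d")):
    """Hypothetical helper: strip invisible characters from scraped text.

    Assumptions (not verified against any particular site):
    - category Cf (format) characters are unwanted, except those in `keep`
      (ZWNJ and ZWJ by default, since some scripts and emoji need them);
    - category Cc (control) characters are unwanted, except tab, LF and CR.
    """
    # Compose combining sequences first so equivalent strings compare equal.
    text = unicodedata.normalize("NFC", text)
    out = []
    for ch in text:
        if ch in keep:
            out.append(ch)
            continue
        category = unicodedata.category(ch)
        if category == "Cf":
            continue  # zero-width space, BOM, soft hyphen, ...
        if category == "Cc" and ch not in "\t\n\r":
            continue  # stray control characters
        out.append(ch)
    return "".join(out)
```

Calling `normalize_text("foo\u200bbar")` should give back plain `foobar`. Whether a blanket category filter like this is actually safe across all the sites involved is exactly what I am unsure about.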
Re^5: Safely removing Unicode zero-width spaces and other non-printing characters
by haukex (Archbishop) on Dec 05, 2019 at 05:49 UTC