You're right that it is the LRM (left-to-right mark, U+200E) character, and so shouldn't be stripped in general, which is why it's sensible that it doesn't match \s. But it is useless at the end of a string, hence my suggestion that it be treated like whitespace in that context. I hoped there might be a function that trims the end of a string for this specific purpose or, failing that, something generic I could add to a regex to strip Unicode characters of this nature.
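To make the request concrete, here is a minimal sketch of the kind of generic pattern I have in mind. It assumes that anything in the Unicode general category Cf (Format, which includes LRM) is as disposable as whitespace when it sits at the very end of a string; that assumption may not hold for every use. The helper name `trim_end` is hypothetical, not an existing function.

```perl
use strict;
use warnings;

# Hypothetical helper: strip trailing whitespace and Unicode format
# characters (general category Cf, which includes U+200E LRM).
# Assumption: such characters carry no meaning at the end of a string.
sub trim_end {
    my ($str) = @_;
    $str =~ s/[\s\p{Cf}]+\z//;   # \z anchors at the true end of the string
    return $str;
}

my $raw     = "some text \x{200E}";   # trailing space followed by LRM
my $trimmed = trim_end($raw);
printf "[%s]\n", $trimmed;            # prints "[some text]"
```

Because the pattern is anchored with `\z`, an LRM in the middle of a string is left alone, so the character is only treated like whitespace in the trailing position.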
In reply to Re^2: Remove unicode "whitespace"
by HYanWong
in thread Remove unicode "whitespace"
by HYanWong