in reply to Convert HTML symbols to equivalent Unicode

In some cases I get a XML parser error.

Which module(s) are you using? What's the exact error? ("The symbol ® ( REGISTERED SIGN ) need to be convert to its equivalent unicode U00AE" doesn't look like the actual error message. At least Google doesn't find any exact match (even with the grammar fixed), and as we don't know which module's source code to grep, ...)

As it stands, the problem is underspecified. To give advice on what to do, we'd need to know what input encoding you have and what output encoding you need. The U+00AE you mention is just the Unicode code point, i.e. a mere number, which always needs to be encoded somehow for transmission and storage, e.g. as UTF-8, UTF-16LE, HTML entities, whatever... Similarly, we can only guess that your input is maybe in ISO-8859-1 (Latin-1) encoding.
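
To make the "code point vs. encoding" distinction concrete, here is a minimal sketch. It is written in Python purely for illustration, since we don't know your toolchain; the variable names are made up, and the same idea carries over to any language. It takes the single code point U+00AE and shows how the stored bytes differ depending on the encoding you pick.

```python
import html

# One character, one code point: REGISTERED SIGN, U+00AE.
ch = "\u00ae"

# The same code point serialized in three different encodings.
encodings = {
    "utf-8": ch.encode("utf-8"),
    "utf-16-le": ch.encode("utf-16-le"),
    "iso-8859-1": ch.encode("iso-8859-1"),
}
for name, raw in encodings.items():
    print(name, raw.hex())
# utf-8 c2ae
# utf-16-le ae00
# iso-8859-1 ae

# A numeric character reference, as it could appear in HTML or XML.
ncr = "&#x{:X};".format(ord(ch))
print(ncr)  # &#xAE;

# Going the other way: the named HTML entity decodes to the same character.
print(html.unescape("&reg;") == ch)  # True
```

Note that the lone byte 0xAE (the Latin-1 form) is not valid UTF-8 on its own, so a parser that expects UTF-8 would reject it. Whether that is what happens in your case is only a guess until we see the real error message.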
