in reply to Re: Convert strings with unknown encodings to html
in thread Convert strings with unknown encodings to html
I included examples of each in the above test program. Note that some of the examples are multi-byte (#1 below, for example, is two characters, one of three bytes and one of two). As best I can tell, the formats are:
1. UTF-8: chr(226).chr(152).chr(134), chr(195).chr(161)
2. CP1252: chr(150), chr(153)
3. HTML: '®', 'Æ'
4. ASCII: '&'
5. Unicode codepoints: chr(63743), chr(991), chr(9760)
Obviously the database is a bit 'special'. Unfortunately it is provided by a 3rd party, a very large company, and I have no control over their input sanitization.
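For what it's worth, here is a rough sketch of how the five cases listed above might be normalized to ASCII-only HTML. This is a heuristic under stated assumptions, not a tested solution for the real data: the sub name `to_html_guess` is made up for illustration, it assumes each field uses only one of the encodings, and it assumes any string made up entirely of characters below 256 is raw bytes (so already-decoded Latin-1 text would be misread). It tries strict UTF-8 first, falls back to CP1252, and leaves plain ASCII (including existing entities and a bare '&') untouched.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: guess how one field is encoded and return ASCII-only HTML.
sub to_html_guess {
    my ($raw) = @_;
    my $text;

    if ($raw =~ /[^\x00-\xFF]/) {
        # Case 5: code points above 255 cannot be bytes, so treat as decoded text.
        $text = $raw;
    }
    else {
        # Case 1: try strict UTF-8 first. FB_CROAK dies on malformed input
        # and may modify its argument, hence the copy.
        my $copy = $raw;
        $text = eval { decode('UTF-8', $copy, Encode::FB_CROAK) };

        # Case 2: not valid UTF-8, so assume CP1252 (an assumption, not a certainty).
        $text = decode('cp1252', $raw) unless defined $text;
    }

    # Cases 3 and 4: ASCII passes through as-is; everything else becomes
    # a numeric character reference.
    $text =~ s/([^\x00-\x7F])/sprintf('&#%d;', ord $1)/ge;
    return $text;
}

print to_html_guess(chr(226).chr(152).chr(134).chr(195).chr(161)), "\n";
print to_html_guess(chr(150).chr(153)), "\n";
print to_html_guess(chr(63743).chr(991).chr(9760)), "\n";
```

Short CP1252 strings can occasionally be valid UTF-8 by accident, so the order of the guesses matters and some fields may still need checking by hand.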
Re^3: Convert strings with unknown encodings to html
by Anonymous Monk on Jul 01, 2015 at 01:49 UTC |