in reply to HTML parsing module handles known and unknown encoding
It seems that the XML::LibXML authors have thought about this problem and solved it by having you always pass octets (raw, undecoded bytes) to XML::LibXML. If you have an encoding handy, you're allowed to tell XML::LibXML about it, but it's not necessary.
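To illustrate the "always pass octets" approach, here is a minimal sketch. It assumes the `load_html` constructor with its `string`, `recover` and `encoding` options; the sample markup and the `$known_encoding` variable are made up for this example.

```perl
use strict;
use warnings;
use XML::LibXML;

# Raw octets, as they would come off the wire or from a file read in :raw mode.
# These bytes are ISO-8859-1 ("\xE9" is e-acute) and the markup declares no charset.
my $octets = "<html><head><title>caf\xE9</title></head><body><p>ok</p></body></html>";

# Optional hint: set this only when the encoding is actually known,
# e.g. from an HTTP Content-Type header. Hypothetical value for this example.
my $known_encoding = 'iso-8859-1';

# recover => 2 asks libxml2 to recover from malformed HTML without printing warnings.
my %opts = (string => $octets, recover => 2);
$opts{encoding} = $known_encoding if defined $known_encoding;

my $doc = XML::LibXML->load_html(%opts);

# Values coming back out of the DOM are decoded Perl character strings.
my $title = $doc->findvalue('//title');
printf "%d characters, last code point U+%04X\n", length $title, ord substr($title, -1);
```

The string is never run through `decode` before parsing; the encoding is passed only as a hint. When no hint is available, the `encoding` key is simply left out and the parser has to work it out from the document itself.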
I'm not sure how well XML::LibXML works with UTF-16LE and/or UTF-16BE and BOMs; you might need to use some byte-level regular expressions to handle the BOM yourself.
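Handling the BOM yourself with byte-level regular expressions could look roughly like the sketch below. It uses only core Perl and the `Encode` module; `sniff_bom` is a hypothetical helper name, and it deliberately covers just the UTF-8 and UTF-16 BOMs.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: look at the leading bytes of an octet string.
# Returns (encoding name or undef, octets with any recognised BOM removed).
sub sniff_bom {
    my ($octets) = @_;    # a copy, so the caller's buffer is left untouched
    return ('UTF-8',    $octets) if $octets =~ s/\A\xEF\xBB\xBF//;
    return ('UTF-16LE', $octets) if $octets =~ s/\A\xFF\xFE//;
    return ('UTF-16BE', $octets) if $octets =~ s/\A\xFE\xFF//;
    return (undef, $octets);    # no BOM found: the encoding is still unknown
}

# BOM followed by "<p>" encoded as UTF-16LE.
my $input = "\xFF\xFE" . "<\x00p\x00>\x00";

my ($enc, $rest) = sniff_bom($input);
my $text = defined $enc ? decode($enc, $rest) : undef;

print defined $enc ? "detected $enc\n" : "no BOM\n";
```

Whether you then give the parser the BOM-less octets plus the detected name as an encoding hint, or decode the data yourself first, is a judgment call I haven't tested against XML::LibXML.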
Re^2: HTML parsing module handles known and unknown encoding
by ikegami (Patriarch) on Nov 16, 2011 at 18:55 UTC
by ambrus (Abbot) on Nov 17, 2011 at 08:50 UTC
by ikegami (Patriarch) on Nov 17, 2011 at 09:56 UTC
by grantm (Parson) on Nov 16, 2011 at 21:07 UTC
by ikegami (Patriarch) on Nov 16, 2011 at 22:19 UTC
by grantm (Parson) on Nov 17, 2011 at 23:42 UTC
by Corion (Patriarch) on Nov 16, 2011 at 19:07 UTC