I am processing some XHTML pages (using XML::Twig) that contain numerous character entities, such as:
é
When I parse these files using XML::Twig, they turn into all sorts of wonky characters that look nothing like they did in the original HTML.
réserve becomes
réserve
I've tried setting keep_encoding in Twig, and the entities get preserved, but I get another set of wonky characters when that output goes to HTML.
I'm not sure how to proceed here -- any thoughts? I'm sure there's some kind of encoding/decoding process I need to do here, but I'm unfamiliar with the process.
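To illustrate the kind of encoding/decoding mismatch that may be involved: one common cause of this sort of garbling is UTF-8 bytes being read back as ISO-8859-1 (Latin-1). That this is what happens with these pages is only an assumption. The sketch below reproduces the effect with nothing but the core Encode module; the variable names are made up for the example, and XML::Twig itself is not involved.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# "r\x{e9}serve" is the character string "réserve" (U+00E9 is e-acute).
my $text = "r\x{e9}serve";

# Encoding to UTF-8 turns the single character U+00E9 into two bytes, C3 A9.
my $bytes = encode('UTF-8', $text);

# Assumed failure mode: something downstream treats those bytes as
# ISO-8859-1, so each byte becomes its own character.
my $wonky = decode('ISO-8859-1', $bytes);

binmode(STDOUT, ':encoding(UTF-8)');
print "$wonky\n";    # prints "rÃ©serve"
```

If the garbled output matches this pattern, the data is probably valid UTF-8 and the mismatch is in how the output is written or how the resulting HTML declares its character set.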
Many thanks.
Scott