Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:
This works fine for Kerry's site, but when I get and parse a page from Bush's site, some of the characters (apostrophes, quotes, dashes, etc.) come back in an odd encoding. Most display as three-character sequences beginning with a-hat, as in a-hat, euro sign, vertical bar. There are also some single-character oddities, such as a dotted capital A.
In the source code for the page, they seem to be normal characters or HTML tags. See example at http://www.georgewbush.com/News/Read.aspx?ID=2768
Does anyone know what these characters are and how to translate them into regular ASCII characters before I get them back from HTML::Parser?
Many thanks.... Steve
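The symptom described above (a-hat, euro sign, vertical bar) is what the three UTF-8 bytes of an ellipsis (E2 80 A6) look like when they are displayed as Windows-1252. The following is only a minimal sketch under that assumption, not a confirmed diagnosis of the page in question: the sample byte string and the `%ascii` fallback map are made up for illustration, and in practice the bytes would come from `LWP::Simple::get`.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical sample: raw bytes as they might arrive from the server.
# E2 80 A6 is the UTF-8 encoding of U+2026 (horizontal ellipsis).
my $bytes = "Wait\xE2\x80\xA6";

# Misreading the bytes as Windows-1252 gives three separate characters:
# a-circumflex (U+00E2), euro sign (U+20AC), broken bar (U+00A6).
my $mojibake = decode('cp1252', $bytes);

# Decoding them as UTF-8 gives the single intended character.
my $text = decode('UTF-8', $bytes);

# Illustrative (assumed) map from common typographic characters to ASCII.
my %ascii = (
    "\x{2018}" => "'",  "\x{2019}" => "'",    # curly single quotes
    "\x{201C}" => '"',  "\x{201D}" => '"',    # curly double quotes
    "\x{2013}" => '-',  "\x{2014}" => '--',   # en and em dashes
    "\x{2026}" => '...',                      # ellipsis
);

# Replace each mapped character with its ASCII stand-in.
(my $plain = $text) =~ s/([\x{2013}\x{2014}\x{2018}\x{2019}\x{201C}\x{201D}\x{2026}])/$ascii{$1}/g;

print "$plain\n";
```

If the decoded text is handed to HTML::Parser instead of the raw bytes, the parser sees characters rather than byte sequences; whether this site actually serves UTF-8 would need to be checked against its Content-Type header.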
Re: LWP::Simple returns strange encodings
by Joost (Canon) on Jun 24, 2004 at 19:13 UTC
by iburrell (Chaplain) on Jun 24, 2004 at 19:43 UTC

Re: LWP::Simple returns strange encodings
by cormanaz (Deacon) on Jun 24, 2004 at 20:16 UTC