in reply to Why won't Perl convert (Latin1 | ISO-8859-1) to (UTF-8 | utf8)?
G'day taint,
piconv converts character encodings. Here's an example of ISO-8859-1 to UTF-8 and back again (using the copyright sign):
    $ piconv -f ISO-8859-1 -t utf8 -s '©'
    ©
    $ piconv -t ISO-8859-1 -f utf8 -s '©'
    ©
piconv does not look for keys such as "charset" or "encoding" and attempt to change their values.
Also, all the characters in the string "iso-8859-1" are ASCII: their byte values are identical to the Unicode code points of the corresponding characters, and each is encoded as the same single byte in both ISO-8859-1 and UTF-8, so converting them changes nothing. Had that meta element contained non-ASCII characters, you would have seen some conversion.
    $ piconv -f ISO-8859-1 -t utf8 \
    > -s '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">'
    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
    $ piconv -f ISO-8859-1 -t utf8 \
    > -s '<meta name="registered sign" content="®">'
    <meta name="registered sign" content="®">
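Because both forms of a character usually look the same on a terminal, the difference is easier to see in the raw bytes. Here's a minimal sketch of that check. It makes two assumptions that are not part of the examples above: it uses iconv(1) in place of piconv (both take -f and -t), and `hex` is a hypothetical helper defined only for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: print stdin as one run of lowercase hex byte values.
hex() { od -An -tx1 | tr -d ' \n'; }

# Octal \256 is the single ISO-8859-1 byte for the registered sign (0xAE).
latin1_reg=$(printf '\256' | hex)
utf8_reg=$(printf '\256' | iconv -f ISO-8859-1 -t UTF-8 | hex)

# Pure-ASCII text: compare the bytes before and after the same conversion.
ascii_before=$(printf 'iso-8859-1' | hex)
ascii_after=$(printf 'iso-8859-1' | iconv -f ISO-8859-1 -t UTF-8 | hex)

echo "registered sign: $latin1_reg -> $utf8_reg"
echo "ascii text:      $ascii_before -> $ascii_after"
```

The registered sign goes from the one byte ae to the two bytes c2 ae, while the ASCII string has the same bytes on both sides of the arrow.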
To convert your HTML files, you'll need to do two things: run piconv to convert the bytes, and change the "iso-8859-1" references to "utf-8" yourself. Be aware that there are several places in which encodings might be specified: for instance, meta and script elements may contain a charset attribute, and XHTML documents may carry an encoding attribute in the XML declaration.
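The two steps can be combined in a small shell function. This is only a sketch under stated assumptions: `latin1_html_to_utf8` is a hypothetical name, iconv(1) stands in for piconv (piconv takes the same -f and -t options and also accepts file arguments, so it should slot into the same pipeline), and the sed patterns only catch `charset=` and `encoding=` values written on a single line.

```shell
#!/bin/sh
# Hypothetical helper: convert one ISO-8859-1 HTML file to UTF-8 and update
# its encoding declarations. The input file is left untouched.
latin1_html_to_utf8() {
    in=$1
    out=$2
    # Optional opening quote (double or single) before the encoding name.
    q="[\"']\{0,1\}"
    iconv -f ISO-8859-1 -t UTF-8 "$in" |
        LC_ALL=C sed \
            -e "s/\(charset=$q\)[Ii][Ss][Oo]-8859-1/\1utf-8/g" \
            -e "s/\(encoding=$q\)[Ii][Ss][Oo]-8859-1/\1utf-8/g" > "$out"
}

# Example use (file names are placeholders):
#   latin1_html_to_utf8 index.html index.utf8.html
```

Writing to a separate output file keeps the original around for comparison. Don't run this on a file that is already UTF-8: its multi-byte characters would be treated as ISO-8859-1 and encoded a second time.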
-- Ken