Perhaps you saved your script with utf-8 encoding. If you save the script as iso-8859-1, you will get an iso-8859-1 result.
Below, 082.pl is the script saved as utf-8 and 082-1.pl is the same script saved as iso-8859-1. "ü" is the two bytes "c3 bc" in utf-8 and the single byte "fc" in iso-8859-1.
>cat 082.pl |perl -ne 'print $1 if m!<word>(.*?)</word>!' | hd
00000000  4d c3 bc 6c 6c 65 72  |M..ller|
00000007
>cat 082-1.pl |perl -ne 'print $1 if m!<word>(.*?)</word>!' | hd
00000000  4d fc 6c 6c 65 72  |M.ller|
00000006
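The same byte difference can be reproduced without saving two copies of the script. This is only a minimal sketch, assuming the core Encode module is available. The helper name hex_bytes is made up for illustration and is not part of the original scripts.

```perl
use strict;
use warnings;
use Encode qw(encode);

# hex_bytes is a hypothetical helper: it encodes a character string
# into the named encoding and returns the bytes as space-separated hex.
sub hex_bytes {
    my ($encoding, $string) = @_;
    my $bytes = encode($encoding, $string);
    return join ' ', map { sprintf '%02x', ord } split //, $bytes;
}

# "M\x{fc}ller" is "Müller" written with an escape, so the result does
# not depend on the encoding this file itself was saved in.
my $name = "M\x{fc}ller";

for my $encoding ('UTF-8', 'iso-8859-1') {
    printf "%-10s %s\n", $encoding, hex_bytes($encoding, $name);
}
```

The UTF-8 line should show "c3 bc" where the iso-8859-1 line shows "fc", matching the two hd dumps above.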
In reply to Re^3: UTF-8 and XML::Parser
by remiah
in thread UTF-8 and XML::Parser
by Anonymous Monk