PerlMonks
Re: uri_unescape not correct
by Joost (Canon) on Apr 02, 2007 at 11:43 UTC ( [id://607801]=note )
There's no general way to know whether a URI is UTF-8 encoded or not. See RFC 2396:
In the simplest case, the original character sequence contains only characters that are defined in US-ASCII, and the two levels of mapping are simple and easily invertible: each 'original character' is represented as the octet for the US-ASCII code for it, which is, in turn, represented as either the US-ASCII character, or else the "%" escape sequence for that octet.

You could use decode('utf-8', $string) to get the right characters after uri_unescaping, provided the URIs are always UTF-8 encoded.
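A minimal sketch of that two-step approach, assuming the URI module (which provides URI::Escape) is installed and that the input really is UTF-8 encoded. The sample string 'caf%C3%A9' and the variable names are made up for illustration; they are not from the original question.

```perl
use strict;
use warnings;
use URI::Escape qw(uri_unescape);
use Encode qw(decode);

# Hypothetical input: "café" with the é percent-encoded as its
# two UTF-8 octets (0xC3 0xA9).
my $escaped = 'caf%C3%A9';

# Step 1: undo the percent-escapes. This yields a string of octets,
# not yet a string of characters.
my $octets = uri_unescape($escaped);

# Step 2: interpret those octets as UTF-8. This is only right if the
# URI was UTF-8 encoded in the first place (an assumption here).
my $chars = decode('UTF-8', $octets);

# Five octets collapse into four characters once decoded.
printf "octets: %d, characters: %d\n", length($octets), length($chars);
```

If the URI had been built from Latin-1 instead (for example 'caf%E9'), the same decode call would not give back the intended character, which is why the "always UTF-8" proviso matters.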