moshkod has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I am using the URI module to construct URLs. I ran into a problem with parameter values that are in UTF-8 Chinese: for some reason, calling $uri->as_string destroys the Chinese values. Does anyone know how to avoid this?

Re: problem with URI and utf8
by Firefly258 (Beadle) on Dec 04, 2006 at 10:30 UTC
    Non-US-ASCII characters in URIs are escaped by default, as per RFC 2396, to avoid ambiguities across different locales and Unicode encoding formats. Maybe you are seeing this transformation of your Chinese characters as "them being destroyed"?

    use URI;
    my $URI = URI->new("http://example.com/オークションなどの");
    print $URI->as_string;
    __END__
    ___output___
    http://example.com/%E3%82%AA%E3%83%BC%E3%82%AF%E3%82%B7%E3%83%A7%E3%83%B3%E3%81%AA%E3%81%A9%E3%81%AE
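
To check that the escaped form loses nothing, it can be decoded back to the original text. The following is a minimal sketch using only core modules. The helpers escape_utf8 and unescape_utf8 are hypothetical names written for this illustration and are not part of the URI distribution; the sample string is likewise an assumption.

```perl
use strict;
use warnings;
use utf8;                        # this source file contains literal UTF-8 text
use Encode qw(encode decode);

# Hypothetical helper: encode characters as UTF-8 bytes, then percent-escape
# every byte that is not in the RFC 3986 "unreserved" set.
sub escape_utf8 {
    my ($text) = @_;
    my $bytes = encode('UTF-8', $text);
    $bytes =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $bytes;
}

# Hypothetical helper: turn %XX sequences back into bytes, then decode the
# bytes as UTF-8 to recover the original characters.
sub unescape_utf8 {
    my ($escaped) = @_;
    (my $bytes = $escaped) =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return decode('UTF-8', $bytes);
}

my $original = '中文';
my $escaped  = escape_utf8($original);
my $restored = unescape_utf8($escaped);

binmode(STDOUT, ':encoding(UTF-8)');
print "$escaped\n";              # %E4%B8%AD%E6%96%87
print $restored eq $original ? "round trip ok\n" : "round trip failed\n";
```

In real code, the URI::Escape module's uri_escape_utf8 and uri_unescape functions cover the same ground; the hand-rolled version above is only meant to show what the escaping does to each byte.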


    perl -e '$,=$",$_=(split/\W/,$^X)[y[eval]]]+--$_],print+just,another,split,hack'er
Re: problem with URI and utf8
by jesuashok (Curate) on Dec 04, 2006 at 09:44 UTC
    moshkod

    Could you provide details of your OS?
    Which version of Perl are you using?
    Older versions of Perl support "high characters", but not utf8. ( ref_1 )
    UTF-8 uses 3 bytes for most Chinese characters. ( ref_2 )
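
The difference between characters and bytes can be seen directly. This is a small sketch with core modules only; the sample character is an assumption chosen for illustration.

```perl
use strict;
use warnings;
use utf8;                        # this source file contains literal UTF-8 text
use Encode qw(encode);

my $char  = '中';                      # one Chinese character (U+4E2D)
my $bytes = encode('UTF-8', $char);    # its UTF-8 byte representation

my $char_count = length $char;         # counts characters
my $byte_count = length $bytes;        # counts bytes
my $hex        = uc unpack('H*', $bytes);

print "characters: $char_count, bytes: $byte_count, hex: $hex\n";
# characters: 1, bytes: 3, hex: E4B8AD
```

Each of those three bytes becomes one %XX triplet once the value is placed in a URI, which is why a single Chinese character typically shows up as nine ASCII characters in the escaped string.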

    "Keep pouring your ideas"