in reply to CPAN's URI.pm versus Japanese as Unicode?
I see two problems here. First, your source file is not declared as UTF-8 with use utf8;, which means that my $href="https://マリウス.com/"; actually gives you the byte string "https://\343\203\236\343\203\252\343\202\246\343\202\271.com/" rather than a string of Unicode characters. Second, URI encodes the host name with Punycode, which IMHO is a correct approach: the URI documentation states that it works with URIs as per RFC 2396 and RFC 2732, which I think only support US-ASCII.
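To make the first point concrete, here is a minimal sketch using only core Perl. It assumes the script itself is saved as UTF-8; the variable names are just for illustration. The same literal is read once without and once with the utf8 pragma, which is lexically scoped:

```perl
use strict;
use warnings;

my ($bytes, $chars);
{
    no utf8;    # the default: the literal is read as raw UTF-8 bytes
    $bytes = "マリウス";
}
{
    use utf8;   # the literal is decoded into Unicode characters
    $chars = "マリウス";
}

# Each of these katakana takes three bytes in UTF-8.
printf "without use utf8: length %d\n", length $bytes;   # 12
printf "with use utf8:    length %d\n", length $chars;   # 4

# Code points of the decoded string (ASCII-only output).
my $codepoints = join ' ', map { sprintf 'U+%04X', ord } split //, $chars;
print "$codepoints\n";   # U+30DE U+30EA U+30A6 U+30B9
```

Without the pragma, URI is handed twelve bytes that merely look like four characters in your editor.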
If you add use utf8;, you get the output =xn--gckvb8fzb.com, which is the correct Punycode domain name of "マリウス.com" ("\x{30de}\x{30ea}\x{30a6}\x{30b9}.com").
What is unclear to me is what your goal is. Why do you (think you) need a URI object with Unicode characters in it?
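A short sketch of that, assuming the URI distribution from CPAN is installed and the script is saved as UTF-8. The host, ihost and as_iri methods are from URI's documentation; the variable names are mine:

```perl
use strict;
use warnings;
use utf8;                               # source literals are Unicode characters
use URI;

binmode STDOUT, ':encoding(UTF-8)';     # so the Unicode forms print cleanly

my $unicode_host = "マリウス.com";
my $uri = URI->new("https://$unicode_host/");

my $ascii_host = $uri->host;            # Punycode (ASCII) form of the host;
                                        # xn--gckvb8fzb.com according to the output above
my $ihost      = $uri->ihost;           # host decoded back to Unicode
my $iri        = $uri->as_iri;          # whole URI with Unicode restored

print "host:   $ascii_host\n";
print "ihost:  $ihost\n";
print "as_iri: $iri\n";
```

If all you want is the Unicode form for display, ihost or as_iri may already be enough, while the object itself stays ASCII-only internally.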