mldvx4 has asked for the wisdom of the Perl Monks concerning the following question:
Greetings. The following code should show the output "=マリウス" but shows "=xn--caaba8k0b0a7jzpccc.com" instead.
    #!/usr/bin/perl
    use utf8;
    use URI;
    use Encode;
    use strict;
    use warnings;

    my $href="https://\x{30de}\x{30ea}\x{30a6}\x{30b9}.com/";
    print $href,"\n";

    my $uri = URI->new($href);
    my $domain = $uri->host;
    print ":",$domain,"\n";

    $domain = Encode::decode('utf-8', $domain);
    print "=",$domain,"\n";

    $domain = Encode::encode('utf-8', $domain);
    print ".",$domain,"\n";

    exit(0);
What is a good way to get the variable $domain to contain "マリウス" as UTF-8? I've tried Encode::encode and Encode::decode in several permutations but that is probably not the right way. Is there some way to wrap the URI function in such a way as to have it process Unicode?
ps. This web form has trouble with the Japanese as well and has converted the string to a bunch of HTML entities.
Edit: added use utf8; and redid $href definition.
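For readers following along, here is a minimal sketch of one possible approach. It is not taken from the replies in this thread. It assumes an installed URI release that provides the `ihost` method. `host` returns the IDNA (Punycode, "xn--...") form of the name, which is plain ASCII, so passing it through `Encode::decode` leaves it unchanged. `ihost` converts the host back to Unicode characters, and `Encode::encode` then produces UTF-8 bytes if those are needed. The variable names are illustrative only.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;
use URI;
use Encode ();

# Encode printed character strings as UTF-8 to avoid "Wide character" warnings.
binmode(STDOUT, ':encoding(UTF-8)');

my $href = "https://\x{30de}\x{30ea}\x{30a6}\x{30b9}.com/";
my $uri  = URI->new($href);

# host() gives the ASCII-compatible (Punycode) form of the host name.
my $ascii_host = $uri->host;

# ihost() gives the same host as a Perl character string (Unicode).
# Assumption: the installed URI version is recent enough to have ihost().
my $unicode_host = $uri->ihost;

# If UTF-8 encoded bytes are needed (e.g. for a raw filehandle), encode explicitly.
my $utf8_bytes = Encode::encode('UTF-8', $unicode_host);

print "ascii:   $ascii_host\n";
print "unicode: $unicode_host\n";
```

Note that `ihost` returns the whole host, including the ".com" suffix. Keeping the value as a character string and encoding only at the output boundary (here via `binmode`) avoids double encoding.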
Replies are listed 'Best First'.

Re: CPAN's URI.pm versus Japanese as Unicode?
by 1nickt (Canon) on Dec 11, 2022 at 13:06 UTC
by mldvx4 (Hermit) on Dec 12, 2022 at 12:49 UTC

Re: CPAN's URI.pm versus Japanse as Unicode?
by haukex (Archbishop) on Dec 11, 2022 at 10:15 UTC
by mldvx4 (Hermit) on Dec 11, 2022 at 12:21 UTC
by haukex (Archbishop) on Dec 11, 2022 at 12:50 UTC

Re: CPAN's URI.pm versus Japanse as Unicode?
by Corion (Patriarch) on Dec 11, 2022 at 10:12 UTC
by mldvx4 (Hermit) on Dec 12, 2022 at 07:55 UTC