Re: To decode URL-decoded UTF-8 string.
by choroba (Cardinal) on Aug 28, 2018 at 09:57 UTC
|
You need URL::Encode to decode the percent notation into octets, and Encode to turn the octets into Unicode characters:
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };
use open ':encoding(UTF-8)', ':std';
use Encode;
use URL::Encode qw{ url_decode };
my $string = '%d0%be%d0%b1/%d1%81%d1%82%d0%b5%d0%bd';
my $octets = url_decode($string);
my $unicode = decode('UTF-8', $octets);
say $unicode;
($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord
}map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
|
I suppose your solution is the best, but I cannot use it because I cannot find a Devuan package for it. Do you know of one, by the way? I would rather not maintain a copy downloaded from CPAN myself; it is better when it comes from the distro.
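If a distro package is not available, the same two steps (percent notation to octets, then octets to characters) can be sketched with nothing beyond the Encode module that ships with Perl. The helper name `decode_url_utf8` is made up for this example and does not come from any module; treating "+" as a space is an assumption that only fits form-style data.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);    # Encode ships with core Perl

# Hypothetical helper name, not part of any module.
sub decode_url_utf8 {
    my ($s) = @_;
    $s =~ tr/+/ /;                                # assumption: form-style input, "+" means space
    $s =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;    # percent notation -> octets
    return decode('UTF-8', $s);                   # octets -> characters
}

binmode STDOUT, ':encoding(UTF-8)';
print decode_url_utf8('%d0%be%d0%b1/%d1%81%d1%82%d0%b5%d0%bd'), "\n";
```

The "+" translation runs before the percent step so that an encoded %2B survives as a literal plus sign; drop that line if the input is a plain URI path.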
Re: To decode URL-decoded UTF-8 string.
by TheloniusMonk (Sexton) on Aug 28, 2018 at 09:53 UTC
|
You mean you want to decode a URL-encoded string, and the result happens to be UTF-8:
while(<>) {
print &urldecode($_);
}
sub urlencode {
my $s = shift;
$s =~ s/ /+/g;
$s =~ s/([^A-Za-z0-9\+-])/sprintf("%%%02X", ord($1))/seg;
return $s;
}
sub urldecode {
my $s = shift;
# Turn "+" into a space first, so that a decoded %2B stays a literal "+".
$s =~ s/\+/ /g;
$s =~ s/\%([A-Fa-f0-9]{2})/pack('C', hex($1))/seg;
return $s;
}
Produces from your input: об/стен
|
print urlencode("41 + 1 = 42");
# 41+++1+%3D+42 (the literal "+" is not escaped, so it decodes back to a space)
use URI::Escape;
print uri_escape("41 + 1 = 42");
# 41%20%2B%201%20%3D%2042
|
Wow! Awesome! That's exactly what I was looking for! I read the perlfunc entry for the pack function, but thought that I needed the h template rather than C, so my approach failed. Thank you very much, TheloniusMonk!
Re: To decode URL-decoded UTF-8 string.
by Your Mother (Archbishop) on Aug 28, 2018 at 10:04 UTC
|
Same answer as others but since it's a one-liner and I did it before I saw they posted and I'm just slow, I'll add it.
perl -CSD -MEncode -MURI::Encode=uri_decode -le 'print decode("utf-8",uri_decode("%d0%be%d0%b1/%d1%81%d1%82%d0%b5%d0%bd"))'
об/стен
Re: To decode URL-decoded UTF-8 string.
by thanos1983 (Parson) on Aug 28, 2018 at 09:55 UTC
|
Hello nikolay,
Is this working for you?
#!/usr/bin/perl
use strict;
use warnings;
use Encode;
use URI::Escape;
binmode STDOUT, ":utf8";
my $in = "%d0%be%d0%b1/%d1%81%d1%82%d0%b5%d0%bd";
my $text = Encode::decode('utf8', uri_unescape($in));
print $text . "\n";
__END__
$ perl test.pl
об/стен
Update: Some time ago there was a similar question, PDF::API2 printing non ascii characters. Although the title is different, it is worth checking out for background information.
Looking forward to your reply, BR.
Seeking for Perl wisdom...on the process of learning...not there...yet!
|
No, I have already tried that. It only changes the %-escapes into \x-escapes.
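One possible explanation, assuming the string was inspected before the Encode step ran: unescaping alone yields UTF-8 octets, and a dumper or debugger shows those as \x escapes. A small sketch of the difference, using only core modules and with the octets written out by hand:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Octets as they come out of the unescaping step (typed in by hand here).
my $octets = "\xd0\xbe\xd0\xb1/\xd1\x81\xd1\x82\xd0\xb5\xd0\xbd";

# Decoding turns each two-octet UTF-8 sequence into one character.
my $chars = decode('UTF-8', $octets);

printf "%d octets, %d characters\n", length($octets), length($chars);
# 13 octets, 7 characters
```

If the lengths match this pattern, the unescaping worked and only the decode (or the output layer on STDOUT) is missing.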
|
#!/usr/bin/perl
use utf8;
use strict;
use warnings;
use URI::Escape;
use feature 'say';
use Encode qw/ decode /;
binmode STDOUT, ':utf8';
sub decodedUri {
return decode 'UTF-8', uri_unescape( shift );
}
say decodedUri('%d0%be%d0%b1/%d1%81%d1%82%d0%b5%d0%bd');
__END__
$ perl test.pl
об/стен
BR / Thanos
Seeking for Perl wisdom...on the process of learning...not there...yet!
|