LanX has asked for the wisdom of the Perl Monks concerning the following question:

UPDATE: Never mind. I couldn't reproduce the problem at home; uri_escape_utf8() works as documented. It seems to be due to problems at work.

Hi

I'm trying to process Unicode strings the same way in JS and Perl.

According to the URI::Escape documentation:

Note: JavaScript has a function called escape() that produces the sequence "%uXXXX" for chars in the 256 .. 65535 range. This function has really nothing to do with URI escaping but some folks got confused since it "does the right thing" in the 0 .. 255 range. Because of this you sometimes see "URIs" with these kind of escapes. The JavaScript encodeURIComponent() function is similar to uri_escape_utf8().
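To make the quoted distinction concrete, here is a minimal JavaScript sketch comparing the legacy escape() function with encodeURIComponent(). The variable names are only illustrative, and the strings are written with \u escapes so the snippet does not depend on the file's encoding.

```javascript
// U+00E9 (e with acute accent) lies in the 0..255 range,
// U+30C0 (the katakana "da") lies above it.
const latin = "\u00E9";
const kana = "\u30C0";

// Legacy escape(): one %XX per code unit up to 255, %uXXXX above that.
const escapedLatin = escape(latin);
const escapedKana = escape(kana);

// encodeURIComponent(): percent-encodes the UTF-8 bytes of the string.
const encodedLatin = encodeURIComponent(latin);
const encodedKana = encodeURIComponent(kana);

console.log(escapedLatin, escapedKana); // %E9 %u30C0
console.log(encodedLatin, encodedKana); // %C3%A9 %E3%83%80
```

The two functions already disagree for U+00E9, so escape() output should not be expected to match what either Perl function produces for non-ASCII text.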

But my tests suggest this might be wrong: encodeURIComponent() produces the same output as uri_escape() instead.

JS:

[19:05:09.782] encodeURIComponent("ダニエル")
[19:05:09.789] "%E3%83%80%E3%83%8B%E3%82%A8%E3%83%AB"

Perl:

DB<48> p uri_escape("&#12480;&#12491;&#12456;&#12523;")
%E3%83%80%E3%83%8B%E3%82%A8%E3%83%AB
DB<49> p uri_escape_utf8("&#12480;&#12491;&#12456;&#12523;")
%C3%A3%C2%83%C2%80%C3%A3%C2%83%C2%8B%C3%A3%C2%82%C2%A8%C3%A3%C2%83%C2%AB
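The longer uri_escape_utf8() result has the byte pattern of UTF-8 that was encoded twice. The JavaScript sketch below reproduces both outputs under one assumption, which the thread does not confirm: that the string reaching the Perl functions held raw UTF-8 bytes rather than decoded characters. The helper percentEncodeBytes is hypothetical and written only for this illustration.

```javascript
// Hypothetical helper: percent-encode every byte as %XX (uppercase hex).
function percentEncodeBytes(bytes) {
  return Array.from(bytes, (b) =>
    "%" + b.toString(16).toUpperCase().padStart(2, "0")
  ).join("");
}

// The four katakana characters from the post (U+30C0 U+30CB U+30A8 U+30EB).
const text = "\u30C0\u30CB\u30A8\u30EB";

// Step 1: the UTF-8 bytes of the text, percent-encoded once.
const utf8 = new TextEncoder().encode(text);
const once = percentEncodeBytes(utf8);

// Step 2 (assumption): treat each UTF-8 byte as a character in the
// 0..255 range, then UTF-8 encode and percent-encode that string again.
const bytesAsChars = String.fromCharCode(...utf8);
const twice = percentEncodeBytes(new TextEncoder().encode(bytesAsChars));

console.log(once);
console.log(twice);
```

Here once equals what encodeURIComponent(text) returns, and twice equals the uri_escape_utf8() output shown above. If that reading applies, the two Perl functions would agree with the documentation once the input is a decoded character string.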

Am I missing something or is that a bug in the documentation?

Cheers Rolf

PS: It seems the monastery doesn't like 3-byte Unicode characters; those characters were from a Japanese page.

Replies are listed 'Best First'.
Re: same UTF8 escaping in JS and Perl?
by Anonymous Monk on May 11, 2012 at 17:35 UTC

    Am I missing something or is that a bug in the documentation?

    you're missing perl code, something with Data::Dump::dd -ed that doesn't rely on html escapes and other