Re^2: HTTP::Request::Common::POST and UTF-8

Won't that escape the data twice? Without actually running it, it looks like
"\x{1234}"
would be transformed by uri_escape_utf8 into
"%E1%88%B4"
which would then be transformed by POST into
"%25E1%2588%25B4"
while the right answer would be
"%E1%88%B4"

What he actually needs is

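use Encode qw(encode);
use HTTP::Request::Common qw(POST);

# $utf8_data holds decoded characters; encode() turns them into UTF-8 bytes.
my $request = POST(
   "http://localhost/test",
   Content => [
      data      => encode("UTF-8", $utf8_data),
      more_data => "some more data",
   ]
);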

The core problem is that the url-encoded format didn't anticipate data in character sets other than US-ASCII. There is a de facto standard: encode the string as UTF-8, then percent-escape the resulting bytes exactly as you would escape US-ASCII bytes. The code above does only the first step, converting the string to UTF-8 bytes; POST's guts then do the escaping, so the data is escaped exactly once.
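To make the two steps concrete, here is a minimal sketch that uses URI::Escape directly. It assumes that calling uri_escape on the encoded bytes is a fair stand-in for the escaping POST performs internally (the variable names are made up for illustration), and it also shows what escaping twice does and how the receiving side can reverse the process.

```perl
use strict;
use warnings;
use Encode qw(encode decode);
use URI::Escape qw(uri_escape uri_escape_utf8 uri_unescape);

my $chars = "\x{1234}";                    # one character, U+1234

# Step 1: characters -> UTF-8 bytes (E1 88 B4).
my $bytes = encode("UTF-8", $chars);

# Step 2: percent-escape each byte. Assumption: this approximates
# the escaping that POST applies to form fields.
my $escaped = uri_escape($bytes);          # "%E1%88%B4"

# uri_escape_utf8 performs both steps in one call.
my $one_step = uri_escape_utf8($chars);    # "%E1%88%B4"

# Escaping an already-escaped string also escapes the percent signs.
my $twice = uri_escape($one_step);         # "%25E1%2588%25B4"

# Receiving side: unescape to bytes, then decode the bytes as UTF-8.
my $round_trip = decode("UTF-8", uri_unescape($escaped));

print "$escaped\n$twice\n";
print $round_trip eq $chars ? "round trip ok\n" : "round trip failed\n";
```

The point of the sketch is that exactly one layer should do the percent-escaping: either you escape by hand and send the string as-is, or you hand bytes to POST and let it escape them.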

Re^3: HTTP::Request::Common::POST and UTF-8 by scollyer (Sexton) on Sep 28, 2005 at 18:48 UTC
> Won't that escape data twice?

Yup, just discovered that. Your solution appears to work correctly, with the corresponding unescaping being:

`decode("UTF-8", uri_unescape($req_string))`

Thanks for this. I think I'll go and hit myself with a stick now. It'll be less pain than doing UTF-8 in Perl ...

Steve Collyer