Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, I found that URI::query_form and URI::QueryParam always encode a space as + (but URI::Escape doesn't).

use strict;
use warnings;
use URI;
use URI::QueryParam;
use URI::Escape;

my $t1_str = uri_escape('2022-12-18 12:19:57');
print $t1_str;    # this is what I want: space becomes %20
print "\n";

my $u = URI->new("/api/Data", "http");
$u->query_param(t1 => '2022-12-18 12:19:57');
print $u->as_string;    # here the space changes into +

Since I'm new to URI, before I do some dirty work, is there an easy/nice way to do this in a URL? TIA.

Re: How encode space into %20 in URI
by haukex (Archbishop) on Dec 22, 2022 at 09:52 UTC
    is there an easy/nice way to do this in a URL?

    Could you explain to us why you need this? It is perfectly valid for spaces to be escaped as +: it's the standard as per HTML 4, and it's optional in HTML 5.
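
    As a minimal sketch of why both forms are usually interchangeable on the receiving side: the snippet below parses the two query strings from this thread with URI's own query_form() and compares the decoded values. The helper name decode_t1 is made up for illustration, and this only shows how the URI module decodes; a server written with other tools may behave differently.

```perl
use strict;
use warnings;
use URI;

# Hypothetical helper: decode the t1 parameter from a URI string
# using URI's own query parser (query_form in list context).
sub decode_t1 {
    my ($uri_string) = @_;
    my %form = URI->new($uri_string)->query_form;
    return $form{t1};
}

# The same timestamp, once with '+' and once with '%20' for the space.
my $plus  = decode_t1('/api/Data?t1=2022-12-18+12%3A19%3A57');
my $pct20 = decode_t1('/api/Data?t1=2022-12-18%2012%3A19%3A57');

print "plus : [$plus]\n";
print "pct20: [$pct20]\n";
print "both forms decode to the same value\n" if $plus eq $pct20;
```

    If the consumer of the URL decodes query strings the way URI does, either encoding should arrive as the same timestamp, so the choice only matters when the other end insists on %20.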

Re: How encode space into %20 in URI
by kcott (Archbishop) on Dec 22, 2022 at 09:21 UTC

    Firstly, I can reproduce what you're seeing.

    There have been a number of changes to the "URI distribution" recently. These include merging URI::QueryParam methods into URI as well as some changes to uri_escape() — see the Changes file.

    The latest version of URI is 5.17; I've used this in the code below. I don't know if the '%20' vs. '+' issue has always existed, or if it's something that was inadvertently introduced with some change. It would make sense that the escaping mechanism was consistent across modules in the same distribution: I recommend that you raise a bug report.

    The annotated code below shows a reproduction of '%20' vs. '+', two unsuccessful workaround attempts, and two successful workaround attempts.

    #!/usr/bin/env perl

    use strict;
    use warnings;

    use URI 5.17;
    use URI::Escape 5.17;

    print "OS[$^O] perl[$^V]\n";

    # uri_escape() gives %20
    my $str = '2022-12-18 12:19:57';
    print "\$str[$str]\n";
    my $esc_str = uri_escape($str);
    print "\$esc_str[$esc_str]\n";

    # query_param() gives +
    {
        my $u = URI->new('/api/Data', 'http');
        print 'URI init=[', $u->as_string(), "]\n";
        $u->query_param(t1 => '2022-12-18 12:19:57');
        print 'URI raw_param=[', $u->as_string(), "]\n";
    }

    # If escaped string is used, %20 -> %2520
    {
        my $u = URI->new('/api/Data', 'http');
        $u->query_param(t1 => $esc_str);
        print 'URI esc_param=[', $u->as_string(), "]\n";
    }

    # If string escaped in situ, still %20 -> %2520
    {
        my $u = URI->new('/api/Data', 'http');
        $u->query_param(t1 => uri_escape('2022-12-18 12:19:57'));
        print 'URI in_situ_esc_param=[', $u->as_string(), "]\n";
    }

    # You can modify the query string to change + to %20
    {
        my $u = URI->new('/api/Data', 'http');
        $u->query_param(t1 => '2022-12-18 12:19:57');
        my $query = $u->query();
        $query =~ s/\+/%20/g;
        $u->query($query);
        print 'URI long_re_sub_param=[', $u->as_string(), "]\n";
    }

    # Perl 5.14 and later has the /r modifier:
    # use for a more succinct version of the above
    # with identical output.
    {
        my $u = URI->new('/api/Data', 'http');
        $u->query_param(t1 => '2022-12-18 12:19:57');
        $u->query($u->query() =~ s/\+/%20/gr);
        print 'URI rmod_re_sub_param=[', $u->as_string(), "]\n";
    }

    Output:

    OS[cygwin] perl[v5.36.0]
    $str[2022-12-18 12:19:57]
    $esc_str[2022-12-18%2012%3A19%3A57]
    URI init=[/api/Data]
    URI raw_param=[/api/Data?t1=2022-12-18+12%3A19%3A57]
    URI esc_param=[/api/Data?t1=2022-12-18%252012%253A19%253A57]
    URI in_situ_esc_param=[/api/Data?t1=2022-12-18%252012%253A19%253A57]
    URI long_re_sub_param=[/api/Data?t1=2022-12-18%2012%3A19%3A57]
    URI rmod_re_sub_param=[/api/Data?t1=2022-12-18%2012%3A19%3A57]
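
    Another possible approach, sketched here as an assumption rather than something tested in the run above, is to skip query_param() entirely: escape each key and value with uri_escape() and install the finished string with query(). The helper name set_query_pct20 and the extra "note" parameter are made up for illustration; the second parameter is there to check that a literal + in a value survives the round trip.

```perl
use strict;
use warnings;
use URI;
use URI::Escape qw(uri_escape);

# Hypothetical helper: build "k=v&k2=v2" with uri_escape(), so spaces
# become %20, then set it with query() instead of query_form().
sub set_query_pct20 {
    my ($u, @pairs) = @_;
    my @parts;
    while (my ($key, $value) = splice(@pairs, 0, 2)) {
        push @parts, uri_escape($key) . '=' . uri_escape($value);
    }
    $u->query(join '&', @parts);
    return $u;
}

my $u = URI->new('/api/Data', 'http');
set_query_pct20($u, t1 => '2022-12-18 12:19:57', note => 'a+b c');
print $u->as_string, "\n";

# Round trip: parse the query back with URI's own decoder
# and print what a consumer using query_form() would see.
my %form = $u->query_form;
print "$_ => [$form{$_}]\n" for sort keys %form;
```

    This keeps the escaping in one place (uri_escape) instead of patching the output of query_param() afterwards, at the cost of assembling the key=value pairs by hand.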

    — Ken

      I don't know if the '%20' vs. '+' issue has always existed, or if it's something that was inadvertently introduced with some change.

      I personally wouldn't call it an "issue" because AFAIK it's optional whether spaces should be escaped as %20 or +, and it's been that way for a long time. From URI's Changes file:

         2001-01-10   Gisle Aas <gisle@ActiveState.com>
       
         Release 1.10
       
         The $u->query_form method will now escape spaces in
         form keys or values as '+' (instead of '%20').  This also
         affect the $mailto_uri->header() method.  This is actually
         the wrong thing to do, but this practise is now even
         documented in official places like
         http://www.w3.org/TR/html4/interact/forms.html#h-17.13.4.1
         so we might as well follow the stream.
      
        "I personally wouldn't call it an "issue" because AFAIK it's optional whether spaces should be escaped as %20 or +, ..."

        If you look in the paragraph where I used the word "issue", you'll see:

        "It would make sense that the escaping mechanism was consistent across modules in the same distribution"

        I consider the inconsistency to be an issue. I made no reference to %20 or + being better, preferred, more correct, or anything else of that ilk.

        "... and it's been that way for a long time."

        Yes, I know. I was coding such escapes more than two decades ago.

        "From URI's Changes file:"

        I would question the relevance of including that entry from almost 22 years ago; it even references HTML4 which certainly isn't current. All that I'm getting from it is: "We changed %20 to + in some places, even though we believed that to be wrong, and left everything else in an inconsistent state".

        — Ken