bliako has asked for the wisdom of the Perl Monks concerning the following question:
Esteemed Monks,
I wonder whether it is expected behaviour for URI's url absolution (ok, absolutisation) to still retain relative path components in the final url. For example (using perl 5.22, URI v1.73, LWP::UserAgent v6.33):
    #!/usr/bin/env perl
    use strict;
    use warnings;
    use URI;

    # a relative url:
    my $rel_url = '../../../../../abc.html';
    # the base url, where I stand now:
    my @base_uris = ('http://server.com/123/xyz',
                     'http://server.com/1/2/3/4/5',
                     'http://server.com/1/2/3/4/5/');
    # URI's absolute url:
    foreach my $abase (@base_uris){
        my $uri = URI->new_abs( $rel_url, $abase );
        print "absolute for base: $abase is\n\t".$uri."\n";
    }

yields:

    absolute for base: http://server.com/123/xyz is
        http://server.com/../../../../abc.html
    absolute for base: http://server.com/1/2/3/4/5 is
        http://server.com/../abc.html
    absolute for base: http://server.com/1/2/3/4/5/ is
        http://server.com/abc.html
The last result is correct, but I wonder whether, for the first two cases, URI should have used some heuristics to remove the leftover '..' components.
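For reference, here is a minimal sketch of what such '..' removal could look like. It loosely follows the "remove dot segments" step of RFC 3986 (section 5.2.4), under which '..' segments that would climb above the root are simply dropped. `remove_dot_segments()` is a hypothetical helper written for illustration, not a method of URI, and this version only handles absolute paths:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of URI): collapse '.' and '..' segments
# in an absolute path, dropping any '..' that would climb above the root.
sub remove_dot_segments {
    my ($path) = @_;
    return $path unless $path =~ m{^/};   # sketch: leave non-absolute paths alone

    my @in = split m{/}, $path, -1;       # -1 keeps trailing empty fields
    shift @in;                            # drop the empty field before the leading '/'

    my @out;
    my $trailing_slash = 0;
    for my $seg (@in) {
        if ($seg eq '.') {
            $trailing_slash = 1;          # "/a/." normalises to "/a/"
        }
        elsif ($seg eq '..') {
            pop @out;                     # a no-op when already at the root
            $trailing_slash = 1;          # "/a/b/.." normalises to "/a/"
        }
        else {
            push @out, $seg;
            $trailing_slash = 0;
        }
    }

    my $clean = '/' . join('/', @out);
    $clean .= '/' if $trailing_slash && $clean !~ m{/$};
    return $clean;
}

# The three paths from the output above all reduce to /abc.html
for my $path ('/../../../../abc.html', '/../abc.html', '/abc.html') {
    print "$path -> ", remove_dot_segments($path), "\n";
}
```

With URI installed, the cleaned path could presumably be written back with `$uri->path( remove_dot_segments( $uri->path ) )`, but that combination is untested here.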
The reason is that recently I had a brief encounter with LWP::UserAgent (UA) and, subsequently, URI (described in detail here http://perlmonks.org/?node_id=1210570):
In summary, on receiving a "302 Found" server response, UA would by default follow the redirect by extracting the 'Location' item from the server's response headers. However, it was a twisted server. As a result it sent the 'Location' to follow as a relative url. Something similar to '$rel_url' in my example.
UA then proceeded to absolutise the received url (based on the initial request url) using URI, in order to follow it. Here is the relevant extract from LWP::UserAgent.pm (sub request()):
    # Some servers erroneously return a relative URL for redirects,
    # so make it absolute if it not already is.
    local $URI::ABS_ALLOW_RELATIVE_SCHEME = 1;
    my $base = $response->base;
    $referral_uri = "" unless defined $referral_uri;
    $referral_uri = $HTTP::URI_CLASS->new($referral_uri, $base)->abs($base);
The result is YET ANOTHER relative url in $referral_uri, which UA then requests, and an error ensues with a server 500 response (relative url forbidden). This may puzzle someone and cause long debugging.
So, my questions/requests are:
Should URI be using heuristics to absolutise urls?
If not, should LWP::UserAgent be using its own heuristics once URI returns a pseudo-absolute url? Or maybe LWP::UserAgent should die before making the follow-up request.
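As a sketch of the second option (dying before the follow-up request), a caller could check the absolutised url for leftover '.' or '..' path segments. Both `has_dot_segments()` and `assert_followable()` are hypothetical helper names, not part of LWP::UserAgent or URI, and the path is extracted with a simplified form of the regex from RFC 3986 Appendix B rather than with URI itself:

```perl
use strict;
use warnings;

# Hypothetical helper: count '.' and '..' segments in the path of a url.
# The query string and fragment are deliberately ignored.
sub has_dot_segments {
    my ($url) = @_;
    # optional scheme, optional //authority, then the path (the only capture)
    my ($path) = $url =~ m{^(?:[^:/?#]+:)?(?://[^/?#]*)?([^?#]*)};
    my $count = grep { $_ eq '.' || $_ eq '..' } split(m{/}, $path, -1);
    return $count;
}

# Hypothetical guard: die instead of requesting a pseudo-absolute url.
sub assert_followable {
    my ($url) = @_;
    die "refusing to follow '$url': path still contains relative segments\n"
        if has_dot_segments($url);
    return $url;
}

for my $url ('http://server.com/../abc.html', 'http://server.com/abc.html') {
    my $ok = eval { assert_followable($url); 1 };
    print $ok ? "ok:      $url\n" : "refused: $@";
}
```

Whether to die or to clean the path silently is a design choice. Dying at least puts the offending url in the error message, instead of leaving only the server's 500 response to debug.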
bliako
Re: URI: making absolute urls
by choroba (Cardinal) on Mar 29, 2018 at 13:59 UTC
by bliako (Abbot) on Mar 30, 2018 at 18:34 UTC

Re: URI: making absolute urls
by ikegami (Patriarch) on Mar 29, 2018 at 12:52 UTC
by bliako (Abbot) on Mar 29, 2018 at 13:08 UTC