in reply to WWW::Mechanize doesn't respect <base>?

You're correct, haj, that things should work even if 'base' is broken. So I did some more research, and here are my results:

use WWW::Mechanize::Link;
$u = WWW::Mechanize::Link->new ({url=>'./page2', base=>'http://domain.com/page/'});
print $u->url_abs, "\n";
$u = WWW::Mechanize::Link->new ({url=>'../page2', base=>'http://domain.com/page/'});
print $u->url_abs, "\n";
$u = WWW::Mechanize::Link->new ({url=>'./page2', base=>'http://domain.com/page'});
print $u->url_abs, "\n";
$u = WWW::Mechanize::Link->new ({url=>'../page2', base=>'http://domain.com/page'});
print $u->url_abs, "\n";

The output is:

http://domain.com/page/page2
http://domain.com/page2
http://domain.com/page2
http://domain.com/../page2

As you can see, for links that start with ./ the base MUST NOT end with /, while for links that start with ../ the base MUST end with /. So, whether or not the <base> is honored, some links will be broken. Any cure?

Re^2: WWW::Mechanize doesn't respect <base>?
by haj (Vicar) on May 04, 2021 at 11:23 UTC

    What you are showing here is just resolution of relative URLs. Section 5.2 of RFC 3986 has the gory details. The only difference is the fourth example, where the result should be http://domain.com/page2. Note that a trailing slash is significant in a URL, but whether there's more stuff after the rightmost slash is not, as far as the URL's purpose as a base URL is concerned.
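
    To make the RFC rules concrete, here is a minimal sketch of the "merge" step (section 5.2.3) and the "remove_dot_segments" step (section 5.2.4), written from my reading of the RFC. It is not how WWW::Mechanize or URI implement it. The helper name resolve_relative is made up for this illustration, and the sketch assumes a base of the form scheme://host/path and a reference without scheme, authority, query or fragment.

```perl
use strict;
use warnings;

# RFC 3986, section 5.2.4: strip "." and ".." segments from a path.
sub remove_dot_segments {
    my ($in) = @_;
    my $out = '';
    while (length $in) {
        if ($in =~ s{^\.\.?/}{}) {
            # A: a leading "../" or "./" is simply dropped
        }
        elsif ($in =~ s{^/\.(?:/|\z)}{/}) {
            # B: "/./" or a final "/." collapses to "/"
        }
        elsif ($in =~ s{^/\.\.(?:/|\z)}{/}) {
            # C: "/../" or a final "/.." removes the last output segment
            $out =~ s{/?[^/]*\z}{};
        }
        elsif ($in =~ s{^\.\.?\z}{}) {
            # D: a lone "." or ".." is dropped
        }
        elsif ($in =~ s{^(/?[^/]*)}{}) {
            # E: move the first segment to the output
            $out .= $1;
        }
    }
    return $out;
}

# Hypothetical helper: resolve a path-only reference against a base URL.
sub resolve_relative {
    my ($base, $ref) = @_;
    my ($prefix, $path) = $base =~ m{^(\w+://[^/]*)(/.*)?\z}
        or die "unsupported base URL: $base";
    $path = '/' unless defined $path;    # empty base path merges as "/"

    my $merged;
    if ($ref =~ m{^/}) {
        $merged = $ref;                  # absolute path: base path is ignored
    }
    else {
        # Section 5.2.3: keep the base path up to and including its
        # rightmost "/", then append the reference.
        (my $dir = $path) =~ s{[^/]*\z}{};
        $merged = $dir . $ref;
    }
    return $prefix . remove_dot_segments($merged);
}

for my $case (
    [ 'http://domain.com/page/', './page2'  ],
    [ 'http://domain.com/page/', '../page2' ],
    [ 'http://domain.com/page',  './page2'  ],
    [ 'http://domain.com/page',  '../page2' ],
) {
    print resolve_relative(@$case), "\n";
}
```

    With the four base/link pairs from your post this prints http://domain.com/page/page2 once and then http://domain.com/page2 three times: in the fourth case the ".." has nothing left to remove, so it is discarded instead of surviving into the result.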

    Could you please show what you are expecting?