Why does it do this? Because index.html in the /dvt/ directory may contain relative links (it does in this example). If the page links to the relative URI "top.htm" but the browser believes it is looking at /dvt, following the link loads /top.htm, not /dvt/top.htm as we would expect.
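To see the effect of the trailing slash in isolation, here is a minimal sketch, assuming the URI module from CPAN is installed. The variable names are made up for illustration; only the host and paths come from this thread.

```perl
use strict;
use warnings;
use URI;

# The same relative link, resolved against the two candidate base URLs.
my $rel = 'top.htm';

# Without the trailing slash, "dvt" is treated as a file name and is
# replaced by the relative link.
my $without_slash = URI->new_abs($rel, 'http://users.pandora.be/dvt');

# With the trailing slash, "dvt/" is a directory and the link is
# appended to it.
my $with_slash = URI->new_abs($rel, 'http://users.pandora.be/dvt/');

print "$without_slash\n";   # http://users.pandora.be/top.htm
print "$with_slash\n";      # http://users.pandora.be/dvt/top.htm
```

This is just standard relative-reference resolution: everything after the last slash of the base path is dropped before the relative link is merged in.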
When I run your script, it complains on line 48 about a missing argument to $thisuri->abs($cururi). You never set $cururi in your code! I modified things a bit and came up with something that works on the URL you give:

    if ($result->is_success) {
        $cururi = $result->base->as_string;
        print "URL: $url ($cururi)\n";

$result->base returns the base URL of the HTTP response, which matters here because a request for /dvt gets redirected to /dvt/. You must use this value as the base URL for your relative URIs. If you instead set $cururi = $url;, you'll get 404 errors during your recursion, because it will try to access /top.htm and so on instead of /dvt/top.htm. After modifying these lines, I get this output:

    $ perl lwp-paco.pl http://users.pandora.be/dvt
    URL: http://users.pandora.be/dvt (http://users.pandora.be/dvt/)
    URL: http://users.pandora.be/dvt/top.htm (http://users.pandora.be/dvt/top.htm)
    URL: http://users.pandora.be/dvt/top.htm (http://users.pandora.be/dvt/top.htm)
    URL: http://users.pandora.be/dvt/tree.htm (http://users.pandora.be/dvt/tree.htm)
    URL: http://users.pandora.be/dvt/start.htm (http://users.pandora.be/dvt/start.htm)

Notice the first URL: line, where the request URL differs from the response base URL (in parentheses).
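For anyone who wants to try the base-URL step without hitting the network, here is a rough sketch. It assumes HTTP::Response and URI are installed, and it fakes the post-redirect response by hand rather than fetching it with LWP. The @links list is hypothetical and stands in for whatever your link extractor returns.

```perl
use strict;
use warnings;
use HTTP::Request;
use HTTP::Response;
use URI;

# Stand-in for the response LWP returns after following the redirect
# from /dvt to /dvt/: the attached request carries the final URL.
my $result = HTTP::Response->new(200, 'OK');
$result->request(HTTP::Request->new(GET => 'http://users.pandora.be/dvt/'));

# With no Content-Base or Base header present, base() falls back to
# the URI of the request attached to the response.
my $cururi = $result->base->as_string;

# Hypothetical relative links pulled out of the page.
my @links = ('top.htm', 'tree.htm', 'start.htm');

# Resolve each link against the response base, not the original URL.
my @absolute = map { URI->new_abs($_, $cururi)->as_string } @links;

print "$_\n" for @absolute;
```

In the real script the response comes from the user agent, so the only line you need is the $result->base one; the rest is scaffolding for the demonstration.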
blokhead