in reply to Lack of Trailing Slash Confuses URI
Why does it do this? Because index.html in the /dvt/ directory may have relative links (it does in this example). If the page links to the relative URI "top.htm", but the browser thinks it is looking at a URI of /dvt, it treats "dvt" as a file name in the root directory. Following the link will then try to load /top.htm, and not /dvt/top.htm as we would expect.
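To see the difference the trailing slash makes, here is a minimal sketch using the URI module's new_abs constructor. The resolve helper is just a name made up for this illustration, not something from your script:

```perl
use strict;
use warnings;
use URI;

# Hypothetical helper: resolve a relative link against a base URI
# and return the absolute URI as a string.
sub resolve {
    my ($rel, $base) = @_;
    return URI->new_abs($rel, $base)->as_string;
}

# Base without the trailing slash: "dvt" is treated as a file name,
# so the relative link replaces it.
my $without_slash = resolve('top.htm', 'http://users.pandora.be/dvt');

# Base with the trailing slash: "dvt/" is treated as a directory,
# so the relative link is appended to it.
my $with_slash = resolve('top.htm', 'http://users.pandora.be/dvt/');

print "$without_slash\n";   # http://users.pandora.be/top.htm
print "$with_slash\n";      # http://users.pandora.be/dvt/top.htm
```

The same resolution rule applies whichever base you pass in, which is why the base has to be the URL the server actually answered for, not the one you originally asked for.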
When I run your script, it complains in line 48 about lack of arguments to $thisuri->abs($cururi). You have never set $cururi in your code! I modified things a bit and came up with something that works on the URL you give:

    if ($result->is_success) {
        $cururi = $result->base->as_string;
        print "URL: $url ($cururi)\n";
        # ... the rest of the block stays as in your script

$result->base returns the base URL of the HTTP response. When you request /dvt, you get redirected to /dvt/, so the base URL differs from the URL you asked for. You must use this value as the base URL of your relative URIs. If you try to set $cururi = $url;, you'll get 404 errors during your recursion, because it will try to access /top.htm and so on, not /dvt/top.htm. After modifying these lines, I get this output:

    $ perl lwp-paco.pl http://users.pandora.be/dvt
    URL: http://users.pandora.be/dvt (http://users.pandora.be/dvt/)
    URL: http://users.pandora.be/dvt/top.htm (http://users.pandora.be/dvt/top.htm)
    URL: http://users.pandora.be/dvt/top.htm (http://users.pandora.be/dvt/top.htm)
    URL: http://users.pandora.be/dvt/tree.htm (http://users.pandora.be/dvt/tree.htm)
    URL: http://users.pandora.be/dvt/start.htm (http://users.pandora.be/dvt/start.htm)

Notice the first URL: line, which has a different request URL than response base URL (in parentheses).
blokhead
Re: Re: Lack of Trailing Slash Confuses URI
by jonjacobmoon (Pilgrim) on Sep 21, 2002 at 17:22 UTC