In general, if you request a URI that points at a directory but leave off the trailing slash, the server redirects you to the same URI with the trailing slash added. When I type the users.pandora.be/dvt URL into my browser, it changes to /dvt/ with a trailing slash.

Why does it do this? Because index.html in the /dvt/ directory may have relative links (it does in this example). A relative link is resolved by replacing the last path segment of the current URI. So if the page links to the relative URI "top.htm" but the browser thinks it is looking at /dvt, it will try to load /top.htm when the link is followed, and not /dvt/top.htm as we would expect.
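Here is a small sketch of that resolution rule using the URI module's new_abs constructor. The resolve() helper is just a name I made up for illustration; it is not part of your script.

```perl
use strict;
use warnings;
use URI;

# Hypothetical helper: resolve a relative link against a base URI,
# the same way a browser does when you follow a link.
sub resolve {
    my ($link, $base) = @_;
    return URI->new_abs($link, $base)->as_string;
}

# Base without the trailing slash: "dvt" is treated as a file name
# and is replaced by the link.
print resolve('top.htm', 'http://users.pandora.be/dvt'), "\n";
# prints http://users.pandora.be/top.htm

# Base with the trailing slash: "dvt/" is a directory, so the link
# is appended inside it.
print resolve('top.htm', 'http://users.pandora.be/dvt/'), "\n";
# prints http://users.pandora.be/dvt/top.htm
```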

When I run your script, it complains on line 48 about a lack of arguments to $thisuri->abs($cururi). You never set $cururi anywhere in your code! I modified things a bit and came up with something that works on the URL you gave:

    if ($result->is_success) {
        $cururi = $result->base->as_string;
        print "URL: $url ($cururi)\n";
$result->base returns the base URL of the HTTP response. When you request /dvt, you get redirected to /dvt/, so the base of the response is /dvt/ even though the URL you asked for was /dvt. You must use this value as the base URL of your relative URIs. If you set $cururi = $url; instead, you'll get 404 errors during your recursion, because the script will try to access /top.htm, etc., and not /dvt/top.htm. After modifying these lines, I get this output:
    $ perl lwp-paco.pl http://users.pandora.be/dvt
    URL: http://users.pandora.be/dvt (http://users.pandora.be/dvt/)
    URL: http://users.pandora.be/dvt/top.htm (http://users.pandora.be/dvt/top.htm)
    URL: http://users.pandora.be/dvt/top.htm (http://users.pandora.be/dvt/top.htm)
    URL: http://users.pandora.be/dvt/tree.htm (http://users.pandora.be/dvt/tree.htm)
    URL: http://users.pandora.be/dvt/start.htm (http://users.pandora.be/dvt/start.htm)
Notice the first URL: line, which has a different request URL than response base URL (in parentheses).
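If you want to see the effect of $result->base without hitting the network, something like the following sketch should do. It is not your script: the response object is built by hand to stand in for what LWP hands back after following the redirect, and absolute_links() is a hypothetical helper name.

```perl
use strict;
use warnings;
use HTTP::Request;
use HTTP::Response;
use URI;

# Assumption: this hand-built response stands in for the one LWP returns
# after the /dvt -> /dvt/ redirect. Its request is the final, redirected one.
my $result = HTTP::Response->new(200, 'OK');
$result->request(HTTP::Request->new(GET => 'http://users.pandora.be/dvt/'));

# Hypothetical helper: turn the relative links found in a page into
# absolute URLs, using the response's base rather than the URL we asked for.
sub absolute_links {
    my ($response, @links) = @_;
    my $base = $response->base;
    return map { URI->new($_)->abs($base)->as_string } @links;
}

my @abs = absolute_links($result, 'top.htm', 'tree.htm', 'start.htm');
print "$_\n" for @abs;
```

With no Content-Base, Content-Location or Base header on the response, base falls back to the URI of the request that produced it, which is why the redirected /dvt/ ends up as the base here.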

blokhead


In reply to Re: Lack of Trailing Slash Confuses URI by blokhead
in thread Lack of Trailing Slash Confuses URI by jonjacobmoon
