jonjacobmoon has asked for the wisdom of the Perl Monks concerning the following question:
This one seems so obvious that there must be a solution, but I can't find it. Could there be a bug in URI?
The code below works on this URL only if I add a trailing slash. The problem is that this is a program run on URLs that may or may not have a trailing slash. I have worked around it with some regexes that add the slash at the end if it is not there, but I am wondering why URI is not smart enough to figure this out on its own.
#!/usr/bin/perl -w
use strict;
use LWP::UserAgent;
use HTML::Parser;
use URI;

my $starturl = shift || die "No url supplied\n"; #"http://www.strathaven.s-lanark.sch.uk/pages/ring.htm"; #
my $baseuri = URI->new($starturl);
my $cururi;
my $url;
my @urls;
push @urls, $starturl;

my $agent = new LWP::UserAgent;
my $parser = HTML::Parser->new(api_version => 3,
                               start_h     => [\&start, "tagname, attr"]);
$agent->agent("Jonzilla/666");

while ($url = shift @urls) {
    my $request = new HTTP::Request 'GET' => $url;
    my $result  = $agent->request($request);
    if ($result->is_success) {
        print "URL: $url\n";
        #print $result->as_string;
        $parser->parse($result->content);
    } else {
        print "Error: " . $result->status_line . " URL=$url, $baseuri\n";
    }
}

sub start {
    my ($tag, $attr) = @_;
    if ($tag eq 'frame') {
        my $thisuri = URI->new($attr->{src});
        push @urls, $thisuri->abs($cururi);
    }
}
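For what it's worth, here is a minimal sketch of how URI resolves a relative frame src against a base with and without a trailing slash. The example.com URLs and the helper name resolve_frame are made up for illustration; they are not part of the program above. Under the standard relative-URL resolution rules, everything after the last slash in the base path is treated as a file name and dropped, so URI has no way of knowing that a final "pages" segment was meant as a directory.

```perl
use strict;
use warnings;
use URI;

# Hypothetical helper: resolve a relative frame src against a base URL
# and return the absolute URL as a plain string.
sub resolve_frame {
    my ($src, $base) = @_;
    return URI->new_abs($src, $base)->as_string;
}

# Hypothetical URLs, chosen only to show the effect of the trailing slash.
for my $base ('http://www.example.com/pages', 'http://www.example.com/pages/') {
    print "$base => ", resolve_frame('menu.htm', $base), "\n";
}
# http://www.example.com/pages  => http://www.example.com/menu.htm
# http://www.example.com/pages/ => http://www.example.com/pages/menu.htm
```

Note also that in the listing above $cururi is declared but never assigned, so whatever base is passed to abs needs to be a defined URL (for example the URL of the page currently being parsed) before the resolution can work at all.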
Re: Lack of Trailing Slash Confuses URI
by blokhead (Monsignor) on Sep 21, 2002 at 17:01 UTC
by jonjacobmoon (Pilgrim) on Sep 21, 2002 at 17:22 UTC