in reply to Re^4: Crawling Relative Links from Webpages
in thread Crawling Relative Links from Webpages
I know. But how do you get the bit the OP needs? Not like this:
perl -MURI -E"$u=new URI('http://dspace.mit.edu/handle/1721.1/53720'); say $u->host"
dspace.mit.edu
Nor any of these:
c:\test>perl -MURI -E"my $u=new URI('http://dspace.mit.edu/handle/1721.1/53720'); say $u->authority"
dspace.mit.edu
c:\test>perl -MURI -E"my $u=new URI('http://dspace.mit.edu/handle/1721.1/53720'); say $u->path"
/handle/1721.1/53720
c:\test>perl -MURI -E"my $u=new URI('http://dspace.mit.edu/handle/1721.1/53720'); say $u->fragment"

c:\test>perl -MURI -E"my $u=new URI('http://dspace.mit.edu/handle/1721.1/53720'); say $u->opaque"
//dspace.mit.edu/handle/1721.1/53720
c:\test>perl -MURI -E"my $u=new URI('http://dspace.mit.edu/handle/1721.1/53720'); say $u->canonical"
http://dspace.mit.edu/handle/1721.1/53720
Re^6: Crawling Relative Links from Webpages
by Anonymous Monk on May 08, 2010 at 04:17 UTC