in reply to Re: Crawling Relative Links from Webpages
in thread Crawling Relative Links from Webpages
Well, OK. But the point is: how do you determine $url? In this case $url is "http://dspace.mit.edu", and it is not at all obvious from the webpage (looking at the source) how one would know that this is the server. I have a million different kinds of such webpages from different servers, so I need a method that is generic enough to work with all of them.
Any suggestions, anyone?
Andy
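
For what it's worth, here is a rough sketch of one common approach, not necessarily what anyone in this thread had in mind. The base usually does not have to be guessed from the page source: relative links are normally resolved against the URL the page was fetched from, unless the page declares its own `<base href>`. The thread is about Perl, where the URI module's `new_abs` method does this kind of resolution; the sketch below uses Python's standard library purely for illustration. `LinkCollector`, `absolute_links`, the sample HTML, and the paths under dspace.mit.edu are hypothetical, made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect <a href> values and the first <base href>, if any."""

    def __init__(self):
        super().__init__()
        self.base_href = None
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href")
        if href is None:
            return
        if tag == "base" and self.base_href is None:
            self.base_href = href
        elif tag == "a":
            self.hrefs.append(href)


def absolute_links(page_url, html):
    """Resolve every link in html against page_url (hypothetical helper).

    page_url is assumed to be the URL the page was actually fetched from.
    """
    parser = LinkCollector()
    parser.feed(html)
    # A <base href> overrides the fetch URL; it may itself be relative.
    base = urljoin(page_url, parser.base_href) if parser.base_href else page_url
    return [urljoin(base, href) for href in parser.hrefs]


# Made-up sample page and paths, for illustration only.
sample = '<a href="/handle/1721.1/1">x</a> <a href="browse?type=title">y</a>'
links = absolute_links("http://dspace.mit.edu/community-list", sample)
```

This assumes the crawler records the URL each page came from at fetch time. If only saved HTML files are available and a page has no `<base href>`, the server generally cannot be recovered from relative links alone.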
Re^3: Crawling Relative Links from Webpages
by BrowserUk (Patriarch) on May 08, 2010 at 01:42 UTC
by Anonymous Monk on May 08, 2010 at 03:44 UTC
by BrowserUk (Patriarch) on May 08, 2010 at 03:54 UTC
by Anonymous Monk on May 08, 2010 at 04:17 UTC