in reply to Re: Creating a web crawler (theory)
in thread Creating a web crawler (theory)

A full URL forces the browser to revisit DNS? Where did you get that idea? Even if you have some wacky set-up where you aren't caching replies, it doesn't affect rendering. As for needless traffic, a DNS query isn't much compared to all those images we ask our browser to download.

Relative URLs are a convenience for our typing. To follow a link, the browser still needs to make it an absolute URL, then go where that URL says. A relative URL in an HTML page is not a secret signal to the browser to use some sort of quick fetching algorithm.
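
To make that concrete, here's a quick sketch of the resolution step using the URI module (the URLs are just made up for the example):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use URI;

    # the page we fetched (a made-up example URL)
    my $base     = 'http://www.example.com/articles/crawling/index.html';

    # a relative link as it might appear in that page's HTML
    my $relative = '../images/diagram.png';

    # the browser (or a crawler) turns it into an absolute URL before fetching
    my $absolute = URI->new_abs( $relative, $base );

    print "$absolute\n";   # http://www.example.com/articles/images/diagram.png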

You might be thinking about the difference between external and internal redirections. An external redirection is a full HTTP response that causes the user-agent to fetch the resource from a different URL. An internal redirection can be caught by the web server and handled without another request from the user-agent. Neither of these has anything to do with HTML, though.
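
You can watch the difference from the client side with a short LWP sketch (the URL is made up, so point it at a page that really redirects): an external redirection shows up as extra responses in the chain, while an internal one never does.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use LWP::UserAgent;

    my $ua = LWP::UserAgent->new;

    # a made-up URL; imagine it answers with a 301 pointing somewhere else
    my $response = $ua->get( 'http://www.example.com/old-page.html' );

    # every hop of an external redirection is its own HTTP response,
    # and LWP keeps the chain around for us
    for my $hop ( $response->redirects ) {
        printf "%s -> %s\n", $hop->request->uri, $hop->header('Location');
    }

    # an internal redirection (say, a mod_rewrite rule) never shows up
    # here -- the user-agent only ever sees the final response
    printf "Fetched from: %s\n", $response->request->uri;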

--
brian d foy <bdfoy@cpan.org>

Replies are listed 'Best First'.
Re^3: Creating a web crawler (theory)
by gaal (Parson) on Jan 29, 2005 at 14:10 UTC
    Relative URLs do better than save typing. They save retyping. If you move a project inside a site or just rename it, with relative paths you don't have to hunt down all the links and change them.

      You don't have to hunt down and re-type the links when you use a Perl script to do it for you. :)
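
      Something along these lines would do it (the 'site' directory and the prefixes are invented, and real HTML probably deserves HTML::TreeBuilder rather than a plain substitution):

          #!/usr/bin/perl
          use strict;
          use warnings;
          use File::Find;

          # rough sketch only: swap one made-up link prefix for another in
          # every .html file under a made-up 'site' directory
          my $old_prefix = '/projects/crawler/';
          my $new_prefix = '/tools/crawler/';

          find( sub {
              return unless /\.html?$/;
              my $file = $_;
              local $^I   = '.bak';     # keep a backup of each file we touch
              local @ARGV = ($file);
              while ( my $line = <> ) {
                  $line =~ s/\Q$old_prefix\E/$new_prefix/g;
                  print $line;
              }
          }, 'site' );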

      --
      brian d foy <bdfoy@cpan.org>
        But then you have to write and debug a Perl script :)

        (Have it handle programmatically constructed paths, etc.)