in reply to Re: Crawling all urls on a site
in thread Crawling all urls on a site
But as you say, first one has to get all the links, which is relatively trivial. Then one has to filter out the ones that have already been visited or are duplicates, so you need not only to figure out what "../../../" means at any given point, but also to recognise that "../../../index.htm" resolves to the same thing. Though of course it might be "../../../index.html" or "../../../default.htm" or something else again. And is "../../../index.htm?x=y" the same thing?
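For what it's worth, the URI module already does that path arithmetic and canonicalisation for you. A rough sketch (the base URL is just an invented example):

    use strict;
    use warnings;
    use URI;

    my $base = 'http://example.com/a/b/c/page.htm';

    # Resolve each relative link against the page it appeared on,
    # then canonicalise it so equivalent spellings compare equal.
    for my $rel ('../../../index.htm', '/index.htm', '../../../index.htm?x=y') {
        my $abs = URI->new_abs($rel, $base)->canonical;
        print "$rel  =>  $abs\n";
    }

    # "Already visited" then becomes a hash lookup on the canonical form.
    my %seen;
    my $url = URI->new_abs('../../../index.htm', $base)->canonical->as_string;
    print "seen before\n" if $seen{$url}++;

The first two resolve to the same http://example.com/index.htm; the ?x=y one doesn't, and no amount of URL fiddling will tell you whether the server treats it as the same page. For that you'd have to compare the content.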
The short answer is, of course it can be done. But the question is, yet again, why are we trying to do it without modules?
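With modules, the whole thing collapses to a handful of lines. A rough, untested sketch using WWW::Mechanize (the start URL is made up, and it only follows links on the same host):

    use strict;
    use warnings;
    use URI;
    use WWW::Mechanize;

    my $start = 'http://example.com/';
    my $host  = URI->new($start)->host;

    my $mech = WWW::Mechanize->new( autocheck => 0 );
    my %seen;
    my @queue = ($start);

    while ( my $url = shift @queue ) {
        next if $seen{$url}++;          # skip pages we've already fetched
        $mech->get($url);
        next unless $mech->success && $mech->is_html;

        for my $link ( $mech->links ) {
            my $abs = $link->url_abs->canonical;   # absolute, normalised URI
            next unless $abs->scheme =~ /^https?$/;
            next unless $abs->host eq $host;       # stay on the same site
            $abs->fragment(undef);                 # drop #fragments for dedup
            push @queue, $abs->as_string;
        }
    }

    print "$_\n" for sort keys %seen;

HTML::LinkExtor plus LWP::UserAgent would do just as well; the point is that link extraction and relative-URL resolution are solved problems.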
And by the way, what about wget?
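It already keeps its own list of visited URLs and resolves relative links, so if the goal is just "fetch the whole site", something along the lines of

    wget --mirror --no-parent http://example.com/

may be all that's needed (the URL is a placeholder, and you'd tune the flags to taste).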
($_='kkvvttuubbooppuuiiffssqqffssmmiibbddllffss')
=~y~b-v~a-z~s; print