in reply to Re: Crawling all urls on a site
in thread Crawling all urls on a site
But as you say, first one has to get all the links, which is relatively trivial. Then one has to filter out the ones that have already been visited or are duplicates, so you need not only to figure out what "../../../" means at any given point, but also to recognise that "../../../index.htm" resolves to the same thing. Though of course it might be "../../../index.html" or "../../../default.htm" or something else again. And is "../../../index.htm?x=y" the same thing?
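For what it's worth, the URI module already does that path arithmetic and canonicalisation for you. A rough sketch (the base URL is just an invented example):

    use strict;
    use warnings;
    use URI;

    my $base = 'http://example.com/a/b/c/page.htm';

    # Resolve each relative link against the page it appeared on,
    # then canonicalise it so equivalent spellings compare equal.
    for my $rel ('../../../index.htm', '/index.htm', '../../../index.htm?x=y') {
        my $abs = URI->new_abs($rel, $base)->canonical;
        print "$rel  =>  $abs\n";
    }

    # "Already visited" then becomes a hash lookup on the canonical form.
    my %seen;
    my $url = URI->new_abs('../../../index.htm', $base)->canonical->as_string;
    print "seen before\n" if $seen{$url}++;

The first two resolve to the same http://example.com/index.htm; the ?x=y one doesn't, and no amount of URL fiddling will tell you whether the server treats it as the same page. For that you'd have to compare the content.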
The short answer is, of course it can be done. But the question is, yet again, why are we trying to do it without modules?
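With modules, the whole thing collapses to a handful of lines. A rough, untested sketch using WWW::Mechanize (the start URL is made up, and it only follows links on the same host):

    use strict;
    use warnings;
    use URI;
    use WWW::Mechanize;

    my $start = 'http://example.com/';
    my $host  = URI->new($start)->host;

    my $mech = WWW::Mechanize->new( autocheck => 0 );
    my %seen;
    my @queue = ($start);

    while ( my $url = shift @queue ) {
        next if $seen{$url}++;          # skip pages we've already fetched
        $mech->get($url);
        next unless $mech->success && $mech->is_html;

        for my $link ( $mech->links ) {
            my $abs = $link->url_abs->canonical;   # absolute, normalised URI
            next unless $abs->scheme =~ /^https?$/;
            next unless $abs->host eq $host;       # stay on the same site
            $abs->fragment(undef);                 # drop #fragments for dedup
            push @queue, $abs->as_string;
        }
    }

    print "$_\n" for sort keys %seen;

HTML::LinkExtor plus LWP::UserAgent would do just as well; the point is that link extraction and relative-URL resolution are solved problems.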
And by the way, what about wget?
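It already keeps its own list of visited URLs and resolves relative links, so if the goal is just "fetch the whole site", something along the lines of

    wget --mirror --no-parent http://example.com/

may be all that's needed (the URL is a placeholder, and you'd tune the flags to taste).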
($_='kkvvttuubbooppuuiiffssqqffssmmiibbddllffss')
=~y~b-v~a-z~s; print