in reply to Need to limit robot depth using WWW::Robot

I would like to control the depth as it traverses the tree to two or three levels down.

By "two or three levels down," do you mean "links away from the home page", or "levels deep within the site hierarchy," or something else?

Checking the directory depth in your OK_TO_FOLLOW would be easy enough -- just count the slashes in $uri->path (tr needs to be bound to the path, otherwise it counts in $_): return 0 if ( $uri->path =~ tr[/][/] ) > 3;
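
To spell out the slash-counting idea, here is a minimal sketch. The helper names path_depth and ok_to_follow_path are hypothetical, as is the limit of 3; the sketch takes a plain path string so it runs without any modules, and inside your own follow test you would pass it $uri->path.

```perl
use strict;
use warnings;

# Hypothetical helper: count the slashes in a URL path.
# tr with identical search and replacement lists changes nothing
# and simply returns the number of matching characters.
sub path_depth {
    my ($path) = @_;
    return ( $path =~ tr[/][/] );
}

# Hypothetical predicate: returns 1 if the path has no more than
# $max_slashes slashes, 0 otherwise.
sub ok_to_follow_path {
    my ( $path, $max_slashes ) = @_;
    return path_depth($path) > $max_slashes ? 0 : 1;
}

# Three slashes, so still within a limit of 3.
print ok_to_follow_path( '/docs/perl/index.html', 3 ), "\n";

# Five slashes, so too deep for a limit of 3.
print ok_to_follow_path( '/docs/perl/lib/www/robot.html', 3 ), "\n";
```

Note that a trailing slash counts too, so /a/b/c/ and /a/b/c/page.html both have four slashes. If that matters for your site, strip the trailing slash before counting or adjust the limit.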

Re: Need to limit robot depth using WWW::Robot
by Anonymous Monk on Jul 31, 2003 at 20:27 UTC
    Ah. Good question. I did mean levels deep within the site hierarchy. And I like the idea of counting slashes. I'll give it a go. Interesting that this is not obviously (to me) intrinsic to WWW::Robot. I expected that acceptable robot behavior suggests a limit to within-site depth, and that this would somehow be part of the package. Either way, I have a solution. Thanks. -Michael