This isn't really a programming question, but I don't know where else to ask. I want to make sure the bot I've written (in Perl) doesn't get caught in an infinite loop. I'm particularly concerned about URLs that contain dynamic PATH_INFO. Can't PATH_INFO look just like a regular directory path? Is it possible for a dynamically generated page to include a link whose path info contains some kind of ID that changes each time the page is loaded, but that always points back to the same page? Should I handle that by capping the number of pages parsed from a single domain at some arbitrary number? Any suggestions on what that number should be?
The script is very customized and essentially complete. The only modules it requires are LWP::UserAgent and HTTP::Request.
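
To make the idea concrete, here is a rough sketch of the kind of guard I have in mind. It is not taken from my actual script: the names `should_fetch`, `$MAX_PAGES_PER_HOST` and `$MAX_DEPTH`, and the values 500 and 10, are placeholders I made up for illustration. A "seen" hash alone would not catch the changing-ID case, since every such URL is technically new, so the sketch also caps pages per host and link depth.

```perl
use strict;
use warnings;

# Hypothetical limits; the right values are exactly what I'm unsure about.
my $MAX_PAGES_PER_HOST = 500;
my $MAX_DEPTH          = 10;

my %seen;             # URLs already queued or fetched
my %pages_per_host;   # number of pages accepted per host

# Returns 1 if the URL should be fetched (and records it), 0 otherwise.
sub should_fetch {
    my ($url, $depth) = @_;

    # Stop following chains of links that go too deep.
    return 0 if $depth > $MAX_DEPTH;

    # Only handle http/https URLs; pull out the host name.
    (my ($host) = $url =~ m{^https?://([^/:?#]+)}i) or return 0;
    $host = lc $host;

    # Ignore the fragment when checking for duplicates.
    (my $key = $url) =~ s/#.*$//;
    return 0 if $seen{$key}++;

    # Cap the number of pages taken from any single host.
    return 0 if ($pages_per_host{$host} // 0) >= $MAX_PAGES_PER_HOST;
    $pages_per_host{$host}++;

    return 1;
}
```

The bot would call `should_fetch($url, $depth)` before each request and skip the URL when it returns 0. Whether a per-host cap or a depth limit is the better safety net is part of what I'm asking.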