in reply to Program that will grep website for specified keyword

I'm going to be radical and suggest you look at WWW::Robot. This module is intended to walk through an entire site, pulling down the data and allowing you to do what you wish with it.
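A minimal sketch of the idea. I believe NAME, VERSION and EMAIL are required attributes, and the hook's argument list here is from memory, so check the module's documentation before relying on it:

    use strict;
    use WWW::Robot;

    my $robot = WWW::Robot->new(
        NAME    => 'KeywordGrep',
        VERSION => '0.1',
        EMAIL   => 'you@example.com',
    );

    # The 'invoke-on-contents' hook fires for each page fetched;
    # here we just scan the body for a keyword.
    $robot->addHook('invoke-on-contents', sub {
        my ($robot, $hook, $url, $response) = @_;
        my $keyword = 'foo';    # whatever you're after
        print "$url\n" if $response->content =~ /\Q$keyword\E/i;
    });

    $robot->run('http://www.example.com/');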

I'd also suggest splitting the program into two parts. The first part pulls down all the data and stores it on local storage; the second greps the local copy. This way you don't have to wait whilst the data is fetched all over again if you decide to grep for something else or if you have a bug in the code, and the website maintainer doesn't begin to hate you for taking up silly amounts of bandwidth by getting the site multiple times.
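For the first half, plain LWP will do the storing: mirror() writes a page to a file and only re-fetches it when the server says it has changed. A rough sketch, assuming one URL per line on stdin (the flat file-naming scheme is made up for illustration):

    use strict;
    use LWP::UserAgent;
    use URI;

    my $ua  = LWP::UserAgent->new;
    my $dir = 'mirror';              # local storage for part two to grep
    mkdir $dir unless -d $dir;

    while (my $url = <STDIN>) {
        chomp $url;
        # Derive a flat local filename from the URL path -- crude,
        # but enough for a sketch.
        (my $file = URI->new($url)->path) =~ s{[^\w.-]+}{_}g;
        $file = 'index' if $file eq '' || $file eq '_';
        my $response = $ua->mirror($url, "$dir/$file");
        print "$url: ", $response->status_line, "\n";
        sleep 1;                     # be gentle on the server
    }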

You can use File::Find to simplify the second part too.
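Something along these lines, assuming the pages were saved under the mirror/ directory from the sketch above:

    use strict;
    use File::Find;

    my $keyword = shift or die "usage: $0 keyword\n";

    # Walk the local mirror and grep each regular file for the keyword.
    find(sub {
        return unless -f $_;
        open my $fh, '<', $_ or return;
        while (my $line = <$fh>) {
            print "$File::Find::name: $line" if $line =~ /\Q$keyword\E/i;
        }
        close $fh;
    }, 'mirror');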

Also make sure you obey the robots exclusion rules (robots.txt), and put a delay between fetching consecutive URLs so you don't give the server a good kicking.
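LWP::RobotUA handles both of those for you: it fetches and honours robots.txt, and enforces a minimum delay between requests to the same server. Note that delay() is in minutes, not seconds:

    use strict;
    use LWP::RobotUA;

    # The agent name is matched against robots.txt rules, and the
    # contact address is sent along with each request.
    my $ua = LWP::RobotUA->new('KeywordGrep/0.1', 'you@example.com');
    $ua->delay(1);    # at least one minute between requests

    my $response = $ua->get('http://www.example.com/');
    if ($response->is_success) {
        print $response->content;
    }
    else {
        # Disallowed URLs come back with an explanatory error.
        print $response->status_line, "\n";
    }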

Re: Re: In need of guidance....
by psykosmily (Initiate) on May 06, 2002 at 01:18 UTC
    Thank you very much, this was exactly what I was looking for.