in reply to Script generated site index db

A few other points to clarify...

Some of the site content is dynamically generated by Perl scripts. Using a robot would allow me to index this content as well. As shotgunefx pointed out, a low-level file dump would not produce the results I am looking for.

In any event, the script I am thinking of writing would allow for things like multi-domain searches (say, if you had "http://www.mysite.com" and "http://search.mysite.com") and would also let you index remote sites (if you wanted to; it would probably piss off the domain owner if they found out).
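
To make the multi-domain idea concrete, here is a rough sketch of how the robot might decide which links to follow. Everything in it is an assumption for illustration: the host list, the sub names (host_of, in_scope, filter_links), and the regex-based host extraction, which is a simplification (the URI module's host method would be the more robust choice in real code).

```perl
use strict;
use warnings;

# Hypothetical allow-list: the hosts the robot is permitted to index.
my %allowed_host = map { lc($_) => 1 } qw(www.mysite.com search.mysite.com);

# Return the lowercased host part of an http(s) URL, or undef if there is none.
# Simplified on purpose: it does not handle userinfo such as user@host.
sub host_of {
    my ($url) = @_;
    return ($url =~ m{^https?://([^/:?#]+)}i) ? lc($1) : undef;
}

# True (1) when the URL belongs to one of the allowed hosts, otherwise 0.
sub in_scope {
    my ($url) = @_;
    my $host = host_of($url);
    return (defined($host) && exists($allowed_host{$host})) ? 1 : 0;
}

# Keep only the in-scope links from a list, dropping exact duplicates
# so the same page is not queued twice.
sub filter_links {
    my %seen;
    return grep { in_scope($_) && !$seen{$_}++ } @_;
}
```

Indexing a remote site would then just mean adding its host name to the allow-list, which keeps that decision explicit rather than accidental.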

If this sounds like a good idea and you happen to have first-hand experience with a module (or modules) that might help me get there, let me know. Also, if you have sample code that uses those modules, please pass that along too! Thanx.

======================
Sean Shrum
http://www.shrum.net

Re: Re: Script generated site index db
by shotgunefx (Parson) on Mar 19, 2002 at 09:22 UTC
    Generically indexing dynamic pages might be tricky. For a start on the indexing, you may want to look at merlyn's parallel link checker column.

    -Lee

    "To be civilized is to deny one's nature."
Re: Re: Script generated site index db
by stephane (Monk) on Mar 19, 2002 at 09:26 UTC

    Maybe not directly a Perl solution, but you might want to check out SWISH-E.