xorl has asked for the wisdom of the Perl Monks concerning the following question:

I'm sure this is one of those things that has already been coded and is out there somewhere. I'm looking for something that, given a URL, will spider the site and store a copy of it locally.

I have a pretty good idea of how to go about writing one; I'm just lazy. I figure if no one can find something out there, I can always take linklint and make it save the files after it checks them (is that a good idea?). Roughly, what I have in mind is the sketch below.
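For what it's worth, this is the rough shape of it: an untested, single-host sketch that assumes LWP::UserAgent, HTML::LinkExtor and URI are installed, and saves each fetched page under ./archive/<host>/ (all names here are just placeholders).

#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTML::LinkExtor;
use URI;
use File::Path qw(make_path);
use File::Basename qw(dirname);

my $start = shift or die "usage: $0 URL\n";
my $host  = URI->new($start)->host;
my $ua    = LWP::UserAgent->new(timeout => 30);

my %seen;
my @queue = ($start);

while (my $url = shift @queue) {
    next if $seen{$url}++;
    my $res = $ua->get($url);
    next unless $res->is_success;

    # Mirror the URL path under ./archive/<host>/, saving the raw bytes as served.
    my $path = URI->new($url)->path;
    $path .= 'index.html' if $path eq '' or $path =~ m{/$};
    my $file = "archive/$host$path";
    make_path(dirname($file));
    if (open my $fh, '>', $file) {
        binmode $fh;
        print {$fh} $res->content;
        close $fh;
    }

    # Only HTML pages get parsed for further links.
    next unless $res->content_type eq 'text/html';
    my $extor = HTML::LinkExtor->new(undef, $url);   # base URL => absolute links
    $extor->parse($res->decoded_content || $res->content);
    for my $link ($extor->links) {
        my ($tag, %attr) = @$link;
        for my $abs (values %attr) {
            next unless $abs->scheme =~ /^https?$/;
            push @queue, $abs->as_string if $abs->host eq $host;
        }
    }
    sleep 1;   # be polite between requests
}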

Thanks in advance.

Re: website archiver
by doom (Deacon) on Jan 13, 2009 at 04:31 UTC
    I think you're looking for the "wget" command. Myself, I tend to do this, but if you're doing it for archival purposes you might prefer to do it differently (e.g. without "-k" or "-H", and maybe without "-l"):
    wget -r -l 8 -w 100 -k -p -np -H <URL>
    Briefly what these options do (read the man page):
    -r   recursive retrieval
    -l   maximum recursion depth
    -w   wait, in seconds, between retrievals
    -k   convert links for local viewing
    -p   get all "page requisites", e.g. images and stylesheets
    -np  "no parent": avoid following links to levels above the starting point
    -H   enable spanning across hosts when retrieving recursively
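    If you'd rather drive it from Perl (say, to archive a batch of sites in one go), a thin, untested wrapper along these lines would do; it assumes wget is installed and on your PATH, and the archival-leaning flag set is just one possibility:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Untested sketch: run wget over a list of URLs from Perl.
    my @urls = @ARGV or die "usage: $0 URL [URL ...]\n";
    for my $url (@urls) {
        # No -k or -H here, so the saved tree matches what the server sent.
        my @cmd = ('wget', '-r', '-l', '8', '-w', '100', '-p', '-np', $url);
        system(@cmd) == 0
            or warn "wget exited with status $? for $url\n";
    }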