What you are looking for, I believe, is something to spider
a site such as PerlMonks or Slashdot and store the pages, as
they currently stand, in flat files? You should look into
LWP, or even the wget command, to download the pages of a site recursively (i.e. spider it).
Of course, this doesn't work with forms and such, since a spider only follows plain links and won't submit anything.
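For the wget route, something along these lines might be a starting point. It is only a sketch: the mirror_site function name, the DRY_RUN switch, the depth of 2, the one-second wait and the example.com URL are all my own placeholders, not anything from your setup, so adjust to taste.

```shell
#!/bin/sh
# Hypothetical helper: mirror_site URL [DEST_DIR]
# Set DRY_RUN=1 to print the wget command instead of running it.
mirror_site() {
    url=$1
    dest=${2:-mirror}          # assumed default output directory

    # Build the command as positional parameters so it can be
    # either printed or executed unchanged.
    set -- wget \
        --recursive \
        --level=2 \
        --no-parent \
        --page-requisites \
        --convert-links \
        --wait=1 \
        --directory-prefix="$dest" \
        "$url"

    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Example (placeholder URL):
#   mirror_site http://example.com/ snapshot
```

--recursive with --level limits how deep the spider goes, --no-parent keeps it from climbing above the starting directory, and --convert-links rewrites links so the saved copy can be browsed from disk. The --wait is there to be polite to the server. Run it with DRY_RUN=1 first to see exactly what would be executed.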
There was a thread on something similar to this a while ago... here
- Ant