in reply to Re^2: Scripts to recursively reading in HTML files
in thread Scripts to recursively reading in HTML files

Then it looks like dragonchild's suggestion is the best bet. Also, I have found URI useful for building/rebuilding relative links.
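
For what it's worth, here is a minimal sketch of the kind of thing I use URI for. The base URL, the link and the variable names are made up for illustration, so adjust them to your own site layout. It resolves a relative link against the page it was found on, then rewrites it relative to a different page.

```perl
use strict;
use warnings;
use URI;

# Hypothetical page the link was found on (an assumption for this example)
my $base = 'http://example.com/docs/guide/index.html';

# Resolve a relative link against that page
my $abs = URI->new_abs('../images/logo.png', $base);
print "$abs\n";    # http://example.com/docs/images/logo.png

# Rebuild the link so it is relative to a different (also hypothetical) page
my $rel = $abs->rel('http://example.com/docs/other/page.html');
print "$rel\n";    # ../images/logo.png
```

new_abs() and rel() both return URI objects, which stringify when printed or interpolated.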

On that last point, perhaps using absolute addresses (starting with /) might help?

In any event, if you're parsing HTML, _don't_ try to do it with regexes! Use a proper HTML parsing module instead.
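
As a rough sketch of the parser route, something like HTML::LinkExtor (part of the HTML::Parser distribution) will pull the links out for you. The HTML snippet and the base URL below are invented for the example. I'm not claiming this is what dragonchild had in mind, only that it is one way to avoid regexes.

```perl
use strict;
use warnings;
use HTML::LinkExtor;

# Made-up input; in practice this would be the contents of each file you read
my $html = '<a href="page2.html">next</a> <img src="/img/a.png">';

# With no callback the parser collects the links itself; the second
# argument is a base URL (hypothetical here) used to absolutize them
my $parser = HTML::LinkExtor->new(undef, 'http://example.com/docs/index.html');
$parser->parse($html);
$parser->eof;

my @found;
for my $link ($parser->links) {
    my ($tag, %attr) = @$link;    # e.g. ('a', href => URI object)
    push @found, "$tag $_=$attr{$_}" for sort keys %attr;
}
print "$_\n" for @found;
```

Each entry returned by links() is an array reference holding the tag name followed by attribute/URL pairs, so the same loop also picks up src attributes on images, frames and so on.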

If anyone knows of a script that already does this, I'd like to see it! Best of luck.