As it stands, I'm developing this on a single core. The computer crashing hasn't been an issue for me, but to minimise repetition of work the state of the crawl is saved to disk each time memory fills up, so proportionally not that much work would actually be repeated should my little box ever decide to crash. I do take your point, though, and will experiment further with writing to disk on the fly. I'm not sure what sort of optimisations you would propose to make writing quicker. As far as I know, the only general way to make disk writes faster is to write as much data as possible in one go, and to make those writes to consecutive space, which isn't really feasible for a hash table (there's a rough sketch of what I mean at the end of this post).

Anyway, I'm not interested in duplicate content, because I don't even process the content. The goal is to create a map of links on the internet. Whether there are a number of different roads leading to the same location doesn't concern me at this point; what concerns me is to exhaustively map those roads.

So, that brings us back to my real problem at present: making the best use of the bandwidth available.
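Coming back to the disk-write question: the sort of batching I have in mind looks roughly like the Python sketch below, where discovered links are buffered in memory and appended to a log file in one large, sequential write whenever the buffer grows big enough. The names here (EdgeLog, flush_bytes, edges.log) are purely illustrative, not what my crawler actually does on disk.

# A minimal sketch of batched, sequential writes: buffer (source, target)
# link pairs in memory and append them to a log file in one big write.
# All names are made up for illustration.

class EdgeLog:
    """Append-only log of (source URL, target URL) pairs."""

    def __init__(self, path, flush_bytes=64 * 1024 * 1024):
        self.file = open(path, "a", encoding="utf-8")  # append mode: writes stay consecutive
        self.flush_bytes = flush_bytes  # rough threshold; counted in characters, not bytes
        self.buffer = []
        self.buffered = 0

    def add(self, source, target):
        line = f"{source}\t{target}\n"
        self.buffer.append(line)
        self.buffered += len(line)
        if self.buffered >= self.flush_bytes:
            self.flush()

    def flush(self):
        # One big sequential write instead of thousands of tiny ones.
        self.file.write("".join(self.buffer))
        self.file.flush()
        self.buffer = []
        self.buffered = 0

log = EdgeLog("edges.log")
log.add("http://example.com/", "http://example.org/")
log.flush()

The appeal of an append-only log is precisely that it sidesteps the random access a hash table on disk would need; the cost is that you have to sort or merge the log later if you ever want fast lookups.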