in reply to Speed searching HTML docs

One thing I'm missing in this scenario is how you deal with files that are deleted. As you describe it now, a search might return files that are no longer there.

If I had control over the publishing system, it would be easy: whenever a file is added, modified or deleted, you update your index.

Otherwise, I'd run a scheduled process (for instance from cron, or whatever Windows uses). It should take the list of indexed files (with their timestamps) and compare it with the files and timestamps on the system. Anything that differs needs to be reindexed, and files that have disappeared need to be removed from the index. If the process occasionally dies halfway, the changes it missed will be picked up the next time the scheduler fires it.
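
To make the comparison step concrete, here is a minimal sketch in Python. It assumes the index can hand you a mapping from path to the timestamp recorded at indexing time; the function names scan_files and diff_index and the choice of .html/.htm extensions are made up for illustration and are not part of any particular indexer.

```python
import os

def scan_files(root):
    """Return {relative_path: mtime} for every HTML file under root."""
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith((".html", ".htm")):
                continue
            full = os.path.join(dirpath, name)
            try:
                mtime = os.stat(full).st_mtime
            except OSError:
                # File vanished between the directory walk and the stat call.
                continue
            found[os.path.relpath(full, root)] = mtime
    return found

def diff_index(indexed, on_disk):
    """Compare the index's view ({path: mtime}) with the filesystem's view.

    Returns three sorted lists: paths to add, paths to reindex,
    and paths to drop from the index.
    """
    added = sorted(p for p in on_disk if p not in indexed)
    deleted = sorted(p for p in indexed if p not in on_disk)
    # Any timestamp mismatch counts, so a file restored from an older
    # backup is picked up as well.
    modified = sorted(
        p for p in on_disk if p in indexed and on_disk[p] != indexed[p]
    )
    return added, modified, deleted
```

The scheduled job would then reindex everything in the first two lists and drop everything in the third. If the stored timestamp for a file is only updated after that file has been processed successfully, a run that dies halfway simply leaves the remaining differences for the next run.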

You also may want to completely reindex the site every night or weekend.

Abigail