Just FYI, you should definitely check out Plucene if you haven't yet. Even if you can't use it directly (since the binary format of its index is not the same as Java Lucene's), the code may be useful, since its authors presumably had to solve the same problems you do.
Good advice. I've been heavily involved in the Plucene project. This is like Plucene, but the indexer, at least, is 13 times faster, and it writes true Lucene-compatible indexes. I'm working on the search code.