in reply to Re^3: fast disk db with bulk insert, fast read access, compact storage
in thread fast disk db with bulk insert, fast read access, compact storage

Hm. Not sure where in the OP you get the 'dial-by-name' idea from.

With 1GB of keys averaging 8 chars, that's 128 million key/value pairs. And 31GB / 128M pairs = ave. 248-char values, which is a bit big for a telephone number.
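For anyone who wants to check that back-of-envelope arithmetic, here is a minimal sketch. It assumes binary units (1GB = 2**30 bytes) and that the 32GB total splits into 1GB of keys and 31GB of values; the function name `estimate` is mine, purely for illustration.

```python
# Back-of-envelope sizing, assuming binary units (1GB = 2**30 bytes).
GIB = 2**30

def estimate(total_bytes, key_bytes, avg_key_len):
    """Return (number of key/value pairs, average value length in bytes)."""
    pairs = key_bytes // avg_key_len
    avg_value_len = (total_bytes - key_bytes) / pairs
    return pairs, avg_value_len

pairs, avg_value_len = estimate(32 * GIB, 1 * GIB, 8)
print(pairs)          # 134217728, i.e. 128 * 2**20
print(avg_value_len)  # 248.0
```

With decimal units instead (1GB = 10**9 bytes) the same sums give about 242 bytes per value, so the conclusion is the same either way.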

This bit of the OP seemed quite clear to me, hence my Google example:

"I want to do real-time search as my users are typing words."

But I guess unless the OP comes back and clarifies, we won't know if I got it right or not.

I'm currently playing with an indexer that indexes each record by each character, and its position, in the keys. I project it would take 84 minutes to index the 32GB, and that it could then produce a count of matching records within 50 milliseconds. That's from disk with a cold cache. It should be substantially faster using an SSD.
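To make the character-and-position idea concrete, here is a toy in-memory sketch. It is not the indexer described above (that one works from disk, and its layout isn't given here); the names `build_index` and `count_matches` and the sample keys are made up for illustration. Each (position, character) pair maps to the set of record numbers whose key has that character at that position, and a typed prefix is answered by intersecting those sets.

```python
from collections import defaultdict

def build_index(keys):
    """Map (position, character) -> set of record numbers with that character there."""
    index = defaultdict(set)
    for recno, key in enumerate(keys):
        for pos, ch in enumerate(key):
            index[(pos, ch)].add(recno)
    return index

def count_matches(index, typed):
    """Count records whose key starts with `typed` by intersecting posting sets."""
    if not typed:
        raise ValueError("need at least one typed character")
    result = None
    for pos, ch in enumerate(typed):
        postings = index.get((pos, ch), set())
        result = set(postings) if result is None else result & postings
        if not result:
            return 0  # no record can match once the intersection is empty
    return len(result)

# Hypothetical sample keys, not taken from the OP's data.
keys = ["perl", "pearl", "peril", "python", "ruby"]
index = build_index(keys)
print(count_matches(index, "p"))    # 4
print(count_matches(index, "pe"))   # 3
print(count_matches(index, "per"))  # 2
```

Because the positions are indexed independently, the same structure could also answer "any key with an 'r' in position 2" without a prefix, which a plain sorted list of keys cannot do cheaply.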

For the described dataset, it would use 8GB of primary index and 1GB of secondary, which puts it in the ballpark of the OP's requirements. Assuming that I read them correctly.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
RIP an inspiration; A true Folk's Guy

Re^5: fast disk db with bulk insert, fast read access, compact storage
by Marshall (Canon) on Sep 15, 2010 at 19:10 UTC
    With 1GB of keys averaging 8 chars, that's 128 million key/value pairs.

    Yes, that is the key. I guess pun intended here. You enter one char and the field of possibilities narrows; you enter the second char and it narrows further... the goal being to reach a unique key ...

    Sounded to me like all that needs to be indexed is the 1GB or ~128M keys. Actual data storage can be on some HD.
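A rough sketch of that reading, under my own assumptions: only the keys are held in a sorted index, each paired with a byte offset into a separate data file, and every typed character narrows a range found by binary search. The helper names, the sample keys and the offsets are hypothetical, and the upper bound trick assumes no key contains the character U+10FFFF.

```python
import bisect

def build_key_index(pairs):
    """Return parallel lists: sorted keys, and the data-file offset of each value."""
    ordered = sorted(pairs)
    return [k for k, _ in ordered], [off for _, off in ordered]

def prefix_range(keys, prefix):
    """Return (lo, hi) so that keys[lo:hi] are the keys starting with prefix.

    Assumes no key contains U+10FFFF, which is used as an upper sentinel.
    """
    lo = bisect.bisect_left(keys, prefix)
    hi = bisect.bisect_left(keys, prefix + "\U0010ffff")
    return lo, hi

# Hypothetical keys with made-up offsets into a separate data file.
keys, offsets = build_key_index(
    [("perl", 0), ("pearl", 248), ("peril", 496), ("python", 744), ("ruby", 992)]
)
for typed in ("p", "pe", "per", "perl"):
    lo, hi = prefix_range(keys, typed)
    print(typed, hi - lo)  # candidate counts: 4, 3, 2, 1

lo, hi = prefix_range(keys, "perl")
if hi - lo == 1:
    print(offsets[lo])  # 0: where to seek in the data file for the value
```

Only the key list and the offsets need to be searched quickly; the values themselves are read with a single seek once the field has narrowed to one key.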