Re^4: fast disk db with bulk insert, fast read access, compact storage

Hm. Not sure where in the OP you get the 'dial-by-name' idea from.

With 1GB of keys averaging 8 chars, that's 128 million key/value pairs. And 31GB / 128M pairs = ave. 248 char values, which is a bit big for a telephone number.
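For anyone who wants to check the back-of-envelope numbers, here is a small sketch. It assumes binary units (1GB = 2**30 bytes) and one byte per character, neither of which the OP stated; the variable names are made up for illustration. The average value length comes out near 250 characters either way.

```python
# Back-of-envelope check of the dataset figures (binary units assumed).
GIB = 2**30

key_bytes = 1 * GIB      # total size of all keys
avg_key_len = 8          # average key length, assuming 1 byte per char
value_bytes = 31 * GIB   # total size of all values

# Number of key/value pairs implied by the key volume.
num_pairs = key_bytes // avg_key_len

# Average value length implied by the value volume.
avg_value_len = value_bytes / num_pairs

print(f"{num_pairs:,} pairs, about {avg_value_len:.0f} chars per value")
```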

This bit of the OP seemed quite clear to me, hence my Google example:

"I want to do real-time search as my users are typing words."

But I guess unless the OP comes back and clarifies, we won't know if I got it right or not.

I'm currently playing with an indexer that indexes each record by each character and its position in the keys. I project it would take 84 minutes to index the 32GB, and would then produce a count of matching records within 50 milliseconds. That's from disk with a cold cache. It should be substantially faster using an SSD.
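To make the "each character and position" idea concrete, here is a toy in-memory sketch. It is only one possible reading of the description, not the actual indexer: the function names `build_index` and `count_matches` and the sample keys are hypothetical, and a real version would keep its posting lists on disk rather than in Python sets.

```python
from collections import defaultdict

def build_index(keys):
    """Map (position, char) -> set of record numbers with that char at that position."""
    index = defaultdict(set)
    for rec_no, key in enumerate(keys):
        for pos, ch in enumerate(key):
            index[(pos, ch)].add(rec_no)
    return index

def count_matches(index, typed):
    """Count records whose key starts with `typed` by intersecting posting sets."""
    if not typed:
        raise ValueError("need at least one typed character")
    postings = [index.get((pos, ch), set()) for pos, ch in enumerate(typed)]
    # Intersect the smallest sets first to keep intermediate results small.
    postings.sort(key=len)
    result = set(postings[0])
    for p in postings[1:]:
        result &= p
        if not result:
            break
    return len(result)

# Hypothetical sample keys, purely for illustration.
keys = ["perl", "pearl", "peril", "python", "ruby"]
index = build_index(keys)
print(count_matches(index, "pe"))
```

Each extra typed character adds one more set to intersect, so the count can be refreshed on every keystroke.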

For the described dataset, it would use 8GB of primary index and 1GB of secondary, which puts it in the ballpark of the OP's requirements. Assuming that I read them correctly.

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

"Science is about questioning the status quo. Questioning authority".

In the absence of evidence, opinion is indistinguishable from prejudice.

RIP an inspiration; A true Folk's Guy


Replies are listed 'Best First'.

Re^5: fast disk db with bulk insert, fast read access, compact storage
by Marshall (Canon) on Sep 15, 2010 at 19:10 UTC

"With 1GB of keys averaging 8 chars, that's 128 million key/value pairs."

Yes, that is the key. I guess pun intended here. You enter one char and the field of possibilities narrows, you enter the second char and the field of possibilities narrows further... The goal being to reach a unique key...

Sounded to me like all that needs to be indexed is the 1GB of keys, or ~128M of them. Actual data storage can be on some HD.
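The narrowing-per-keystroke idea can be sketched with nothing more than a sorted key list and a binary search. This is only an illustration under stated assumptions: the keys fit in memory, no key contains the code point U+10FFFF (used here as an upper-bound sentinel), and the names `prefix_range` and `narrow` plus the sample keys are invented for the example.

```python
import bisect

def prefix_range(sorted_keys, prefix):
    """Return (lo, hi) such that sorted_keys[lo:hi] all start with `prefix`."""
    lo = bisect.bisect_left(sorted_keys, prefix)
    # Sentinel upper bound: assumes no key contains U+10FFFF.
    hi = bisect.bisect_left(sorted_keys, prefix + "\U0010ffff")
    return lo, hi

def narrow(sorted_keys, typed):
    """Yield (prefix, candidate_count) after each typed character."""
    for n in range(1, len(typed) + 1):
        prefix = typed[:n]
        lo, hi = prefix_range(sorted_keys, prefix)
        yield prefix, hi - lo

# Hypothetical sample keys, purely for illustration.
keys = sorted(["perl", "pearl", "peril", "python", "ruby"])
for prefix, count in narrow(keys, "perl"):
    print(prefix, count)
```

Only the keys are touched here; once a single candidate remains, its value could be fetched from wherever the bulk data lives.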
