in reply to string arrays

Consider a design that allows the strings to reside somewhere more commodious than RAM.

100 million strings, even if they're only 4 bytes wide, come to about 380 MB (assuming pretty much zero overhead).
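For anyone who wants to check that figure, here is the arithmetic as a small Perl sketch. The variable names are only illustrative, and it counts payload bytes alone; a real Perl array of scalars carries per-element overhead on top of this.

```perl
use strict;
use warnings;

my $count = 100_000_000;           # number of strings
my $width = 4;                     # bytes per string, payload only
my $bytes = $count * $width;       # total payload in bytes
my $mib   = $bytes / (1024 * 1024);

printf "%d bytes = %.1f MiB\n", $bytes, $mib;
# prints "400000000 bytes = 381.5 MiB"
```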

If you'll be doing much with the strings, that's too much for a "flat file". Sure, you could literally store 380 MB in a flat file, but access would be slow, and forget about writing to the 78,294,262nd string. A database is a good way to go for its random access and its scalability to datasets this large.
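As a rough sketch of what random access through a database looks like, the snippet below ties a hash to an on-disk file and rewrites one record by position without touching the rest. It uses SDBM_File only as a stand-in because it ships with Perl; the file name, the keys-as-positions scheme, and the sample values are all assumptions for illustration, and a dataset of the size discussed here would call for something sturdier such as DB_File or BerkeleyDB.

```perl
use strict;
use warnings;
use Fcntl;                        # O_RDWR, O_CREAT
use SDBM_File;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1); # scratch directory, removed at exit
my $path = "$dir/strings";        # hypothetical database file name

tie(my %store, 'SDBM_File', $path, O_RDWR | O_CREAT, 0666)
    or die "Cannot tie $path: $!";

# Key each string by its position in the collection.
$store{$_} = sprintf('s%03d', $_) for 0 .. 999;

# Overwrite a single record in place, then read it and a neighbour back.
$store{742} = 'edit';
my $changed   = $store{742};
my $neighbour = $store{743};
print "$changed $neighbour\n";

untie %store;
```

The point is just that reading or writing record N costs a lookup rather than a scan of everything before it.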


Dave


"If I had my life to do over again, I'd be a plumber." -- Albert Einstein

Re: Re: string arrays
by Anonymous Monk on Oct 15, 2003 at 12:20 UTC
    Sorry about that. I work with large texts and am implementing suffix arrays in Perl, which is why I need to store so much data: the entire text needs to be held in memory. I have tried the database approach using BerkeleyDB and DB_File. Both are very nice, but they kill me on I/O; it simply takes too much time (weeks). I can convert the strings to integers and then store them in a vec. This seems to be working okay so far, but I was curious whether anyone had a better solution. Thanks for the quick responses; I didn't expect to get so many this morning!
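For readers following along, here is a minimal sketch of the "convert to integers, store in a vec" idea. It is not the poster's actual code: the sample tokens, the 32-bit element width, and the names %id_of, @token_of and token_at are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample tokens; the real input would be the full text.
my @tokens = qw(the cat sat on the mat);

my (%id_of, @token_of);   # string -> ID and ID -> string lookup tables
my $packed = '';          # one scalar holding every ID, 32 bits each

for my $i (0 .. $#tokens) {
    my $tok = $tokens[$i];
    # Give a token the next unused integer the first time it is seen.
    unless (exists $id_of{$tok}) {
        push @token_of, $tok;
        $id_of{$tok} = $#token_of;
    }
    vec($packed, $i, 32) = $id_of{$tok};
}

# Look up the string stored at position $i.
sub token_at {
    my ($i) = @_;
    return $token_of[ vec($packed, $i, 32) ];
}

printf "%d tokens in %d bytes\n", scalar @tokens, length $packed;
print token_at(4), "\n";
```

Each position costs a fixed 4 bytes inside a single scalar, so the per-element overhead of a Perl array is avoided; the lookup tables grow only with the number of distinct strings.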