in reply to Rapid text searches
I assume it's more complicated than in your example and gaps are possible, so you can't just trivially calculate the right lines to look for?
And a binary search is not an option either?
With a strict ordering, it's like a telephone book: there's no need to start reading from the first page. First look for the right city, then for the starting letters.
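To make the telephone-book idea concrete, here is a minimal binary search sketch. It assumes the sorted keys are already in an in-memory array and that all keys are digit strings of the same width, so string comparison gives the same order as numeric comparison. The name `binary_search` and the sample keys are made up for illustration; a real file would need seeking by byte offset instead.

```perl
use strict;
use warnings;

# Return the index of $key in the sorted array @$lines, or -1 if absent.
# Assumption: fixed-width digit strings, so "cmp" orders them numerically.
sub binary_search {
    my ($lines, $key) = @_;
    my ($lo, $hi) = (0, $#$lines);
    while ($lo <= $hi) {
        my $mid = int(($lo + $hi) / 2);
        my $cmp = $lines->[$mid] cmp $key;
        if    ($cmp < 0) { $lo = $mid + 1 }   # key is in the upper half
        elsif ($cmp > 0) { $hi = $mid - 1 }   # key is in the lower half
        else             { return $mid }      # found it
    }
    return -1;                                # gaps are possible: not found
}

# Hypothetical sample data with gaps between the keys.
my @sorted_keys = qw(1101077781158 1101077781160 1101077781163 1101077790001);
```

Each step halves the remaining range, so even millions of lines need only a couple of dozen comparisons.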
I would build up a hash of hashes for indexing ranges of line numbers for each column.
With 1101077781160 you'll look up $hash1{11010}, where you get a second hash to look up $hash2{77781}, telling you the range to search for "160".
The first-level hash (the city) should have a size that's easily kept in RAM; the second-level hashes should be loaded on demand (and some, maybe the last hundred, kept cached).
Of course you could have more levels of hashes, and I'm not sure about the best way to make hashes persistent and quickly loaded, but these are CPAN details.
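A rough sketch of the two-level index described above, kept entirely in memory. The 5/5 split of the key, the names `build_index` and `lookup`, and the sample keys are assumptions for illustration only; on-demand loading and caching of the second-level hashes are left out.

```perl
use strict;
use warnings;

# Build $index{first5}{next5} = [first_line, last_line] from sorted keys.
# Assumption: every key has at least 10 characters.
sub build_index {
    my ($lines) = @_;
    my %index;
    for my $i (0 .. $#$lines) {
        my $city   = substr($lines->[$i], 0, 5);
        my $street = substr($lines->[$i], 5, 5);
        my $range  = $index{$city}{$street} //= [$i, $i];
        $range->[1] = $i;    # extend the range to the current line
    }
    return \%index;
}

# Return the line number of $key, or -1 if it is not present.
sub lookup {
    my ($index, $lines, $key) = @_;
    # Two separate steps so a miss does not autovivify the first level.
    my $second = $index->{ substr($key, 0, 5) } or return -1;
    my $range  = $second->{ substr($key, 5, 5) } or return -1;
    # Only the small range that shares the first 10 digits is scanned.
    for my $i ($range->[0] .. $range->[1]) {
        return $i if $lines->[$i] eq $key;
    }
    return -1;
}

# Hypothetical sample data.
my @lines = qw(1101077781158 1101077781160 1101077781163
               1101077790001 1101100000005);
my $index = build_index(\@lines);
```

With this data, `lookup($index, \@lines, '1101077781160')` goes through `$index->{11010}{77781}`, gets the range of lines 0 to 2, and only scans those three lines for the full key.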
Cheers Rolf