Sounds like you're indexing your data by a hex-encoded digest?
Given that you have 3 variable & possibly huge chunks -- which most RDBMSs handle by writing to the filesystem anyway -- associated with each index key, and your selection criteria are both fixed & simple, I'd use the filesystem.
Subdivide the key into chunks so that no individual directory contains more than a reasonable number of entries, then store the 3 sections as files at the deepest level.
By splitting a 32-character hex digest into 4-char chunks, no directory has more than 65,536 entries. The filesystem cache will cache the lower levels, and the upper levels will be both fast to read from disk and quick to search, especially if your filesystem hashes its directory entries.
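As a minimal sketch of that mapping (Python purely for illustration; the digest_to_dir name is made up, while the /data root and the 4-char chunk size come from the layout further down):

import os

DATA_ROOT = "/data"  # root of the layout shown below

def digest_to_dir(digest, chunk=4):
    # Split the hex digest into fixed-size chunks and join them as a path,
    # e.g. 8fbe7eb8... -> /data/8fbe/7eb8/...
    digest = digest.lower()
    if len(digest) % chunk:
        raise ValueError("digest length must be a multiple of the chunk size")
    parts = [digest[i:i + chunk] for i in range(0, len(digest), chunk)]
    return os.path.join(DATA_ROOT, *parts)

print(digest_to_dir("8fbe7eb8c04c744406cca0aeb67e4f7f"))
# -> /data/8fbe/7eb8/c04c/7444/06cc/a0ae/b67e/4f7f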
I'd write the individual chunks of the two text parts in separate files unless they will always be loaded as a single entity, in which case it might be slightly faster to concatenate them.
Overall, given a digest of 8fbe7eb8c04c744406cca0aeb67e4f7f, I'd lay the directory structure out like this:
/data/8fbe/7eb8/c04c/7444/06cc/a0ae/b67e/4f7f/meta.txt
/data/8fbe/7eb8/c04c/7444/06cc/a0ae/b67e/4f7f/text1.000
/data/8fbe/7eb8/c04c/7444/06cc/a0ae/b67e/4f7f/text1.001
/data/8fbe/7eb8/c04c/7444/06cc/a0ae/b67e/4f7f/text1.002
/data/8fbe/7eb8/c04c/7444/06cc/a0ae/b67e/4f7f/text1....
/data/8fbe/7eb8/c04c/7444/06cc/a0ae/b67e/4f7f/text2.000
/data/8fbe/7eb8/c04c/7444/06cc/a0ae/b67e/4f7f/text2.001
/data/8fbe/7eb8/c04c/7444/06cc/a0ae/b67e/4f7f/text2....
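And a rough sketch of writing and reading against that layout (again, Python for illustration only; the store/load_text names and the 1 MiB piece size are assumptions, and it reuses the digest_to_dir helper sketched above):

import os

CHUNK_BYTES = 1 << 20  # assumed piece size for the potentially huge text parts

def store(digest, meta, text1, text2):
    d = digest_to_dir(digest)            # helper from the sketch above
    os.makedirs(d, exist_ok=True)
    with open(os.path.join(d, "meta.txt"), "wb") as fh:
        fh.write(meta)
    for name, data in (("text1", text1), ("text2", text2)):
        # Write each text part as numbered pieces: text1.000, text1.001, ...
        for n, start in enumerate(range(0, len(data), CHUNK_BYTES)):
            with open(os.path.join(d, "%s.%03d" % (name, n)), "wb") as fh:
                fh.write(data[start:start + CHUNK_BYTES])

def load_text(digest, name):
    # Reassemble one text part by concatenating its pieces in order.
    d = digest_to_dir(digest)
    pieces = sorted(p for p in os.listdir(d) if p.startswith(name + "."))
    out = []
    for p in pieces:
        with open(os.path.join(d, p), "rb") as fh:
            out.append(fh.read())
    return b"".join(out)

Zero-padded piece numbers mean a plain lexical sort of the directory listing is enough to reassemble each text part in order.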