rjahrman has asked for the wisdom of the Perl Monks concerning the following question:
I need to store a whole lot of binary data: possibly 100GB or more, in 100,000+ files. I need to be able to search through one of the files like this: read 4 bytes and check for a match; if they match, take the next 4 bytes; if not, skip the next 4 bytes and repeat. I figure that I have two options to store this data:
1) Use Binary Files. I could have each set of binary data in its own file, and have all 100K of the files in one directory. I would open a file and read in the data n bytes at a time. The benefit is that I would only have to read as far into the file as I need to, but my filesystem would crap out with that many files in a directory. To get around this I could make 256 or so directories and put the files into different directories.
2) Use a MySQL Database. I could make a table with two columns. The first would be the identifier (what would otherwise be the filename), and the second would be the data that would have been in the file. The ID would be a primary key, so lookups would be fast. I know that MySQL can handle 100K rows. It also wouldn't waste the left-over space in each file's last filesystem block (e.g. every file's size being rounded up to the next 4KB). However, even if the script or program is on the same computer, wouldn't it be slowed down by needing to fetch and hold in RAM as much as a 100MB BLOB all at once, even though it would often only need the first kilobyte (or less!) of that?
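To make option 1 concrete, here is a minimal sketch of the flat-file approach, written in Python purely for illustration. It assumes each file is a sequence of 8-byte records (a 4-byte key followed by the 4 bytes to return), which is one reading of the search described above. The names `shard_path` and `find_value`, the CRC32-based choice of bucket, and the 512-records-per-read chunk size are all hypothetical choices, not anything prescribed by the question.

```python
import os
import zlib

RECORD_SIZE = 8  # assumed layout: 4-byte key followed by 4-byte value


def shard_path(root, file_id):
    """Map an identifier to root/<xx>/<file_id>, with xx one of 256 buckets."""
    # CRC32 is only used to spread names evenly; any stable hash would do.
    bucket = zlib.crc32(file_id.encode("utf-8")) & 0xFF
    return os.path.join(root, "%02x" % bucket, file_id)


def find_value(path, key, records_per_read=512):
    """Return the 4 bytes following the first record whose key matches.

    The file is read one chunk at a time and the scan stops at the first
    hit, so nothing past the matching chunk is read. Returns None if no
    record matches.
    """
    chunk_size = RECORD_SIZE * records_per_read
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            # Step through whole records only; a trailing partial record
            # at the end of the file is ignored.
            for offset in range(0, len(chunk) - RECORD_SIZE + 1, RECORD_SIZE):
                if chunk[offset:offset + 4] == key:
                    return chunk[offset + 4:offset + RECORD_SIZE]
            if len(chunk) < chunk_size:
                return None  # reached end of file without a match
```

Something like `find_value(shard_path("/data", "some_id"), b"\x00\x00\x02\xbc")` would then touch only the start of the file when the match is near the beginning, which is the benefit claimed for option 1. Whether 256 buckets is enough depends on the filesystem; with 100K files it works out to roughly 400 files per directory.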
Any suggestions? BTW, I may or may not do this in Perl. (Opinions on that? I assume C would be faster...) I just posted here because I couldn't think of a good MySQL forum with enough traffic to get an answer for this type of question.
Thanks in advance!
Replies are listed 'Best First'.
Re: Huge Table of BLOBs or Binary Flat-File Database?
by jfroebe (Parson) on Jun 14, 2004 at 04:03 UTC

Re: Huge Table of BLOBs or Binary Flat-File Database?
by graff (Chancellor) on Jun 14, 2004 at 05:07 UTC

Re: Huge Table of BLOBs or Binary Flat-File Database?
by jZed (Prior) on Jun 14, 2004 at 03:55 UTC