in reply to Re^2: Efficient way to handle huge number of records?
in thread Efficient way to handle huge number of records?

That is not true. SQLite is the fastest DB engine I have ever come across. You just need to increase the buffer size of the read input to, say, 4 MB and do the import within a transaction, and you will see that it can import the values above with no problem in under a minute, where MySQL will take much longer. And since it can keep everything in RAM, query time is again going to be much faster. So if you or anyone else is looking for a fast DB engine without the fancy-shmancy features that I rarely use anyway, SQLite is the way to go.
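For the record, here is a minimal sketch of that kind of bulk import with DBI and DBD::SQLite. The database file, table layout, tab-separated input on STDIN and the exact pragma values are illustrative assumptions, not anything from the thread; the point is simply a bigger page cache plus one transaction around all the inserts.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use DBI;

    # Illustrative database and table names.
    my $dbh = DBI->connect( "dbi:SQLite:dbname=records.db", "", "",
        { RaiseError => 1, AutoCommit => 1 } );

    # Roughly 4 MB of page cache (a negative value is interpreted in KiB)
    # and relaxed fsync behaviour to speed up the import.
    $dbh->do("PRAGMA cache_size = -4096");
    $dbh->do("PRAGMA synchronous = OFF");

    $dbh->do("CREATE TABLE IF NOT EXISTS seq (id TEXT, data TEXT)");

    my $sth = $dbh->prepare("INSERT INTO seq (id, data) VALUES (?, ?)");

    # One transaction around all the inserts, so the journal is written
    # once instead of once per row.
    $dbh->begin_work;
    while ( my $line = <STDIN> ) {
        chomp $line;
        my ( $id, $data ) = split /\t/, $line, 2;
        $sth->execute( $id, $data );
    }
    $dbh->commit;
    $dbh->disconnect;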

So if you need a DB engine that is fast and reliable and can deal with lots of data, you will want SQLite.

Now, as far as the initial question goes, you can do something similar to what MySQL does. You could split the file into chunks and index the chunks by line number, so that you know on which line the header of your sequence appears. Once you have done that, you only need to hash those indexes. This will reduce the search by a factor proportional to the number of fragments you have after chomping your initial file. A rough sketch of that indexing pass follows below.
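Something along these lines, assuming a FASTA-style file where each record starts with a '>' header line; the file name and the looked-up id are made up for illustration. Scan the file once, remember the byte offset and line number of each header in a hash, then seek straight to the record you want instead of rescanning the whole file.

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $file = 'sequences.fa';    # hypothetical input file
    open my $fh, '<', $file or die "Cannot open $file: $!";

    # First pass: record the byte offset and line number of every header,
    # keyed by sequence id.
    my %index;
    my $line_no = 0;
    while (1) {
        my $offset = tell $fh;
        my $line   = <$fh>;
        last unless defined $line;
        $line_no++;
        if ( $line =~ /^>(\S+)/ ) {
            $index{$1} = { offset => $offset, line => $line_no };
        }
    }

    # Later: jump directly to a wanted record.
    my $want = 'seq42';    # hypothetical sequence id
    if ( my $entry = $index{$want} ) {
        seek $fh, $entry->{offset}, 0;
        my $header = <$fh>;
        print "Found $want at line $entry->{line}: $header";
    }
    close $fh;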
