in reply to How to cut down the running time of my program?

I think the easiest solution would be to pull the master file into a database. This may sound daunting, but DBD::SQLite is very easy to install and use, and it creates a database file that is pretty much standalone. You don't have to install a heavyweight database server to use SQLite; it's self-contained.

Your initial conversion to the database might take a while. But it won't take 24 hours! Depending on your hardware and the total size of your file, I doubt it would exceed a few hours. And your queries will be MUCH quicker.
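Something along these lines is what I have in mind. It's only a sketch, and since I haven't seen your data I'm assuming a master file with one record per line in a "key<TAB>data" layout. The table name, column names, and the open_db, load_master, and lookup subs are made up for illustration.

```perl
use strict;
use warnings;
use DBI;

# Open (or create) the SQLite database file and make sure the table exists.
# The PRIMARY KEY on id is what gives you the fast lookups.
sub open_db {
    my ($dbfile) = @_;
    my $dbh = DBI->connect( "dbi:SQLite:dbname=$dbfile", '', '',
        { RaiseError => 1, AutoCommit => 1 } );
    $dbh->do('CREATE TABLE IF NOT EXISTS records (id TEXT PRIMARY KEY, data TEXT)');
    return $dbh;
}

# One-time conversion: read the master file and insert every record.
# Assumes each line looks like "key<TAB>data" (a guess at your format).
sub load_master {
    my ( $dbh, $file ) = @_;
    open my $in, '<', $file or die "Cannot open $file: $!";
    my $sth = $dbh->prepare(
        'INSERT OR REPLACE INTO records (id, data) VALUES (?, ?)');
    $dbh->begin_work;    # a single transaction keeps the bulk load fast
    while ( my $line = <$in> ) {
        chomp $line;
        my ( $id, $data ) = split /\t/, $line, 2;
        next unless defined $data;    # skip blank or malformed lines
        $sth->execute( $id, $data );
    }
    $dbh->commit;
    close $in;
    return;
}

# Fetch one record by key; returns undef if the key is not present.
sub lookup {
    my ( $dbh, $id ) = @_;
    my ($data) = $dbh->selectrow_array(
        'SELECT data FROM records WHERE id = ?', undef, $id );
    return $data;
}
```

You would call open_db and load_master once to do the conversion, and after that every lookup is a single indexed query instead of another pass over the master file.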

Your current solution is running in O(n^2) time, if I'm not mistaken. That's fairly inefficient. To speed things up, you need to know the starting point of each record in the file so that you can quickly jump to that record. The easiest way to do that is to let a database deal with the mechanics for you. Another solution would be to create an index file that could be pulled into a hash of keys and byte offsets so that you can quickly seek to the proper location in the master file. Or you could forgo the offsets if your master file uses a uniform record length, in which case your index hash only needs to map each key to its record number, since the offset is just the record number times the record length.
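If you did want to roll the offset index yourself, a rough sketch using only core Perl might look like this. Again I'm assuming one record per line with the key before the first tab, and build_index and fetch_record are just names I picked.

```perl
use strict;
use warnings;

# Make one pass over the master file and remember the byte offset
# at which each key's line starts. Assumes "key<TAB>data" lines.
sub build_index {
    my ($file) = @_;
    open my $in, '<', $file or die "Cannot open $file: $!";
    my %offset_for;
    my $pos = tell($in);    # offset of the line about to be read
    while ( my $line = <$in> ) {
        chomp $line;
        my ($key) = split /\t/, $line, 2;
        # Keep the first occurrence of each key; skip blank lines.
        $offset_for{$key} = $pos
            if defined $key && !exists $offset_for{$key};
        $pos = tell($in);
    }
    close $in;
    return \%offset_for;
}

# Jump straight to a record using the index instead of scanning.
# Returns the chomped line, or undef if the key is not indexed.
sub fetch_record {
    my ( $fh, $index, $key ) = @_;
    my $pos = $index->{$key};
    return unless defined $pos;
    seek( $fh, $pos, 0 ) or die "seek failed: $!";
    my $line = <$fh>;
    chomp $line;
    return $line;
}
```

Build the index once, keep the master file open, and each lookup becomes a hash access plus one seek. To avoid rebuilding the index on every run, the hash could be saved to disk with Storable's store and read back with retrieve.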

But those solutions are really just you implementing your own version of a database. Since that's already been done, you may as well take advantage of what's already available. SQLite is an ideal solution for lightweight database work.


Dave
