I think the easiest solution would be to pull the master file into a database. This may sound daunting, but DBD::SQLite is very easy to install and use, and it creates a single, self-contained database file. You don't have to install a heavyweight database server to use SQLite.

Your initial conversion to the database might take a while. But it won't take 24 hours! Depending on your hardware and the total size of your file, I doubt it would exceed a few hours. And your queries will be MUCH quicker.
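As a rough sketch of what that conversion and a later query might look like: I don't know your actual record format, so this assumes one record per line in the form "key<TAB>value". The table name (master), the column names (id, data), and the subs load_master and lookup are all made up for illustration. It uses only the standard DBI calls with DBD::SQLite as the driver.

```perl
use strict;
use warnings;
use DBI;

# ASSUMPTION: one record per line, "key<TAB>value". Adjust the split to
# match the real layout of your master file.
sub load_master {
    my ($master_path, $db_path) = @_;

    my $dbh = DBI->connect("dbi:SQLite:dbname=$db_path", "", "",
        { RaiseError => 1, AutoCommit => 1 });

    # The PRIMARY KEY gives SQLite an index on id, so lookups avoid a full scan.
    $dbh->do("CREATE TABLE IF NOT EXISTS master (id TEXT PRIMARY KEY, data TEXT)");

    open my $in, '<', $master_path or die "Cannot open $master_path: $!";

    # Wrap the whole load in one transaction; committing per row is much slower.
    $dbh->begin_work;
    my $ins = $dbh->prepare(
        "INSERT OR REPLACE INTO master (id, data) VALUES (?, ?)");
    while (my $line = <$in>) {
        chomp $line;
        my ($id, $data) = split /\t/, $line, 2;
        next unless defined $data;    # skip lines that don't fit the assumed layout
        $ins->execute($id, $data);
    }
    $dbh->commit;
    close $in;

    return $dbh;
}

# Fetch the data for a single key; returns undef if the key is absent.
sub lookup {
    my ($dbh, $id) = @_;
    my ($data) = $dbh->selectrow_array(
        "SELECT data FROM master WHERE id = ?", undef, $id);
    return $data;
}
```

You would run load_master once, then have your existing program call lookup for each key instead of rescanning the master file.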

Your current solution is running in O(n^2) time, if I'm not mistaken. That's fairly inefficient. To speed things up, you need to know where each record starts in the file so that you can jump straight to it. The easiest way to do that is to let a database deal with the mechanics for you. Another option is to create an index file that can be pulled into a hash mapping each key to its byte offset, so that you can seek directly to the proper location in the master file. Or you could forgo the offsets if your master file uses a uniform record length, in which case the hash only needs to map each key to its record number, and the offset is just the record number times the record length.
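If you did want to roll the index yourself, the offset idea might look something like the sketch below. Again, the "key<TAB>value" line layout is only an assumption, and build_index and fetch_record are hypothetical names. It relies on nothing beyond Perl's built-in tell and seek.

```perl
use strict;
use warnings;

# ASSUMPTION: one record per line, "key<TAB>value".
# Scan the master file once and remember the byte offset where each record starts.
sub build_index {
    my ($master_path) = @_;
    my %offset_for;

    open my $in, '<', $master_path or die "Cannot open $master_path: $!";
    while (1) {
        my $pos  = tell $in;          # offset of the line we are about to read
        my $line = <$in>;
        last unless defined $line;
        chomp $line;
        my ($id) = split /\t/, $line, 2;
        $offset_for{$id} = $pos;
    }
    close $in;

    return \%offset_for;
}

# Jump straight to a record using the index; returns undef for unknown keys.
sub fetch_record {
    my ($fh, $index, $id) = @_;
    my $pos = $index->{$id};
    return unless defined $pos;

    seek($fh, $pos, 0) or die "seek failed: $!";   # 0 = from start of file
    my $line = <$fh>;
    chomp $line;
    return $line;
}
```

Building the index is a single pass over the file, and each lookup after that is one hash access plus one seek. To avoid rebuilding it on every run you would also have to write the hash out to an index file and keep it in sync with the master file, which is exactly the bookkeeping a database does for you.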

But those solutions are really just you implementing your own version of a database. Since that's already been done, you may as well take advantage of what's already available. SQLite is an ideal solution for lightweight database work.


Dave


In reply to Re: How to cut down the running time of my program? by davido
in thread How to cut down the running time of my program? by kayj
