As others have said: binary search is the key to speed! I assume you know how binary search works. I don't think you need to pad the keys to get a fixed record length.

As you already have a sorted file, simply seek to the file's center by dividing its size in bytes by 2. Then read one line. This one is to be discarded, as it is almost certainly incomplete. Take the current file position as your corrected value for the file's center and read the next line to find out whether you hit the correct key, or whether you have to advance into the upper half or go back into the lower half.

Again, after every seek you will have to read one line and discard it.
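
A minimal Python sketch of a single probe step, just to make the idea concrete. The name `probe` and the sample data are made up for illustration, and the records are assumed to be newline-terminated:

```python
import io

def probe(f, pos):
    """Seek to byte offset pos, discard the (probably partial) line found
    there, and return (start_offset, line) for the next complete line.
    Returns (start_offset, None) if the discard ran into end of file."""
    f.seek(pos)
    f.readline()          # discard: most likely an incomplete line
    start = f.tell()      # corrected position: start of a whole line
    line = f.readline()
    if not line:
        return start, None
    return start, line.rstrip(b"\n")

# Hypothetical sorted data: a numeric key, a space, then the value.
data = b"1230 value AAA\n1235 value AA\n1240 value A\n"
f = io.BytesIO(data)      # stands in for a file opened in binary mode

start, line = probe(f, len(data) // 2)
print(start, line)
```

Note that a probe at offset 0 also throws away the first line, so the first record can never be found by probing alone.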

You might miss one line using this approach:

    1230 value AAA
    1235 value AA
    1240 value A

Imagine your lower bound is the 1230 line and your upper bound is the 1240 line. Seeking to the center of those two might land in the middle of the 1235 line, at "235 value AA". That partial line will be discarded, and advancing one line will hit your upper bound, so the 1235 line is never examined.

In such a case you will have to read all the lines from the lower to the upper bound to check for a hit.
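
Here is a rough Python sketch of the whole lookup, including that fallback scan. It is only one way to do it, under these assumptions: the file is opened in binary mode, every line starts with an integer key followed by whitespace, and the lines are sorted by that key. The name `lookup` is hypothetical.

```python
import io

def lookup(f, size, key):
    """Return the line whose leading integer key equals key, or None."""
    lo, hi = 0, size                  # a match must start in [lo, hi)
    while lo < hi:
        mid = (lo + hi) // 2
        f.seek(mid)
        f.readline()                  # discard the partial line
        start = f.tell()              # start of the next whole line
        line = f.readline()
        if start >= hi or not line:
            break                     # probe overshot: scan what is left
        k = int(line.split(None, 1)[0])
        if k == key:
            return line.rstrip(b"\n")
        if k < key:
            lo = f.tell()             # match starts after this line
        else:
            hi = start                # match starts before this line
    # Fallback: read every line from the lower to the upper bound.
    f.seek(lo)
    while f.tell() < hi:
        line = f.readline()
        if not line:
            break
        k = int(line.split(None, 1)[0])
        if k == key:
            return line.rstrip(b"\n")
        if k > key:
            break                     # sorted input: no match further on
    return None

data = b"1230 value AAA\n1235 value AA\n1240 value A\n"
print(lookup(io.BytesIO(data), len(data), 1230))
```

The final scan is normally only a few lines long, because it starts only once a probe can no longer land on a fresh line inside the bounds.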

This is just a rough idea; I've never tested it!



In reply to Re: fast lookups in files by Skeeve
in thread fast lookups in files by citromatik
