I think this is great. I'd add the following features...

When you create the index, let the user specify "only appends expected", and record the inode of the indexed file in the index. When the file is re-tied, if it has been modified more recently than the index file (alternatively, store a timestamp inside the index rather than relying on the file system's timestamp for the index file), then the index might need to be recreated...
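
To make the staleness test concrete, here is a rough sketch (in Python rather than Perl, just to keep it short). The function name and the idea that the index hands back a recorded inode and timestamp are my assumptions for illustration, not anything File::Index does today.

```python
import os

def index_is_stale(data_path, recorded_inode, index_timestamp):
    """Compare the data file against what the index recorded.

    recorded_inode and index_timestamp are hypothetical values that
    were stored in the index when it was last written.
    Returns (stale, same_inode).
    """
    st = os.stat(data_path)
    same_inode = (st.st_ino == recorded_inode)
    # A different inode means the file was replaced, not appended to.
    # A newer mtime means something was written since we indexed.
    stale = (not same_inode) or st.st_mtime > index_timestamp
    return stale, same_inode
```

If `stale` is true and `same_inode` is also true, the cheap "only appends" path is worth trying; if the inode changed, a full re-index is probably the only safe choice.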

When recreating the index, if "only appends expected" is true and the inode of the indexed file hasn't changed, then seek to the character before the Nth-to-last indexed line and verify that it is "\n". Repeat for each of the last N indexed lines. If all of those checks pass, then just scan the end of the file starting at the last indexed line and add index entries for any additional lines (and update the index's timestamp).
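
A minimal sketch of that spot-check-then-extend step, again in Python. I'm assuming the index is simply a list of byte offsets, one per line start, and that the file is opened in binary mode; the function names are made up for the example.

```python
def spot_check(fh, offsets, n):
    """Verify that each of the last n indexed lines is preceded by a newline."""
    for off in offsets[-n:]:
        if off == 0:
            continue  # the first line has no preceding character
        fh.seek(off - 1)
        if fh.read(1) != b"\n":
            return False  # also catches a file that is now too short
    return True

def extend_index(fh, offsets):
    """Scan from the last indexed line to EOF, adding offsets for new lines."""
    pos = offsets[-1] if offsets else 0
    fh.seek(pos)
    first = True
    for line in fh:
        # The line we start on is already indexed (unless the index is empty).
        if not (first and offsets):
            offsets.append(pos)
        first = False
        pos += len(line)
    return offsets

def refresh_append_only(path, offsets, n=3):
    """Extend the index if the spot check passes, else rebuild from scratch."""
    with open(path, "rb") as fh:
        if not spot_check(fh, offsets, n):
            offsets = []  # give up on the shortcut and re-index everything
        return extend_index(fh, offsets)
```

Re-scanning the last indexed line rather than starting after it also covers the case where that line had no trailing newline when it was indexed and has since been completed.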

When grabbing the Nth line, if N > 0, actually read starting from one character before the Nth line and then verify that the first character read is "\n". Also verify that the last character read is either "\n" or the last character of the file. If either of these tests fails, reindex the file from the beginning. Strip the leading "\n" before returning the line, of course.
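
Roughly, the verified fetch could look like the following Python sketch. It assumes the same list-of-byte-offsets index as above and a binary-mode file handle; `fetch_line` is a hypothetical name, and returning `None` is just my stand-in for "the caller should re-index".

```python
def fetch_line(fh, offsets, n):
    """Return line n (0-based) as bytes, or None if the index looks stale."""
    size = fh.seek(0, 2)          # seek to EOF; returns the file size
    start = offsets[n]
    if start >= size:
        return None               # file shrank since it was indexed
    end = offsets[n + 1] if n + 1 < len(offsets) else size
    lead = 1 if n > 0 else 0      # read one extra byte before the line
    fh.seek(start - lead)
    chunk = fh.read(end - start + lead)
    if len(chunk) != end - start + lead:
        return None               # short read: file is smaller than expected
    if lead and chunk[:1] != b"\n":
        return None               # line does not start where the index says
    if not chunk.endswith(b"\n") and end != size:
        return None               # line does not end where the index says
    return chunk[lead:]           # strip the leading newline
```

This only checks the two boundary characters, as described above, so an edit that happens to leave newlines at both positions would still slip through.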

You could also index the file lazily. That is, if the user asks for the Nth line, then only index up to the Nth line. Next week, the user could tie the file again, the module would note that nothing had changed, and then, when asked for the 2*Nth line, the module would index only the N+1st line through the 2*Nth line.
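
Here is a small Python sketch of the lazy approach, under the assumption that the index is an in-memory list of byte offsets plus a "resume scanning here" position. The class and method names are invented for the example, and saving the index between sessions is left out.

```python
class LazyLineIndex:
    """Index line offsets only as far as callers have asked for."""

    def __init__(self, fh):
        self.fh = fh          # binary-mode, seekable file handle
        self.offsets = []     # start offset of each line indexed so far
        self.next_pos = 0     # where scanning resumes next time

    def _index_up_to(self, n):
        """Extend the index until line n is covered or EOF is reached."""
        self.fh.seek(self.next_pos)
        while len(self.offsets) <= n:
            line = self.fh.readline()
            if not line:
                break         # ran out of file before reaching line n
            self.offsets.append(self.next_pos)
            self.next_pos += len(line)

    def line(self, n):
        """Return line n (0-based), indexing further only if needed."""
        self._index_up_to(n)
        if n >= len(self.offsets):
            raise IndexError(n)
        self.fh.seek(self.offsets[n])
        return self.fh.readline()
```

Asking for line 1 indexes two lines and stops; a later request for line 3 picks up from `next_pos` and indexes only the two lines in between.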

Yes, the checks for updates aren't bullet-proof and users need to be able to force a re-index manually, of course, but I think the logic I outlined could be very valuable.

Thanks,

- tye


In reply to Re: Proof of concept: File::Index (detect updates) by tye
in thread Proof of concept: File::Index by davido