Are the documents also being updated through a web interface that you control? I.e., can you trigger the rebuild from the update event itself rather than polling the last-updated time? If so, you could rebuild on each document update (with a mechanism to avoid running multiple rebuilds concurrently).
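As a rough illustration of the update-driven idea, here is a minimal sketch in Python (the same logic carries over to Perl). Everything in it is an assumption for illustration: the class name `RebuildOnUpdate`, the `document_updated` hook, and the use of an in-process `threading.Lock`. If your web server handles requests in separate processes you would need a cross-process lock (for example a lock file) instead.

```python
import threading

class RebuildOnUpdate:
    """Hypothetical helper: run one rebuild at a time, coalescing updates."""

    def __init__(self, rebuild):
        self._rebuild = rebuild          # caller-supplied rebuild function
        self._mutex = threading.Lock()   # protects the two flags below
        self._running = False            # a rebuild is currently in progress
        self._dirty = False              # an update arrived since the last rebuild began

    def document_updated(self):
        """Call from the update handler. Returns True if this call ran the rebuild."""
        with self._mutex:
            self._dirty = True
            if self._running:
                # Someone else is rebuilding; they will see the dirty flag and loop.
                return False
            self._running = True
        try:
            while True:
                with self._mutex:
                    if not self._dirty:
                        self._running = False
                        return True
                    self._dirty = False
                self._rebuild()
        except BaseException:
            # A failed rebuild must not lock out all future rebuilds.
            with self._mutex:
                self._running = False
            raise
```

An update that arrives while a rebuild is running does not start a second rebuild; it only marks the cache dirty, and the running rebuild goes around once more when it finishes.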
If you are forced to poll file timestamps to determine when to update, one possibility would be to use the first approach you mention above (updating on demand when a search is requested) with some modifications:
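1. Whenever you rebuild the cache, note the time at which the rebuild was initiated. After the rebuild completes successfully, store that start time as the cache's timestamp. Using the start time rather than the finish time means a file modified while the rebuild was running will still look newer than the cache.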
2. When a search occurs, you can compare the file timestamps to the cache rebuild start timestamp. If any file is newer than the cache rebuild start time, a rebuild is needed.
3. To avoid having to do step 2 very often, you can also record the last time you did step 2 and only do it again after some fixed interval has elapsed (a time-to-live).
4. You would need some mechanism to avoid running multiple cache rebuilds concurrently, but you also need a way to prevent that mechanism from locking out all future cache rebuilds if a cache rebuild failed part way through.
5. The user whose search triggered a cache rebuild could be given results from the old keyword cache, so that they don't have to wait for the rebuild to finish (if that is acceptable).
6. You might also need a mechanism for preventing step 2 from running multiple times concurrently.
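The steps above might be sketched roughly as follows, in Python (the same logic carries over to Perl). This is only a sketch under stated assumptions: the class `KeywordCache`, the file names `rebuild_started` and `rebuild.lock`, and the `ttl` and `lock_timeout` parameters are all made up for illustration, and the documents are assumed to sit in a single flat directory. The lock is a file created with `O_EXCL`; a lock older than `lock_timeout` is treated as left over from a failed rebuild and taken over (step 4).

```python
import os
import time

class KeywordCache:
    """Hypothetical on-demand rebuild driver for a keyword cache."""

    def __init__(self, doc_dir, state_dir, rebuild, ttl=60.0, lock_timeout=600.0):
        self.doc_dir = doc_dir
        self.rebuild = rebuild                                         # caller-supplied
        self.stamp_path = os.path.join(state_dir, "rebuild_started")   # step 1
        self.lock_path = os.path.join(state_dir, "rebuild.lock")       # step 4
        self.ttl = ttl                                                 # step 3
        self.lock_timeout = lock_timeout
        self.last_check = 0.0

    def _last_rebuild_start(self):
        try:
            with open(self.stamp_path) as fh:
                return float(fh.read())
        except (OSError, ValueError):
            return 0.0   # never rebuilt successfully

    def _is_stale(self):
        # Step 2: is any document newer than the last rebuild's start time?
        started = self._last_rebuild_start()
        return any(
            os.path.getmtime(os.path.join(self.doc_dir, name)) > started
            for name in os.listdir(self.doc_dir)
        )

    def _acquire_lock(self):
        flags = os.O_CREAT | os.O_EXCL | os.O_WRONLY
        for _ in range(2):
            try:
                os.close(os.open(self.lock_path, flags))
                return True
            except FileExistsError:
                try:
                    age = time.time() - os.path.getmtime(self.lock_path)
                    if age < self.lock_timeout:
                        return False           # a rebuild is (probably) running
                    os.remove(self.lock_path)  # stale lock from a failed rebuild
                except OSError:
                    return False
        return False

    def maybe_rebuild(self):
        """Call at search time. Returns True if this call rebuilt the cache."""
        now = time.time()
        if now - self.last_check < self.ttl:
            return False                       # step 3: checked recently
        self.last_check = now
        if not self._is_stale() or not self._acquire_lock():
            return False
        try:
            started = time.time()
            self.rebuild()
            with open(self.stamp_path, "w") as fh:
                fh.write(repr(started))        # step 1: commit only on success
        finally:
            os.remove(self.lock_path)
        return True
```

Here the rebuild runs synchronously inside the request that noticed the stale cache. To get the behaviour in step 5 you would hand `maybe_rebuild` to a background worker and answer the search from the old cache in the meantime. The stale-lock takeover is not airtight (two processes could both decide the lock is stale at nearly the same moment), so treat it as a starting point rather than a finished locking scheme.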
This would be much easier on an operating system that had a reliable task scheduler like cron.