in reply to Re^5: Randomization as a cache clearing mechanism
in thread Randomization as a cache clearing mechanism

I looked at memcached quite a while ago (seems like 2 years but probably hasn't been that long). My immediate impression was that they needed to get rid of the biggest race condition in their design.

I wrote to the authors and they noted that they had already gotten the same request (support a revision field in the data and prevent old data from overwriting new data). It appears that they still haven't managed to implement this simple and, IMO, important idea (see footnote 1).

Without that, I wouldn't use memcached for the more difficult PM caching problems, that is, things that get updated frequently such as the CB.

Of course, adding this to the open source project would probably not be that difficult a task. And memcached could still be useful for PM without it.

- tye        

1 Preventing old data from overwriting new data still allows for races, but it is much better and is the best PM will ever do.

memcached should also be fixed to support optimistic locking, which can be race-free. The cost is that any update might fail and then have to be presented to a supervisory function (usually the user) to figure out how to handle the failure. For PM, that would suck more than the limited race conditions that remain possible without optimistic locking.

To support optimistic locking, memcached would need to allow updates that say "here is the new data and it replaces revision X" and cause the update to fail if revision X isn't the current revision (and do updates atomically, which shouldn't be hard and I think is already the case based on my understanding of their design).
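A minimal sketch of that "replaces revision X" update, to make the failure mode concrete. This is a toy in-memory stand-in, not memcached's actual API: the names `RevisionedCache`, `replace_revision`, and `StaleRevision` are hypothetical, and integer revisions starting at 0 are an assumption made only for illustration.

```python
import threading


class StaleRevision(Exception):
    """Raised when an update names a revision that is no longer current."""


class RevisionedCache:
    """Toy in-memory stand-in for one cache server (hypothetical API)."""

    def __init__(self):
        self._lock = threading.Lock()   # makes each update atomic
        self._items = {}                # key -> (revision, value)

    def get(self, key):
        """Return (revision, value); a missing key reads as revision 0."""
        with self._lock:
            return self._items.get(key, (0, None))

    def replace_revision(self, key, expected_rev, value):
        """Store value only if expected_rev is still the current revision."""
        with self._lock:
            current_rev, _ = self._items.get(key, (0, None))
            if current_rev != expected_rev:
                raise StaleRevision(
                    f"{key!r}: expected {expected_rev}, current {current_rev}"
                )
            self._items[key] = (current_rev + 1, value)
            return current_rev + 1


cache = RevisionedCache()
rev, _ = cache.get("chatterbox")                          # both writers read revision 0
cache.replace_revision("chatterbox", rev, "writer A")     # A wins; revision is now 1
try:
    cache.replace_revision("chatterbox", rev, "writer B")  # B still claims to replace 0
except StaleRevision:
    outcome = "writer B must re-read and decide what to do"
```

The second writer's update fails immediately instead of silently clobbering the first, which is the whole point; what to do after the failure is left to the caller.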

But in any case, memcached is in serious need of a way to track revisions (such as a version number like PM's node cache uses, or a timestamp -- I'd just support any byte string where a simple 'cmp' determines which is newer, and possibly also support variable-length digit strings).
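A rough sketch of how such a revision check could look, assuming the revision is stored next to the value. Everything here (`GuardedCache`, `is_newer`, `is_newer_digits`) is hypothetical naming for illustration; the byte-wise comparison plays the role of Perl's `cmp`, and the digit-string variant compares by length first so that "10" counts as newer than "9".

```python
def is_newer(candidate: bytes, current: bytes) -> bool:
    """Plain byte-wise comparison, in the spirit of Perl's `cmp`."""
    return candidate > current


def is_newer_digits(candidate: bytes, current: bytes) -> bool:
    """Variable-length digit strings: compare by length, then byte-wise."""
    a, b = candidate.lstrip(b"0"), current.lstrip(b"0")
    return (len(a), a) > (len(b), b)


class GuardedCache:
    """Toy cache that refuses to let an older revision overwrite a newer one."""

    def __init__(self, newer=is_newer):
        self._newer = newer
        self._items = {}   # key -> (revision, value)

    def set(self, key, revision, value):
        """Store value unless the same or a newer revision is already held."""
        held = self._items.get(key)
        if held is not None and not self._newer(revision, held[0]):
            return False
        self._items[key] = (revision, value)
        return True

    def get(self, key):
        return self._items.get(key)


# Fixed-width timestamps sort correctly under a plain byte comparison.
by_time = GuardedCache()
stored_fresh = by_time.set("node:42", b"2004-11-21 09:53:00", "fresh")
stored_stale = by_time.set("node:42", b"2004-11-20 23:10:00", "stale")

# Version counters of varying length need the digit-string comparison.
by_version = GuardedCache(newer=is_newer_digits)
by_version.set("node:42", b"9", "v9")
stored_v10 = by_version.set("node:42", b"10", "v10")
```

A slow process that loaded old data from the database and only now gets around to caching it is simply turned away. This narrows the race without eliminating it, as the footnote says.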


Replies are listed 'Best First'.
Re^7: Randomization as a cache clearing mechanism (races)
by kappa (Chaplain) on Nov 21, 2004 at 09:53 UTC

    Hm. The memcached API has five principal commands: get, add, set, replace & delete. A distinct add, which fails if the key is already present in the cache, together with a deletion delay, helps prevent races (though not completely). I think the developers' intention is to avoid introducing locks or versions at all costs.
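A toy model of how add plus a deletion delay narrows the race. This is a sketch under assumptions, not the real client API: `AddOnlyCache` is a hypothetical name, and it assumes the delay simply makes add fail for a hold-off period after a delete, so a reader that loaded the old value just before the delete cannot re-add it.

```python
import time


class AddOnlyCache:
    """Toy model of add/set/delete with a deletion delay (hypothetical)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}     # key -> value
        self._blocked = {}   # key -> time until which add is refused

    def add(self, key, value):
        """Store only if the key is absent and not inside a deletion delay."""
        if key in self._items or self._clock() < self._blocked.get(key, 0.0):
            return False
        self._items[key] = value
        return True

    def set(self, key, value):
        """Store unconditionally (assumed to cancel any pending delay)."""
        self._items[key] = value
        self._blocked.pop(key, None)

    def delete(self, key, delay=0.0):
        """Remove the key and refuse add for `delay` seconds afterwards."""
        self._items.pop(key, None)
        self._blocked[key] = self._clock() + delay

    def get(self, key):
        return self._items.get(key)


now = [100.0]                                  # fake clock for the demo
cache = AddOnlyCache(clock=lambda: now[0])
first = cache.add("node:42", "old")            # succeeds, key was absent
second = cache.add("node:42", "other")         # fails, key is present
cache.delete("node:42", delay=5.0)             # writer invalidates after an update
during = cache.add("node:42", "stale reload")  # slow reader is turned away
now[0] += 6.0
after = cache.add("node:42", "fresh")          # delay expired, add works again
```

A reader slower than the delay can still slip stale data in, which is the "not completely" above.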

    And yes, I wouldn't use memcached in a banking environment; it seems to be a MySQL-type product -- speed ahead of reliability.

    Thanks for bringing it up. Doing things which will not do much harm to humanity in case of failure tends to shift priorities :)

      speed ahead of reliability

      In fact, if a cache were to handle concurrency and data integrity perfectly, it would be pretty durn close to being an RDBMS, which would be much beside the point.

        Supporting optimistic locking would be easy and fast, and wouldn't come close to being a database (which must support pessimistic locking, which is where all of the difficulty comes in).

        The changes I'm talking about just involve having certain updates immediately fail. Nearly trivial changes that fundamentally change how reliably memcached can be used.

        I believe it already routes all transactions for a single object to the same server, where they are handled in a single-threaded manner. So there is very little left to fix.

        - tye