in reply to Re^3: PerlMonks Caching (data)
in thread PerlMonks Caching

Ah, multiple within-node updates. So easy to spot...

But in that version, a whole transaction, a redirect, and a memcached operation must all squeeze in between the two racing steps. Of course it's still possible, but it's extremely improbable.

Yes, the lopsidedness of the number of operations that must race through vs the number of operations that constitute the "hole" can be a big help. And I expect that most systems would not manage to trigger the above race.

But I've seen several different ways in which parts of PerlMonks can get extremely sluggish. So I'm more concerned than most about some very sluggish process throwing out a very delayed update.

Could maybe be prevented by putting the read and creating the cache entry into a transaction with read lock?

Did you really propose that? :) Either the race is never going to happen (and the lock buys you nothing), or there are cases where an update takes many times longer than it should. You want to hold a read lock for an extended period? The solution I proposed boils down to a single strcmp() in the server.
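Roughly like this, as a minimal sketch (the struct and function names are illustrative, not memcached's actual API): the writer hands back the value it originally read, and the server applies the update only if the stored value still matches, so a stale, slow writer fails cheaply instead of clobbering a newer value.

```c
#include <stdlib.h>
#include <string.h>

struct cache_entry {
    char *value;            /* current cached value for some key */
};

/* The "single strcmp()": install new_value only if the entry still
 * holds what the writer originally read.  Returns 1 on success,
 * 0 if someone else updated the entry first. */
static int compare_and_set(struct cache_entry *e,
                           const char *expected, const char *new_value)
{
    if (strcmp(e->value, expected) != 0)
        return 0;           /* lost the race; drop the stale write */
    free(e->value);
    e->value = strdup(new_value);
    return 1;
}
```

No long-lived lock and no stalled updates: the delayed writer just loses.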

That reminds me of looking through some memcached release notes and seeing mention of fixes for concurrency issues in its use of threads. Part of the beauty of memcached was that it didn't have concurrency problems, because its design was so simple and elegant. Both sad and funny to see that get lost...

I would've let memcached make full use of multi-core CPUs without using any mutexes at all (for the core features). You just carry the existing design one layer further: inside the server process you have N completely independent data structures and N threads, and each thread only ever deals with its one particular data structure. You pick which thread (and thus which data structure) to hand a request off to exactly the way you pick a memcached server from a cluster: by hashing the key involved.

I'd actually push this abstraction out to the client, so that a client facing a cluster of 2 machines, each with 4 cores, behaves just as if it had a cluster of 8 servers, except that each group of 4 of those servers shares a single IP address/port.
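To make that concrete, here's a minimal sketch of the key-to-thread dispatch; hash_key(), pick_server(), NUM_HOSTS, and CORES_PER_HOST are assumptions of the sketch (with FNV-1a as a convenient stand-in hash), not memcached's actual code:

```c
#include <stdint.h>

#define NUM_HOSTS      2    /* machines in the cluster (assumed) */
#define CORES_PER_HOST 4    /* worker threads (data structures) per machine */

/* FNV-1a; any decent string hash would do here. */
static uint32_t hash_key(const char *key)
{
    uint32_t h = 2166136261u;
    for (; *key; key++) {
        h ^= (uint8_t)*key;
        h *= 16777619u;
    }
    return h;
}

struct virtual_server {
    int host;               /* which machine (IP address/port) */
    int thread;             /* which worker thread on that machine */
};

/* Treat hosts x cores as one flat list of "virtual servers"; every key
 * maps to exactly one thread's data structure, so no mutex is needed
 * around the data itself. */
static struct virtual_server pick_server(const char *key)
{
    uint32_t slot = hash_key(key) % (NUM_HOSTS * CORES_PER_HOST);
    struct virtual_server v = { (int)(slot / CORES_PER_HOST),
                                (int)(slot % CORES_PER_HOST) };
    return v;
}
```

Going from "pick a server" to "pick a server and a thread" is just widening the modulus.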

When a request comes in to the service, the main thread hashes the key and hands the connection off to the appropriate thread. Even the hand-off doesn't require a mutex, because you can use a circular queue with only a single thread enqueueing and a single thread dequeuing (one such queue per worker thread).
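Here's a minimal sketch of such a single-producer/single-consumer hand-off queue using C11 atomics; the names are mine, and the enqueued item (a file descriptor here) is just whatever handle represents the connection:

```c
#include <stdatomic.h>
#include <stdbool.h>

#define QUEUE_SIZE 256u     /* must be a power of two for the % to stay
                               consistent across index wrap-around */

struct spsc_queue {
    int fds[QUEUE_SIZE];            /* connections being handed off */
    _Atomic unsigned head;          /* written only by the consumer */
    _Atomic unsigned tail;          /* written only by the producer */
};

/* Producer (the accepting thread): returns false if the queue is full. */
static bool spsc_enqueue(struct spsc_queue *q, int fd)
{
    unsigned tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&q->head, memory_order_acquire);
    if (tail - head == QUEUE_SIZE)
        return false;               /* full */
    q->fds[tail % QUEUE_SIZE] = fd;
    /* release: the fd becomes visible before the new tail does */
    atomic_store_explicit(&q->tail, tail + 1u, memory_order_release);
    return true;
}

/* Consumer (the worker thread): returns false if the queue is empty. */
static bool spsc_dequeue(struct spsc_queue *q, int *fd)
{
    unsigned head = atomic_load_explicit(&q->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (head == tail)
        return false;               /* empty */
    *fd = q->fds[head % QUEUE_SIZE];
    atomic_store_explicit(&q->head, head + 1u, memory_order_release);
    return true;
}
```

Since only the producer ever writes tail and only the consumer ever writes head, the acquire/release pairs are all the coordination needed; there's no mutex anywhere.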

You completely avoid the usual progression of locking problems: forgetting to lock something (races that corrupt data); then putting a lock in place but making it so global that it greatly reduces concurrency; then splitting it into several more-localized locks until there are so many that it becomes non-trivial, and eventually nearly impossible, to totally grok the order in which locks might possibly be acquired; until you reach the ultimate stage: deadlocks.

I've seen that cycle way too many times. I wish more people would learn more from it. :)

- tye