in reply to Re: PerlMonks Caching
in thread PerlMonks Caching

I've asked several times where the problems are. It would be useful to run something like munin and look at statistics on which pages are retrieved most. That's better than making a wild guess.

Well, I wasn't guessing wildly. Page load numbers were even public information until they got broken (by a mysql update, I think -- I don't have those details swapped in at the moment).

And I've posted public nodes about the problems that have been identified, how many of them were fixed, and how they get rebroken. It is too bad that you seem completely unaware of them. *shrug*

Update: Sorry, I'm sure that sounds harsh. Let me echo Corion's best reply. Your help and support is sincerely appreciated! Very little in PerlMonks admin is instantaneous, far from it. I wouldn't rate memcached as the first priority but I'm also convinced it would be an improvement (except for the race condition which might hardly ever be noticed or might be quite an annoyance that other users of memcached don't run into).

- tye        

Replies are listed 'Best First'.
Re^3: PerlMonks Caching (data)
by tinita (Parson) on Apr 21, 2010 at 18:13 UTC
    except for the race condition which might hardly ever be noticed or might be quite an annoyance that other users of memcached don't run into
    I didn't get that race condition anyway. Especially the "flush to memcached".
    What I'm doing in my forum at the moment is storing a whole thread in memcached as a data structure. Expiry is set to 3 minutes (it could be longer; it's that short only because I'm also storing user info and signatures, so that changes to user info show up faster).
    Now, whenever somebody posts a new node or updates their node, the cache entry is explicitly expired immediately: database transaction, then delete the thread cache entry, then redirect the user to the updated thread or node. The next request for the thread reads from the database and creates the cache entry again.
    Your use of memcached seems a bit different, if I understood that correctly? So the wrong flush to memcached wouldn't happen in my case.
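The delete-on-write scheme described above can be sketched like this (a Python sketch; a plain dict with expiry times stands in for memcached, and all function and key names are hypothetical):

```python
# A sketch of the delete-on-write scheme above (assumed names throughout;
# a plain dict with expiry times stands in for memcached).
import time

cache = {}
EXPIRY = 3 * 60  # three minutes, as described

def cache_get(key):
    entry = cache.get(key)
    if entry is not None and entry[1] > time.time():
        return entry[0]
    return None

def cache_set(key, value, ttl=EXPIRY):
    cache[key] = (value, time.time() + ttl)

def cache_delete(key):
    cache.pop(key, None)

def fetch_thread(db, thread_id):
    """Read path: serve from cache, else build from the DB and cache it."""
    key = f"thread:{thread_id}"
    data = cache_get(key)
    if data is None:
        data = {"nodes": list(db[thread_id])}  # build the display structure
        cache_set(key, data)
    return data

def post_node(db, thread_id, text):
    """Write path: commit to the DB, then just delete the cache entry."""
    db[thread_id].append(text)            # the database transaction
    cache_delete(f"thread:{thread_id}")   # next read repopulates the cache
```

The write path never builds the display structure itself; only the read path does, which is the "create the cache again and don't care" part.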

    edit: yes, I think that's the point - in the second block of server X you are flushing N2 to memcached, but the current version in the database is N4. If the cache entry is created only when fetching a thread, the RC shouldn't be there.

    edit 2:
    or maybe I can still create an RC:

    X: reads thread with node N1
    X: votes N1 -> N2, delete cache entry
    X: redirect after POST and reads N2 from DB...
    Y: author updates node from N2 to N3
    Y: delete cache entry, if exists
    Y: redirect after POST and reads N3 from DB
    Y: cache N3 to memcached
    X: ... cache N2 to memcached
    But in that version X has to block across a whole transaction, a redirect, and a memcached operation. Of course, it's still possible, but extremely improbable. Could it maybe be prevented by putting the read and the creation of the cache entry into a transaction with a read lock?
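That interleaving can be replayed directly (a Python sketch; dicts stand in for the DB and memcached, and the sequencing is straight-line code rather than real concurrency):

```python
# Replaying the interleaving above as straight-line code (dicts stand in
# for the DB and memcached; no real concurrency, just the ordering).
db = {}
cache = {}

db["N"] = "N1"                 # X: reads thread with node N1
db["N"] = "N2"                 # X: votes N1 -> N2
cache.pop("N", None)           # X: delete cache entry
x_read = db["N"]               # X: redirect after POST, reads N2 from DB
                               # ... X blocks here ...
db["N"] = "N3"                 # Y: author updates node from N2 to N3
cache.pop("N", None)           # Y: delete cache entry, if exists
y_read = db["N"]               # Y: redirect after POST, reads N3 from DB
cache["N"] = y_read            # Y: cache N3 to memcached

cache["N"] = x_read            # X: ... cache N2 to memcached (stale!)

print(db["N"], cache["N"])     # prints "N3 N2": the cache now lags the DB
```

Until the entry expires or the next write deletes it, readers get N2 while the database holds N3.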
      I didn't get that race condition anyway.

      s/get/notice/

      Different? Yes. Lacks the race? No.

      Y: request arrives when Alice updates her node, N (having noticed a typo that makes her node appear extremely rude)
      Y: delete N from memcached

      X: request arrives when Bob downvotes Alice's node, N
      X: N found missing from memcached
      X: read version 1 of N from the DB, N1
      X: decrement 'reputation' field in N1, producing N2
      X: flush N2 changes to DB (update ... rep=rep-1 ...)
      X: re-read N from DB, yielding N2 again
      X: the slowness of this web server matters at this point

      Y: read version 1 of N from the DB, N1
      Y: apply update to node text, producing N3
      Y: flush N3 changes to DB (update ... doctext='...' ...)
      Y: redirect to display node as response to update
      Y: re-read N from DB, yielding N4 (includes both the text update and the reputation decrease)
      Y: flush N to memcached, storing N4

      X: flush N to memcached, storing N2 (Oops!)

      In this scenario, Alice sees her update applied while nobody else does. If Alice refreshes, her update mysteriously vanishes. With the 3-minute expiry, it then mysteriously reappears when the cache entry times out, rather than when the next update happens.
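The scenario can be replayed as straight-line code (a Python sketch; dicts stand in for the DB and memcached, 'text'/'rep' are made-up field names, and the N1..N4 labels match the steps above):

```python
# The race scenario as straight-line code. Dicts stand in for the DB and
# memcached; 'text' and 'rep' are made-up field names.
db = {"N": {"text": "v1", "rep": 10}}   # N1
cache = {}

cache.pop("N", None)                    # Y: delete N from memcached

# X: cache miss, so read N1 from the DB and apply the downvote
n1 = dict(db["N"])
db["N"]["rep"] = n1["rep"] - 1          # X: flush N2 (update ... rep=rep-1 ...)
x_reread = dict(db["N"])                # X: re-read N from DB, yielding N2
                                        # ... X is slow from here on ...

# Y: apply the text update; the SQL touches only doctext
db["N"]["text"] = "v1, typo fixed"      # Y: flush (update ... doctext='...')
y_reread = dict(db["N"])                # Y: re-read yields N4 (text + rep)
cache["N"] = y_reread                   # Y: store N4 in memcached

cache["N"] = x_reread                   # X: store N2 in memcached (Oops!)

# The DB holds N4, but the cache serves N2: old text with the new reputation.
```

The database ends up correct; only the cached copy is stale, which is exactly why the breakage is so hard to notice.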

      Other than the race condition (which should just be fixed), I don't see the point in expiring the cache so frequently. Let the LRU do its job.

      Your scheme seems more complicated (and less efficient) yet doesn't remove the race.

      - tye        

        Other than the race condition (which should just be fixed), I don't see the point in expiring the cache so frequently. Let the LRU do its job.
        But in your version you're also updating the cache when updating a node, so it's the same work actually.
        Less efficient? I don't think so. Reads happen much more often than writes.
        More complicated? I don't think so. If I update something, I simply delete the cache entry, let the next read create it again, and don't care. That's even less complicated, because on the read path I build the data structure exactly the way I want it. If I had to create the cache entry while updating, I'd have to do the same work, even though at that point I'm not reading the whole thread.
        Like I said, I'm caching the whole thread. When updating a node, the whole thread cache entry is deleted. The next read fetches the nodes from the db, puts them into the data structure I want, and then a) caches it and b) passes it to the template. If I had to create the whole thread data structure in the update handler, before the POST completes, it would seem more complicated to me.
        update: forgot the "pass to template"
        So we're talking about a race condition which might display outdated content in certain cases but doesn't break any data in the database.

        Btw., I think I like my version of using memcached more, probably because it acts like a cache: request the thread from the cache; in cache ? display : get from db. Actively writing all updated things to the cache is more like pushing.

        For example, I have a cache of the overview page (displays the newest node of each of the sub-boards), of the recent-24h threads (a list of threads updated in the last 24 hours), and of the threads themselves. When updating a thread, I delete just these three cache entries and I'm done. Following your strategy of recreating all three entries at update time seems like too much work to me, especially since you don't know whether all of them will actually be requested in the next few minutes. Creating them only when needed seems more natural to me.
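That multi-view invalidation is tiny in code (a sketch; the key names are hypothetical, and any dict-like cache client works the same way):

```python
# A sketch of invalidating the three derived views on update (key names
# are hypothetical; any dict-like cache client works the same way).
cache = {"board:overview": "...", "threads:24h": "...", "thread:42": "..."}

def invalidate_thread(cache, thread_id):
    # Delete every cached view the thread appears in; whichever views are
    # actually requested next get rebuilt then -- the rest never are.
    for key in ("board:overview", "threads:24h", f"thread:{thread_id}"):
        cache.pop(key, None)

invalidate_thread(cache, 42)   # cache is now empty; reads repopulate lazily
```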
        And about what to cache in general: my first candidate here on perlmonks would be the Newest Nodes page, and then the RAT page. For those pages the race condition matters even less, especially if the cache entry only lasts for three minutes.

      Ah, multiple within-node updates. So easy to spot...

      But that version must have a whole transaction and redirect and memcached operation in between that X blocks. of course, it's still possible, but extremely improbable.

      Yes, the lopsidedness of the number of operations that must race through vs the number of operations that constitute the "hole" can be a big help. And I expect that most systems would not manage to trigger the above race.

      But I've seen several different ways where parts of PerlMonks get extremely sluggish. So I'm more concerned than most about some very sluggish process throwing out a very delayed update.

      Could maybe be prevented by putting the read and creating the cache entry into a transaction with read lock?

      Did you really propose that? :) Either the race is never going to happen or there are cases when an update takes many times longer than it really should. You want to hold a read lock for an extended period? The solution I proposed boils down to a single strcmp() in the server.
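The exact server-side mechanism isn't spelled out here, but memcached's existing gets/cas (check-and-set) commands have that single-comparison shape: the server keeps a version token per value and refuses a store whose token no longer matches. A toy version (Python sketch; a dict stands in for the server):

```python
# A toy check-and-set in that spirit (memcached's real gets/cas commands
# work this way; here a dict stands in for the server).
store = {}   # key -> (value, version token)

def gets(key):
    return store.get(key)                 # (value, token) or None

def set_(key, value):
    _, ver = store.get(key, (None, 0))
    store[key] = (value, ver + 1)

def cas(key, value, token):
    cur = store.get(key)
    if cur is None or cur[1] != token:    # the single comparison
        return False                      # someone else stored in between
    store[key] = (value, cur[1] + 1)
    return True
```

A slow server whose token predates another server's store finds the token stale and its flush is simply refused, so the stale overwrite never lands, and no client ever holds a lock.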

      That reminds me of seeing notes about fixing concurrency issues in memcached's use of threads when I was looking through some release notes. Part of the beauty of memcached was that it didn't have concurrency problems because it had such a simple and elegant design. Both sad and funny to see that get lost...

      I would've allowed memcached to make full use of multi-core CPUs without using any mutexes at all (for the core features). You just continue the existing design one layer further. In this server process you have N sets of completely independent data structures and N threads. Each thread only ever deals with one particular data structure. You pick which thread (and thus which data structure) to pass off a request to exactly like how you pick a memcached server from a cluster: by hashing the key involved.

      I'd actually push this abstraction out to the client so that a client having a cluster of 2 machines, each with 4 cores, behaves just like it has a cluster of 8 servers, except that 4 of those servers actually share a single IP address / port.

      When a request comes in to the service, the main thread hashes the key and hands the connection off to the appropriate thread. Even the hand-off doesn't require a mutex because you can use a round-robin queue with only a single thread enqueueing and a single thread dequeueing.
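A rough sketch of that shape (Python; queue.Queue locks internally, unlike the single-producer/single-consumer ring just described, but the ownership structure is the same: each dict is only ever touched by the one thread that owns it):

```python
# Thread-per-shard sketch: each dict is owned by exactly one worker thread,
# chosen by hashing the key -- just like picking a server from a cluster.
import hashlib
import queue
import threading

N_SHARDS = 4  # e.g. one per core

def shard_of(key):
    # Pick a shard exactly like picking a memcached server: hash the key.
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % N_SHARDS

class Shard:
    """One worker thread owns one dict; no other thread touches it."""
    def __init__(self):
        self.data = {}
        self.inbox = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            op, key, value, reply = self.inbox.get()
            if op == "set":
                self.data[key] = value
                reply.put(True)
            else:  # "get"
                reply.put(self.data.get(key))

shards = [Shard() for _ in range(N_SHARDS)]

def request(op, key, value=None):
    # The hand-off: hash the key, enqueue to the owning thread, await reply.
    reply = queue.Queue(maxsize=1)
    shards[shard_of(key)].inbox.put((op, key, value, reply))
    return reply.get()
```

Because no data structure is ever shared between workers, there is nothing to lock around the cache operations themselves; only the hand-off needs care, and a true SPSC ring makes even that lock-free.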

      You completely avoid the problems of forgetting to lock something (leading to races that cause corruption), then putting a lock in place but it being too global of a lock and greatly reducing concurrency, then splitting the lock into several more-localized locks and eventually having so many locks that it becomes non-trivial and then nearly impossible to totally grok the order in which locks might possibly be acquired and you reach the ultimate stage of having deadlocks.

      I've seen that cycle way too many times. I wish more people would learn more from it. :)

      - tye