johnnywang has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I'm seeking advice on the following task: in the linux/apache/mod_perl setting, my app needs to get a single record from a rather large database table with a pretty complicated query, and update that record upon user's action, the requirement is that each request should get a different record.

My first version was very simple: just run the query each time, and return the first row. This has two problems: it's slow, and two request can get the same result back if accessed simultenously.

My second version built an cache (an array of records) in shared memory using IPC::Shareable, so each time I just shift the array, and query once when the cache is empty. This has been working fine for awhile. But I didn't like the shared memory solution, one reason is that my test script always run out of semaphores (ok in deployment since there were only a few apache processes), the other reason is that I'd like to have other apps, probably from a different machine, to have access to the same data.

So I'm working on my third solution: to create a separate TCP server, that serves as a global cache. I'm avoiding using SOAP-like mechanism, since that's not fast enough for me. I'd like a client (say the web app) to keep the TCP connection open. This seems to be working fine now, based on my limited test scripts.

I guess my question is to ask your thoughts/comments on the whole task/solution, and whether there already exists such a solution there (too late for that? I'm willing to throw my code anytime there is a better one!) Thanks.

Replies are listed 'Best First'.
Re: A database caching scheme
by perrin (Chancellor) on Nov 12, 2004 at 19:10 UTC
    It's not quite clear to me what your real task is here (if it truly requires a complex query to get, why are you able to cache it?), but I can tell you something about the caching methods you're using. IPC::Shareable is pretty slow, in addition to the semaphores issue you mentioned. For something accessible frlom a cluster of machines, your best options are either MySQL or Cache::Memcached. A simple query against MySQL (single primary key lookup) is extremely fast -- much faster than anything you are likely to write in your own TCP server without a lot of effort. I would suggest you start there, since it's simple to set up and try.
Re: A database caching scheme
by terra incognita (Pilgrim) on Nov 12, 2004 at 23:15 UTC
    This is not a Perl solution, however it may be a better solution for your situation. Have you thought about caching the recordset not in memory but in the DB itself using a temp table? This would allow you to share the recordset, reduce network traffic, as well as offload some of the logic to the DB.
      I would agree with terra. In fact, if you're using oracle, you can create session-level temporary tables that will expire when the session disconnects:
      CREATE GLOBAL TEMPORARY TABLE foo ( bar VARCHAR2(16), baz DATE ) ON COMMIT PRESERVE ROWS
      Unfortunatly, I don't know about other RDBMS's such as Postgres or MySQL. Hope that helps

      --
      Ben
Re: A database caching scheme
by fglock (Vicar) on Nov 12, 2004 at 19:46 UTC
      If you really want to spend a lot of time making something much slower than MySQL or memcached, this would work great. Otherwise, save yourself the trouble. Cache::Cache is not fast.
Re: A database caching scheme
by johnnywang (Priest) on Nov 15, 2004 at 08:11 UTC
    Thanks for the replies. I'm using MySQL. The query is only complicated in that the "where" clause can be on some columns that are difficult to index, and sorting is involved, so it takes sometime to return the results. But once a record is retrieved, there is a primary key, so update can be easy. The problem is that the particular table is growing bigger and bigger (tens of millions), all I really want is to say "give me the next record satisfying these conditions", so really a big query/sorting to return only one record. A temp table with much few results and sorted might be a good alternative.
Re: A database caching scheme
by dba (Monk) on Nov 14, 2004 at 22:49 UTC
    What is your database? oracle/MySQL?
    If you have to use a complicated query to retrieve data, how come you can update directly?
    As already mentioned, temp tables might be the way to go stored on the database.