hallikpapa has asked for the wisdom of the Perl Monks concerning the following question:

I was looking around CPAN for modules that would accomplish what I am looking for, but I need a bit of advice. I have a script that queries a MySQL database and returns the data to a web browser. Some of the queries can return thousands to tens of thousands of records. I would like to write this data to a temporary cache on the system. Beyond the result sets being so large, I would also like the user to be able to sort columns and type in a "filter box" to slim down the results, but to sort and filter against the cache instead of querying the DB again. I saw Cache::FileCache, but there wasn't much documentation, and I saw in other threads that it's very slow. I was hoping to find something well documented, ideally with small examples to help me understand better. Thank you for any suggestions on how to accomplish this task.

Re: Best way to cache large amount of data?
by dragonchild (Archbishop) on Oct 30, 2007 at 00:18 UTC
    Turn on MySQL's query cache. This is the kind of thing that databases are designed for. Especially if you're using proper LIMIT clauses and have your query properly indexed.
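
    A minimal, hypothetical DBI sketch of that paged approach (the connection details and records table are invented; LIMIT/OFFSET are formatted as validated integers because some DBD::mysql versions quote bound LIMIT values):

        #!/usr/bin/perl
        use strict;
        use warnings;
        use DBI;

        # Hypothetical connection details and table; adjust to your schema.
        my $dbh = DBI->connect(
            'DBI:mysql:database=mydb;host=localhost',
            'user', 'password',
            { RaiseError => 1 },
        );

        # Fetch one page at a time rather than the whole result set; an
        # index on the ORDER BY column keeps this fast.
        my $page_size = 500;
        my $page      = 0;    # zero-based page number from the UI

        my $sql = sprintf
            'SELECT id, name, created FROM records'
            . ' ORDER BY created DESC LIMIT %d OFFSET %d',
            $page_size, $page * $page_size;

        my $rows = $dbh->selectall_arrayref($sql, { Slice => {} });
        printf "fetched %d rows\n", scalar @$rows;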

    Alternately, if you're insistent on taking it out of the DB, take a look at DBM::Deep.
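
    If you go that route, here is a minimal sketch of stashing a result set on disk under a query ID (the file name and record layout are my own invention):

        use strict;
        use warnings;
        use DBM::Deep;

        # DBM::Deep gives transparent, tied access to hashes and arrays
        # stored in a single file on disk.
        my $cache = DBM::Deep->new('query_cache.db');

        my $query_id = 'abc123';    # e.g. a digest of the SQL plus binds
        $cache->{$query_id} = {
            fetched_at => time(),
            rows       => [         # the arrayref of hashrefs from DBI
                { id => 1, name => 'foo' },
                { id => 2, name => 'bar' },
            ],
        };

        # A later request (even another process) can read it back:
        my $cached = $cache->{$query_id};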


    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
      I read a bit about the query cache in MySQL, but these tables are constantly updated, anywhere from 10 to 16 million records a day, so if someone issued the same query again it might pull up older data instead of the most current. I will check out DBM::Deep. I suppose it could go either way: send more parameters to the database so it issues the same query every time, or use a Perl module.
        Well, if the intent is for a user to be able to sort the same data they just saw, then it makes sense for them to see the same data each time. Another possibility is to use something like DBM::Deep to pull the data you want to show, then build your view from that intermediate cache, as in the sketch below.
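
        A minimal sketch of that view-building step, assuming the cache layout above; the rows are copied out of the tied structure first, since repeatedly walking a DBM::Deep file is slow:

            use strict;
            use warnings;
            use DBM::Deep;

            my $cache = DBM::Deep->new('query_cache.db');
            my $rows  = $cache->{'abc123'}{rows} or die "no cached rows\n";

            # Copy into plain Perl hashes before sorting and filtering.
            my @rows = map { { %$_ } } @$rows;

            # Filter on the user's "filter box" text, then sort by the
            # column the user clicked.
            my $filter = qr/\Qfoo\E/i;    # user-typed substring
            my $col    = 'name';          # user-chosen column

            my @view = sort { $a->{$col} cmp $b->{$col} }
                       grep { defined $_->{$col} && $_->{$col} =~ $filter }
                       @rows;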

        My criteria for good software:
        1. Does it work?
        2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
Re: Best way to cache large amount of data?
by moritz (Cardinal) on Oct 30, 2007 at 07:51 UTC
Re: Best way to cache large amount of data?
by oha (Friar) on Oct 30, 2007 at 11:43 UTC
    I personally think that you do not need a global cache.

    This data is session data, and you should keep it in the session. Ten thousand records of about 200 bytes each is only 2 MB, so even if you are planning to serve hundreds of users at a time, this is a viable solution.

    If you fear that memory will grow too much (and you can't shorten the session timeout any further), then you can use Tie::Cache with a query ID as the hash key. Every time a user issues a request, the data is stored in this cache under a unique query ID, and that ID is saved in the user's session. If the user requests the same query ID again, you serve the cached result. A sketch follows below.
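
    A minimal sketch of that arrangement, using Tie::Cache's MaxCount/MaxBytes options; run_query() and $session are stand-ins for your own query code and session framework:

        use strict;
        use warnings;
        use Tie::Cache;
        use Digest::MD5 qw(md5_hex);

        # LRU cache: at most 100 result sets, roughly 50 MB total; the
        # least recently used entries are evicted automatically.
        tie my %result_cache, 'Tie::Cache',
            { MaxCount => 100, MaxBytes => 50_000_000 };

        my $session = {};    # stand-in for your real session object

        # Derive the query ID from the SQL plus its bind values.
        my $sql      = 'SELECT id, name FROM records WHERE owner = ?';
        my @binds    = ('hallikpapa');
        my $query_id = md5_hex(join "\0", $sql, @binds);

        # A miss runs the query; a hit reuses the stored rows.
        $result_cache{$query_id} ||= run_query($sql, @binds);
        $session->{query_id} = $query_id;

        sub run_query {      # stand-in for the real DBI call
            my ($sql, @binds) = @_;
            return [ { id => 1, name => 'example' } ];
        }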

    Oha