Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,

If I've a set of values that needs to be accessed frequently e.g. (total number of questions from various topics), would it be better to get these values from the database (using SELECT COUNT(*), assuming the questions are stored in a database) or would it be better to have these totals saved separately in Perl Storable and retrieved from the stored hashref?

In terms of efficiency and speed, which way is preferable?

Thanks in advance :)

Replies are listed 'Best First'.
Re: Database or Storable?
by GrandFather (Saint) on Mar 09, 2009 at 09:00 UTC

    Not enough information. How many items of data? Does the information need to persist between runs of the application? How often per second do you need to access the data? Do you access the items randomly or do you need it all at once? How often does the data change? Does the process reading the data also change the data. Is all the data changed at once or are random items changed?


    True laziness is hard work

      Thanks, and my apology for not offering sufficient details.

      The questions are stored in the database (MySQL). The total statistics need to be accessed every time the web page is accessed, because they are displayed e.g. (Maths -> 3000 questions, English -> 2050 questions, etc). The total statistics don't change frequently, only when new questions are added or when existing ones are deleted (but that doesn't happen so often). Speed of access is important - the faster, the better.

      I've saved the total statistics in Perl Storable. I'm guessing it's more efficient to get these values from the stored hash reference than querying the database. But I'm not exactly sure.

        The conventional wisdom is: profile it or bench mark it. Maybe you can do that with an off line version of the data and a test script?

        The gut feeling is: the database is probably faster if it's a simple query. Maybe it would be worth adding a table to cache the stats that are referenced often, although without knowing the current structure of your database and how the totals are obtained it's hard to tell.


        True laziness is hard work

        How much of the total dataset is required by any given access to the web page?

        If, for example, only the Math questions are required by any given load, consider splitting the Storable datset into 2 and only loading the required set once you know which is required. You'll need to do a directory lookup anyway, and if you can defer that lookup until you've acquired the information telling you which half you need, the page will load more quickly and you'll only need to load half the data when you do.

        If there is a natural split that reduces the amount of data further, try that also.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Database or Storable?
by Corion (Patriarch) on Mar 09, 2009 at 09:08 UTC

    In addition to GrandFathers good questions, also consider how the query requirements will change. If you envision only a limited set of queries it might be convenient to keep the data in a Perl data structure. The limited set would be restricted to simple counts all things, or hierarchical counts of things that fit within the hierarchy of your data structure.

    If you can think of nifty reports on things in your data, or don't even know what other things might be interesting to report on, SQL is far more expressive than Perl for writing reports, at least as long as your data fits into a flat data structure.

Re: Database or Storable?
by moritz (Cardinal) on Mar 09, 2009 at 09:08 UTC
    In general I wouldn't dump the counts to a different location, because it would have to be kept synchronized, which is error prone.

    If you have an index over the primary key of your table (it likely works for other columns as well) then a SELECT COUNT(*) will simply look into that index, so the DB already does the optimization you're looking for.

    I'd only revert to such means if I ran into performance problems that make them absolutely necessary.

Re: Database or Storable?
by zentara (Cardinal) on Mar 09, 2009 at 11:33 UTC
    There is a comprimise between database and Storable, which you may be overlooking. It's SQLite.....it runs as a standalone program (no db server needed), but allows sql like statements....which is a big plus over Storable.

    I'm not really a human, but I play one on earth My Petition to the Great Cosmic Conciousness
Re: Database or Storable?
by locked_user sundialsvc4 (Abbot) on Mar 09, 2009 at 13:35 UTC

    I did a couple of very nice web-apps recently where the notion of “storing Perl objects” worked out extremely well. The objects are instantiated early in the process, and they then remain “persistent” from one page to the next. At the successful completion of a task, the data in the objects is transcribed back into the underlying database.

    I find it to be definitely-true that real world apps often don't conform to “the CRUD scenario.” They aren't thinking in terms of the underlying database structures, whatever they might be, and furthermore they really don't want to be constrained to them or by them. And neither do you, as a programmer or a designer.

    Furthermore, people like for apps to have a clear and definite “Pinnochio point...” a single well-defined moment when what they have begun to do finally “becomes a Real Boy.” (Until, and unless, that moment actually arrives, nothing permanent has happened, and they know it.) They want to be able to approach that point however they like, using the browser's “Back” button with reckless abandon and knowing that it will always work out right in the end.

    Working with the idea of “storable, persistent objects” makes that sort of thing very easy to do. The temporal state of the transaction is represented in the stored object; the permanent state of the world is (eventually) reflected in the main database(s).

      The temporal state of the transaction is represented in the database, too. Just that it happens to be in the Session table, which you access via CGI::Session. Makes maintenance easier, too, IMO. However CGI::Session wants to store the data is its problem. (Well, mostly - you can choose the approach, which could be via a Storable blob.) But the key point is that you don't need to track a bunch of files, ensure they're cleaned up, etc. The session manager can do that for you.