leocharre has asked for the wisdom of the Perl Monks concerning the following question:

I often have a module that might serve as an interface to a record stored in a database. For example, a 'person' object, which might hold vCard data, some info about them in relation to a company, and so on.

So internally I have a load mechanism for the data from the db.

I store it in the object until a save mechanism is called to commit changes, if any. So essentially the data is sort of cached in the object.

This could be done instead as a direct call to the database. That is, if you call the 'person' object's 'name' method, instead of something like this (pseudocode):

sub name {
    my $self = shift;
    # fetch once, then reuse the cached copy (table/column names illustrative)
    $self->{name} ||= $self->dbh->selectrow_array(
        'SELECT name FROM person WHERE id = ?', undef, $self->{id});
    return $self->{name};
}
You would do something like..
sub name {
    my $self = shift;
    # query the database on every call; nothing is cached in the object
    return $self->dbh->selectrow_array(
        'SELECT name FROM person WHERE id = ?', undef, $self->{id});
}
So, in essence, we are making calls for everything, every time.

I would think the lag time would be apocalyptic, and the whole idea abusive to the db server.

Am I wrong? Does anybody have experience fetching and storing data directly via an API, without caching stuff in the object?

Seems that it could be more convenient to code this way. Maybe a little more risky?

Replies are listed 'Best First'.
Re: not caching to object in code interface to database?
by kyle (Abbot) on Jan 22, 2009 at 20:14 UTC

    One problem with a cache arises when two parts of the program have objects for the same record at the same time. They can fiddle with their local copies and each think they've made the changes they want to the database, but the last one to commit wins. Or maybe the record winds up in a strange state: one of them makes a change, and the other never sees it.
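
    A minimal sketch of that lost-update problem, using an in-memory SQLite database (the 'person' table and its columns are made up for illustration):

        use strict;
        use warnings;
        use DBI;

        my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
                               { RaiseError => 1 });
        $dbh->do('CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, email TEXT)');
        $dbh->do(q{INSERT INTO person VALUES (1, 'Leo', 'leo@example.com')});

        # Two parts of the program each load their own copy of record 1.
        my $copy_a = $dbh->selectrow_hashref('SELECT * FROM person WHERE id = 1');
        my $copy_b = $dbh->selectrow_hashref('SELECT * FROM person WHERE id = 1');

        # Each changes a different field in its cached copy...
        $copy_a->{name}  = 'Leonard';
        $copy_b->{email} = 'leonard@example.com';

        # ...and each writes its whole record back. The last save wins.
        for my $copy ($copy_a, $copy_b) {
            $dbh->do('UPDATE person SET name = ?, email = ? WHERE id = ?',
                     undef, @{$copy}{qw(name email id)});
        }

        # Copy A's name change has been silently overwritten:
        print join(', ', $dbh->selectrow_array(
            'SELECT name, email FROM person WHERE id = 1')), "\n";
        # prints: Leo, leonard@example.com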

    If you're planning to cache things at all, you might as well cache entire records at a time. When the object needs any field, get every field. That way you hit the database fewer times.
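
    For instance, a sketch of whole-record caching along those lines (the accessors and schema here are assumptions, not the OP's actual code):

        # Load the whole row on first access; serve every field from it.
        sub _record {
            my $self = shift;
            $self->{_record} ||= $self->dbh->selectrow_hashref(
                'SELECT * FROM person WHERE id = ?', undef, $self->{id});
            return $self->{_record};
        }

        sub name  { $_[0]->_record->{name} }
        sub email { $_[0]->_record->{email} }

    One query per object instead of one per field, and the field accessors stay trivial.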

    If you don't cache anything, the performance may indeed be really bad. It might be bad but acceptable—that's a personal preference. Have you tried it?
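
    One quick way to try it is to benchmark a cached accessor against one that queries every time. A rough sketch using an in-memory SQLite database (a networked server would likely widen the gap):

        use strict;
        use warnings;
        use DBI;
        use Benchmark qw(cmpthese);

        my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
                               { RaiseError => 1 });
        $dbh->do('CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)');
        $dbh->do(q{INSERT INTO person VALUES (1, 'Leo')});

        my %cache;
        cmpthese(-2, {
            # serve from the in-memory copy after the first call
            cached => sub {
                $cache{name} ||= $dbh->selectrow_array(
                    'SELECT name FROM person WHERE id = 1');
            },
            # round-trip to the database every time
            direct => sub {
                my $name = $dbh->selectrow_array(
                    'SELECT name FROM person WHERE id = 1');
            },
        });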

      Yeah... good point on multiple objects. I have it so they're singletons sometimes; they can be locked for commit, etc.

      I also have it as you mentioned: I fetch everything when a load is triggered and save it all at the same time (please excuse my pseudocode). Some things that are heavy I hold off on until they're actually called.

      No, I really haven't tried it... I'm curious though. I bet this is wildly different across various db daemons: for example, whether the server is local, whether you're using autocommit, and so on.

      It just felt creepy wrong to do it that way, but maybe the tech is up to it these days... hmm. What's really cool is that you actually wouldn't be changing the API at all (most of it), just the internals. So I could take one of my objects, rename it, and rework it to act that way. Hm, more to come about this sooner or later.

Re: not caching to object in code interface to database?
by perrin (Chancellor) on Jan 22, 2009 at 21:58 UTC

    First, I would encourage you to look at Rose::DB::Object, since this looks awfully familiar.

    The caching you're describing is a pretty common practice. It does have risks, in that the data in memory is not guaranteed to reflect the database unless you grab an exclusive lock on it (e.g. SELECT ... FOR UPDATE). There has been plenty of writing on the subject, since this is the way things have been done for decades. I think I once read something by Fowler about it.
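
    For reference, a sketch of that pessimistic-locking pattern with DBI (assuming a transactional engine such as InnoDB or PostgreSQL; the schema is illustrative):

        # Take an exclusive row lock for the life of the transaction.
        $dbh->begin_work;
        my $row = $dbh->selectrow_hashref(
            'SELECT * FROM person WHERE id = ? FOR UPDATE', undef, $id);

        # The in-memory copy is now safe to modify: other writers
        # block on this row until we commit.
        $row->{name} = 'Leonard';
        $dbh->do('UPDATE person SET name = ? WHERE id = ?',
                 undef, $row->{name}, $id);
        $dbh->commit;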

    The gist of it is that most programmers copy database state into memory for at least short periods of time in order to make it easier to deal with. Then they write it back out. Many millions of hours of effort have been put into making this process easier and/or hiding it from client code. You're on a similar track and you're not wrong to do it this way.