isync has asked for the wisdom of the Perl Monks concerning the following question:

Title says it all, looking fo hints/pointer/examples...
This is database-related, and off-topic. Still, as this easily eats into the code/Perl realm when being implemented, I hope I get away with it. So fellow Perlmonks, in case anyone has tackled this before:

My database is basically one table that holds entities. Then, another table holding attributes, each attribute having its own id and a field that links it to the parent entity, then 'attribute value' and so forth. EAV.

Now, coming from a "relational mindset", I've got problems wrapping my head around how to do "snapshots" of how a view (= a rendition of the entity with all its attributes displayed) of one entity looked at a certain point in time. In order to be able to rollback, to diff, to document (audit?) the history of an entry (an entity) in my db.

I think one way of implementing this is to add some tracking of revisions to the attribute table (to what revision an attribute belongs). Roughly, what I can think of is:

  1. Add a revision field to attributes table, start counting at rev. one. On SELECT, constraint to only attributes having a certain revision (latest/specific rev.); when new attributes are added, all current/unchanged attribs get copied with incremented rev number plus the new attribute. CONS: code overhead in implementation, bloat in the attributes db; or ...
  2. Add two revision fields, revision_start, revision_end. On SELECT, constraint to attributes that are a superset range of the desired revision. When new attributes are added, the revision_end field on attributes that remain attached to the entity is incremented. CONS: even describing it is verbose, code probably a nightmare. PRO: no bloat in attributes.


When I want to track who created/updated/deleted an attribute, I think I have to add yet another table, a journal, to match attribute revisions with users - right?

Probably I'm overseeing a very simple approach, so please, somebody shed some light.

(I've already found these ruby gems (ehem...) with many implementations of versioning of so called Active Records. Is there something to be learned there? Is one of these implementations a good example of how to do it? ...do it well?)
  • Comment on OT: Looking for good solutions to implement revision control with an EAV db model?

Replies are listed 'Best First'.
Re: OT: Looking for good solutions to implement revision control with an EAV db model?
by Corion (Patriarch) on Jun 24, 2014 at 06:57 UTC

    In my opinion, the second approach of having validity dates ("valid from", "valid to") is the better approach, because it allows you to easily time-limit changes. The queries for what is valid "now" are easily hidden in a view.

    As an alternative approach, consider only keeping the current version in one table and the journal of changes in a secondary table. Reconstructing the old situation then consists of reading the data from the current table and reverting all changes to it from the journal, which is hard to get right. But if your concern is mostly auditing and review/selected rollback of the changes, a journal is much easier to handle in my experience.

    You can set up such a journal either with database triggers (foolproof) or by copying the data in your program (less clutter in the journal, also allows for ephemeral information like "change reason", "change place in program" and "changing user", which the database doesn't necessarily know about.

      Thanks for pointing out triggers!

      So, ...a journal. (I hoped for a solution without it.) Actually, in one implementation, it's what I already use. Seems to be the easier method of computing things like a per-user "activity log".

      My challenge with such a journal so far has been: what's the best (actually working/usable) way to interlock eav-table and journal-table? (Suggestions welcome) As one usually has to track who did what, which attributes (attr id) are involved, what was attr name (in case attr name is user changeable), what was value, what are the new values. So rollbacks actually work. But I'm babbling... Probably that's why, apart from perfomance, many people hate EAV: for versioning, the traditional design offers the fuzzy feeling of having a rigid backlog of old version snapshots, ready to be unearthed at any time, diff'ed against any version... The table model is tempting. But then, there's the natural goodness of freely attributing stuff... I guess that's the ongoing struggle of NoSQL or things like BigTable where you can tweak and extend the column schema to the max. Arg! After babbling, now I'm even going into musing... anyway.

      Thank you, Corion!
Re: OT: Looking for good solutions to implement revision control with an EAV db model?
by erix (Prior) on Jun 24, 2014 at 07:35 UTC
      I know, I know...
      Let's regard your note as a "warning sign" for the casual bypasser, and us, so (we all) don't forget that EAV is a sharp tool, only to be handled by professionals, in very special, rare cases ;)