in reply to Re: Version Control in Database Applications
in thread Version Control in Database Applications

I think you may find that a composite key on two INTEGERs is faster than a single VARCHAR key (and smaller as well). Besides the composite key allows easier gathering of history for a single item. Compare:
SELECT * FROM tablename WHERE key = 12345 ORDER BY version;
and
SELECT * FROM tablename WHERE key LIKE '12345,%' ORDER BY key;
The first most likely will get to use an index on key (especially if you build the index with key first and version second) and the second is most likely a table scan. Here are the singleton selects for the current version
SELECT * FROM tablename o WHERE key = ? AND version = ( SELECT max(version) FROM tablename i WHERE i.key = o.key )
vs.
SELECT * FROM tablename WHERE key = ( SELECT max(key) FROM tablename WHERE key LIKE '$key,%' )

Replies are listed 'Best First'.
Re: Re: Re: Version Control in Database Applications
by samtregar (Abbot) on Jun 10, 2002 at 20:20 UTC
    Re: Speed. My guess is that given an indexed fixed-width CHAR field I should be able to get acceptable speed on lookups. All I need to do is decide on the maximum total number of objects and the maximum number of versions and I should be able to pick a field width. I'll be using MySQL and my memory is that MySQL indexes CHARs just as well as it does INTs.

    Re: Querying. I think you misunderstood (or I did). In my planned implementation the current version would always exist without an appeneded ",version". Thus, selecting the current version is just:

    SELECT * FROM foo WHERE id = "1024"

    No sub-select required! And selecting version N is simply:

    SELECT * FROM foo WHERE id = "1024,N"

    Now, I may still need an extra column to store the current version number but as long as it isn't needed on every select and every join I'll be happy.

    I think you're correct about collecting aggregate information across versions being hard. But I don't anticipate needing to do that very often, if at all.

    -sam

      But how do you make an old version current? Change it's primary key? Nasty. And if you don't do that, you'll have to know which version is live, and that means your queries won't be so simple becuase they'll need to look up which version is live first and won't be able to assume it's a simple key with no appended number. Sounds like a tangle to me. I would go with the two-part key.
        Yes, of course my system requires that I update the key to implement revert. For example, let's say there are 10 versions of foo #1024: (1024, "1024,1" .. "1025,9"). To revert to version 7 I do:

        DELETE FROM Foo WHERE id = "1024" OR id = 1024,9" OR id = "1024,8"; UPDATE Foo SET id = "1024" where id = "1024,7";

        What's wrong here? You say it's "nasty" but I don't see anything particularly bad about it.

        The thing is, reversion is an uncommon operation. It can be a little hairy and nothing bad will happen. There will only be one place in the code handling reversion. However, linking between objects in join tables and general querying will happen all over the place. If that requires two columns per table for every join I know we'll have problems.

        -sam