Corion has asked for the wisdom of the Perl Monks concerning the following question:

I have this great module, but I don't have a good name for it. Since the name should possibly indicate what the module does, I'll quickly describe what the module does and what use I see for it:

The module allows me to dynamically add (and maybe later remove) columns to a table in a database. It creates the correct SQL ALTER TABLE statements for me and if I want to, takes care of executing these statements as well.

In addition, it provides a convenient way to query that table with SQL::Abstract-like queries and a very ad-hoc reimplementation of the boring parts of Class::DBI because I didn't want to fight against the more interesting parts of it.

For the time being, it has been named Table::Denormalized, because that is what it represents to me.

Now, here's my use case: Let's say we want to store properties of files in a database, and we (currently) have three kinds of files, plain files, mp3 files and image files. We suspect there will be more kinds of files which will likely be recognized and parsed by some kind of plugin. So we want to have some kind of table which has a dynamic set of columns to hold the properties of the files. Not every file will have all properties though.

In a normalized database design, those "dynamic properties" would live in separate tables and the main table would hold all rows together by providing a unique ID for all other tables:

image_properties file mp3_properties id INTEGER --> id INTEGER <-- id INTEGER width INTEGER name VARCHAR(1024) playlength INTEGER height INTEGER size INTEGER artist VARCHAR(1 +024) mime_type VARCHAR(64)

This normalized setup would allow me to add new column sets by adding a new table, and it would also allow me (and the database) to just retrieve the needed data instead of fetching lots of data that is likely not used anyway.

The drawback to this normalized scheme is that queries and updates to it are not as simple as queries to a single table, as I have to join the other tables to the table. Such joined queries are slow(er), too.

With "Table::Denormalized", the above layout could be merged into one table conveniently (and transparently):

file_denormalized id INTEGER PRIMARY KEY -- file properties name VARCHAR(1024) size INTEGER mime_type VARCHAR(64) -- mp3 properties playlength INTEGER artist VARCHAR(1024) -- image properties width INTEGER height INTEGER

This layout makes querying for items trivial, but it comes with two drawbacks - if I mostly store image-rows in the database, I will waste lots of space for the mp3 properties. And more importantly, a file can only ever have a single artist (unless I really sell my soul and add a artist2 column).

As I said above, the module is currently named Table::Denormalized. I think it should be named DBIx::Table::Denormalized.

The "other" name I have been considering is DBIx::Table::Dynamic, which is more about what the module allows one to do, to dynamically add (and later maybe even remove) columns to a table.

Any input on this is welcomed.

Replies are listed 'Best First'.
Re: RFC: How to name it? DBIx::Table::Denormalized, DBIx::Table::Dynamic, ...
by g0n (Priest) on Sep 14, 2005 at 17:43 UTC
    FWIW, I think DBI::Table::Dynamic is pretty descriptive.

    --------------------------------------------------------------

    $perlquestion=~s/Can I/How do I/g;

Re: RFC: How to name it? DBIx::Table::Denormalized, DBIx::Table::Dynamic, ...
by jZed (Prior) on Sep 14, 2005 at 17:35 UTC
    What about naming it SQL::Abstract::Alter - isn't that what it does, provide a cross-DBMS ALTER capability to SQL::Abstract?

      No, it only accepts SQL::Abstract, and uses it, but it doesn't produce any SQL::Abstract stuff, and doesn't add the "ALTER" statement capability to SQL::Abstract. It alters the table in the DBI directly.

        Ok, but I think Alter should be in the name somewhere. "Dynamic" can mean many different things and "dynamically alter column structure" isn't the first that jumps to my mind. "Denormalized" may be one thing it does, but it sort of pre-judges how someone would use it.

        Hi,

        Which DBMS's have you tried this on? I think it is a good idea for development/test environments.

        Update:

        hmm... This feature appears to be a variant of a "VIEW". Basically your altered table would be this as a view:

        create view file_denormalized as select a.id, a.name, a.size, a.mimetype, b.playlength, b.artist, c.width, c.height from file a, mp3_properties b, image_properties c

        For DBMSs that don't have support for the view, I think your module would be wonderful :)

        Jason L. Froebe

        Team Sybase member

        No one has seen what you have seen, and until that happens, we're all going to think that you're nuts. - Jack O'Neil, Stargate SG-1

Re: RFC: How to name it? DBIx::Table::Denormalized, DBIx::Table::Dynamic, ...
by Juerd (Abbot) on Sep 15, 2005 at 11:41 UTC

    Why not simply create a table "properties", with three fields: id, key and value?

    File Property .id .id .name .file .size .key .type .value
    You can decide yourself which propertise are important enough for in the File table, but I'd just put almost all in Property, for ease of programming:
    File Property .id .id .name .file .type .key .size .value .parent File id name type size parent 1 ROOT DIR NULL NULL 42 foo FILE 123 1 Property id file key value 9 42 content_type image/x-xyzzy 10 42 width 1024 11 42 height 768 13 42 bpp 16
    This, with the glue to make it automatic in Perl (which you could even do with AUTOLOAD), seems much easier to me, and can work with non-mysql databases.

    Growing tables horizontally automatically is a crime, as much as having indexes in table or column names is.

    Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }

      I went that way already, and while the flexibility is really nice, the performance and programmatic handling are both abysmal. Also, implementing EAV (entity, attribute, value) in a relational database seems somehow wrong :) I also thought of mixing the two approaches, but that can happen later on.

      The code is currently only geared towards SQLite and not MySQL (which I mostly avoid), but I don't know of problems with other databases.

        I'm using alike setups in several places, without abysmal performance. I do wonder what we do differently. This is the classic approach, used by many. Did you perhaps forget the index on Property.file? That's kind of important to have.

        Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }