in reply to Re: Re: Well Designed RDBMS
in thread Well Designed RDBMS

A well-designed relational database should be normalized, and most of these examples are not because they mix different types of information in the same table. You also are forcing all attributes to be coerced to strings. Here is another approach (supposing Color and Size are attributes some blobs have and not others). You can add as many attributes as you like by adding tables.
Blob blob_id INT (primary key) blob_data BLOB Color blob_id INT (foreign key to Blob) color_value COLOR Size blob_id INT (foreign key to Blob) size_value FLOAT
Searching on attributes (especially single attribute values) should actually be more efficient this way. There's a single type of index and an easy JOIN operation (which you should let the DBMS do for you instead of making two queries).

Replies are listed 'Best First'.
Re: Re^3: Well Designed RDBMS
by nothingmuch (Priest) on Apr 07, 2003 at 19:14 UTC
    I'm a wee bit scared of adding a gazzillion tables, as attributes may pushing tens in their types, but i guess this is similar to creating a key value pair of attributes, at least in how different it is from creating a blob object.

    Right?

    -nuffin
    zz zZ Z Z #!perl
      tall_man's warning about the risk of spelling errors in attribute names is worth heeding. If your potential range of attributes is in the tens (or dozens), you could have a single attribute table keyed by blob_id, with a separate column (suitably named) for each possible attribute.

      If you need to add more attributes as time goes by -- and if you don't mind the slight extra effort needed each time you add one -- you can alter your attribute table at any time by just adding columns.

      This way, you have just two tables, and doing a join-type query is really simple: "select blob_table.data from blob_table, blob_attrib where blob_table.id = blob_attrib.id and blob_attrib.color = 'blue' [and blob_attrib.flavor = 'grape' and ...]" ...whatever.

      Yes, I would say that it's closer to your first approach (attributes in a key/value table) than to your second (attributes in a blob you have to decode yourself).

      Another couple of reasons that completely separate tables might be better: there is no redundant attribute-key field to store, and you avoid attribute-key spelling errors (like "color" vs. "colour"). If the attributes are entered through a user interface, how are you going to check for valid attribute names?