in reply to OT: benefits of database normalization

Howdy!

In the first design, the meaning of the "data" column depends on a non-key column. This would then appear to be a violation of 3NF. The data column has a dependency on the type column, and "type" is not even part of the primary key.

This makes it a lot more work to ensure the integrity of the data. What is there to prevent you from changing the value of the type column in a way that makes the data column meaningless?

Your description of the columns makes it clear that the scheme proposed by Anonymous Monk in Re: OT: benefits of database normalization as "design 3" more accurately captures the data model. In that model, the data type on the column will (in most DMBSs) suffice to keep the data mostly right, with simple referential integrity constraints covering the rest of the bases.

yours,
Michael
  • Comment on Re: OT: benefits of database normalization

Replies are listed 'Best First'.
Re^2: OT: benefits of database normalization
by revdiablo (Prior) on Oct 04, 2004 at 16:41 UTC
    the meaning of the "data" column depends on a non-key column. This would then appear to be a violation of 3NF

    Ah, thank you for spelling that out. This is something I was still curious about. I have read the formal normalization rules many times, but actually seeing which applies where is still difficult for me. Seeing a problem in action, then knowing what it violates, is very helpful.

    "design 3" more accurately captures the data model

    Indeed, I like Design 3 more than my own Design 2. I feel kind of silly not thinking of it right away, but to my own defense, Design 2 was an evolution of what I'll now call Design 0. In Design 0, there were no aliases, only facts. When the idea of aliases came up, I proposed adding another table, and someone else proposed adding another column to the existing table. Thus, this node was born. :-)

      Howdy!

      Indeed, I like Design 3 more than my own Design 2. I feel kind of silly not thinking of it right away, but to my own defense, Design 2 was an evolution of what I'll now call Design 0. In Design 0, there were no aliases, only facts. When the idea of aliases came up, I proposed adding another table, and someone else proposed adding another column to the existing table. Thus, this node was born. :-)

      This makes Design 3 even more clearly the Right Way To Go.

      Adding the elided step, having more than one name for the same fact could be done by simply adding multiple rows with the same fact, but a different name. IIRC, that could be a violation of 2NF, so you extract the "fact" column to a separate table and replace the "complicated" value with a simple key and rely on foreign key constraints to keep things honest. If the "fact" changes, you change one datum in one place to accomplish that. If you want to delete a "fact", either you cascade the delete to kill all the names that refer to it, or you require that the names be nuked first. Adding/changing/deleting names for a given fact is trivial.

      Once the relationship between "name" and "fact" goes beyond one-to-one, you have to factor out the "many" side into its own table that refers to the "one" side.

      ...now if the rules change so that a name could refer to more than one fact, you have to make a third table that relates names to facts...but that is probably absurd for the situation at hand.

      yours,
      Michael