in reply to OT: benefits of database normalization

I agree with you on Design 1 that, while having the context of 'fact' change based on 'type' may work, I think it is going to cause more problems down the road than Design 2. If you are going to do validation on 'fact', you need to check what the 'type' is. If you do any aggregates/statistics on 'fact', you need to make sure you build your query so that type #2 is included properly. It can be done, but it seems like more work to me.

Personally, I would work Design 2 more along the lines of:

Design 3 fact_id fact ------- ---------------------- 1 foo is a fooity foo 2 bar is a barrish baaar name_id name fact_id ------- ---- ------- 1 foo 1 2 bar 2 3 foog 1
It may just be a personal preference, but I like to keep all my names in one place, and all my facts in one place. I'm interested whether you had any reasons to use Design 2 rather than Design 3?

Replies are listed 'Best First'.
Re^2: OT: benefits of database normalization
by Joost (Canon) on Oct 03, 2004 at 10:40 UTC

      Nothing prevents you from adding a "Type" column to the second table to distinguish the two, if you need to. That way you may make the distinction if you want or treat them the same if you don't. Let's see the simplest query "find the description for this name, I don't know whether its a base name or an alias".

      First design:

      SELECT fact FROM Facts WHERE name = ? and Type = 1 UNION SELECT BaseFacts.fact FROM Facts as BaseFacts JOIN Facts as AliasFacts ON BaseFacts.fact_id = Convert(int,AliasFac +ts.data) WHERE AliasFacts.name = ? and AliasFacts.Type = 2 and BaseFacts.Type = 1 -- and this all works only if you do not ever have an alias to an alia +s!
      Second design:
      SELECT fact FROM Facts WHERE name = ? UNION SELECT fact FROM Facts JOIN Aliases ON Aliases.fact_id = Facts.fact_id WHERE Aliases.name = ?
      Third design:
      SELECT fact FROM Facts JOIN Names ON Names.Fact_id = Facts.fact_id WHERE Names.Name = ?
      Which one do you like best? ;-)

      Jenda
      We'd like to help you learn to help yourself
      Look around you, all you see are sympathetic eyes
      Stroll around the grounds until you feel at home
         -- P. Simon in Mrs. Robinson

      Actually, I wouldn't put that information in the second table, but in the first:

      Design 3a
      
      fact_id fact                   orig_name_id
      ------- ---------------------- ------------
      1       foo is a fooity foo    1
      2       bar is a barrish baaar 2
      
      name_id name fact_id
      ------- ---- -------
      1       foo  1
      2       bar  2
      3       foog 1
      

      If you then say orig_name_id must be unique and not null and must reference an existing name_id, the DBMS will automatically check data integrity for you instead of you having to code an extra function to make sure you don't have too few or too many name_ids that claim to be the original.

      That is one of the main reasons you would want to normalize your data structure: To be able to utilize the DBMS's built-in functions to ensure data integrity.

      I would have made (fact_id, name_id) into a two-column primary key:
      fact_id fact ------- ---------------------- 1 foo is a fooity foo 2 bar is a barrish baaar fact_id name_id name ------- ------- ------- 1 1 foo 1 2 foog 2 1 bar
      Then you could always find the primary name out by querying WHERE fact_id = ? AND name_id = 1.
Re^2: OT: benefits of database normalization
by revdiablo (Prior) on Oct 03, 2004 at 20:05 UTC

    Ah, I like this design even better. It works similarly to many filesystems. The fact is like an inode, and the name is like a link. Which is actually sort of interesting, because the person I was originally discussing with kept bringing up filesystem analogies. Neither he nor I ever took the analogy to its logical conclusion, though. Many thanks for the reply!