Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:
Guys and gals, I am working on a Perl/MySql project at the moment, and there are some database design consideration/heated argument going on. I would like to hear from monks, and also it would be great if monks can point me to some "more authoritative/official" sources on those particular (database design) issues - you know how some time you need those things to convince people:
Issue 1: I have a table that stores all different types of items (ship case, supplier case, pallet etc.) This is one common table. Besides the fact that different types of items share common attributes/columns, obviously some columns are only applicable to particular types. The current design emphasizes the commonality, and if a particular column does not apply to a particular row, that column is simply set to NULL for that row.
Now some people on the team want to split this common table into several tables, and one for each type of item. To me that is absolutely unnecessary. Use a more common analogy: In a company there are managers and other employees, those two types of people share some common attributes, but there are also attributes that apply to one type but not the other. When we deign database, is there a need to use two tables - a MANAGER table and a NON_MANAGER table, or simply use one - EMPLOYEE table. Obviously some columns will be NULL from time to time (that's why we have NULL, otherwise every time one sees a NULL, a new table should be created to avoid NULL). Some people say that breaks NF, but after seriously checked all 5 NF, the one table solution breaks absolutely none NF, although it might waste a bit of space.
Issue 2. Generalization. Something slightly different from above. Say we have a person table (T1), a project table (T2), and bunch of other tables for different entities/things: T3, T4... Tn. We have the need to attach documents to those things - being a person, or a project, or whatever. Say all attachment are store in a table called T'. Now we also need bunch of relational tables to link attachments to entities - the link between person and his attachment/document will be stored in table T1_T', and the link between a project and its documents will be stored in table T2_T', this goes on and on.
I am trying to simplify the data model, and would like to create a higher-level table called ENTITY, which stores nothing but all the identifiers: all employee ID, all project ID, this ID, that ID... Obviously person table will have foreign key referencing ENTITY table, and project table will have the same foreign key, as well as all other tables. Now the link to attachment is really simple, instead of having n relational tables, I only need one: the one between ENTITY table and attachment table.
I believe my design is correct in both cases, and I also believe those are common issues to lots of monks. But other than knowing what is right, I also need to show evidence that what I am doing is common practice and there are theoretical papers/books/TERMINOLOGIES/web sites behind, otherwise the discussion can go on for quite a while...
Thanks in advance. Many thanks!
|
|---|