in reply to Re^3: They do respect set theory! (and violate 3NF: yeah GIGO)
in thread (OT) Why SQL Sucks (with a little Perl to fix it)
Hi Ovid, I'm sorry maybe I did get a bit irked here. It seemed to me that you were saying that by adding the identity column you violate normalization and because of that the point that calenda made was wrong. Which is something that I disagree with pretty strongly as I think calenda had made a crucial point. I had read some of your other replies in the thread and you seemed to use this same argument to dismiss several points made elsewhere and altogether I guess it probably rubbed me the wrong way. I apologise for being so crusty, i should know not to reply to nodes before i finish my morning coffee. (Yes its an old excuse, but no less valid because of its age :-)
But regarding data integrity, actually I do think the argument you put forth is unsound. Data integrity is maintained when you have can unambiguously reference a piece of information. When you put an identity column on a table that makes each row in the table unique, thus ensuring entity integrity, which in turn alows you to enforce referential integrity. Just because a table is denormalized doesnt mean its integrity is violated. Denormalization implies that maintaining integrity will be harder work, but it doesnt mean its impossible.
BTW, i can imagine valid scenarios where a table could contain two field duplicated like your Paris,France example where data integrity wouldn't be an issue at all. For instance a table holding localized strings might inadvertantly contain two duplicate strings for two or more languages. For instance German/Austrian/Swiss German and even Dutch, Yiddish Afrikaans, English and several others could share common words for various things such as placenames but still have enough different that you wouldn't bother seperating out the dupes. Likewise you might see semantic duplicates like Paris and PARIS in the same table for similar reasons. None of this stuff violates integrity.
I found a good link on data integrity, unfortunately the formatting on the page is pretty crap, but imo it is a good comprehensive list of the attributes of data integrity. The engine its talking about is Sql Server 7, but it shouldnt matter much for this discussion. Note the key points about entitiy, referential and domain integrity: Entity integrity ensures that each row in the table is uniquely identified, Referential integrity is concerned with keeping the relationships between tables synchronized, and Domain integrity requires that a set of data values fall within a specific range (domain) in order to be valid, with user defined integrity being the last. Only user integrity can be violated by two unambiguous rows having attributes in common, and that type of integrity is outside the range of relational logic.
|
|---|