in reply to Flexible Database Updates

I've just been looking at this from the static code generation side.

Consider if your database may have arrays. This can
double the complexity of building your statements.

Replies are listed 'Best First'.
Re: Re: Flexible Database Updates
by grep (Monsignor) on Sep 08, 2002 at 20:31 UTC
    Arrays or extents (or what ever your RDMS wants to call them) are bad from a database design standpoint and should not be used, unless you have one helluva good reason (I personally have never had a good reason to use them, ever).

    Extents break 1NF and are the equivalent of either:

    1. Table CD: cd_name, artist, song1, song2, song3,...

    2. comma delimited strings in fields

    This makes for bad code - code which has to cover for bad database design.

    So it is not a question of 'This can double the complexity of building your statements'. Array/extents will add unneeded complexity of your code, period.

    An article on Database Normalization should explain why this is bad



    grep
    Mynd you, mønk bites Kan be pretti nasti...
      I think grep & I agree that using the array mechanisms
      of RDMSes is usually a bad thing.
      I disagree that "Array/extents will add unneeded complexity"
      to your code period.

      I do believe that fixed size arrays can be and often are
      best put in a single record. Some examples are so often
      done that they are invisible.

      First, middle, and last- names are an example that could be

      create table name ( id keytype, -- constrained to parent seq integer, -- place in name set name text );

      but who wants to deal with that?

      The usage of address_line_1 and address_line_2 can be
      viewed as an array embedded in a record.
      Financial applications show arrays for the 12 or 13 months
      of the year.
      I used a pair of arrays to describe a set of items. There
      could be upto 5 dimensions & 5 dimension_names.
      All of these belonged in one record.

      That the fullname or address should be one field is a
      excellent alternative, but not always do-able.
      Printing envelopes with a single address string may
      be problematic.

      Using a single fullname may induce the use of a nickname,
      possibly recreating the initial issue.

      The solution that I am currently implementing/trying is
      to use arrays on the Perl side and separate fields on the
      SQL side. A naming convention maps $o->{elem}[0] to elem_0,
      and $o->{address1} to address1.

        I do believe that fixed size arrays can be and often are best put in a single record. Some examples are so often done that they are invisible.

        This violates 1NF (Database Normalization First Normal Form) a field like that should be broken into another table by a relationship

        eg.

        CD |Tracks ------------------------------------------ 1984 |Jamie's Crying,Jump,... Diver Down |Little Guitars,Pretty Woman,... Body Count |Evil Dick,Cop Killer,...

        This should end up being broken up into a CD table and a tracks table. Imagine a world where 'Cop Killer' was removed from the CD. Well now you have to retrieve that record parse it in to seperate entries remove the song then string is back together then do your update. As opposed to finding the song in the tracks table and removing it with one DELETE statement. You have now taken a one line operation and made it 2 SQL statements, a split, a removal and a join. You now have at least 5 lines of code to make a mistake in instead of 1.

        So your solution is 5X's the amount code that you could've written, not too invisible to me (nor is it very Lazy).

        First, middle, and last- names are an example that could be

        create table name ( id keytype, -- constrained to parent seq integer, -- place in name set name text );

        This is not a good example. The 'name' field should follow your business rules. If you business rules state implicitly or imply that name should be broken into 'First','Last', and 'Middle' then your data should reflect that. If your business rule don't state or imply that then you should not break them up. Either way you have a distinct set of data you need to deal with.

        Any time you split data out of a field, you fall into the trap of having your code do what your data model should support

        These are just some of the reasons for database normalization, there are many others (including preventing update anomalies). If you follow the rules database normalization for your schemas, I promise you will write less buggy code



        grep
        Mynd you, mønk bites Kan be pretti nasti...