in reply to (OT) Why SQL Sucks (with a little Perl to fix it)

Congratulations. You have just come to a design decision in Microsoft Access that bit me hard once.

Access puts DISTINCT on all queries by default. I learned this the hard way as a very junior programmer when I was trying to figure out the following. There was a table of receipts. The table was queried to find the sum of the receipts. Then it was queried to pull back information about the receipts to dump into Excel. As a first sanity check, the Excel dump was summed and came up with a different total than Access had given!

The problem was that the dump for Excel didn't include the primary key, so there were duplicate rows that were merged by Access. But those duplicate rows were not redundant information - the counts of how many you had were very important!

So there was a real life case where DISTINCT clearly is not the right thing to do.

Now with that said, I will read the book, but I'm not convinced that I will agree with it. My main problem with databases isn't that I encounter normalization problems or have trouble understanding the relational logic. It is that I am asked to do things with it that can't be expressed efficiently in a relational manner.

Perhaps you consider that an optimization problem, and lazily wave the database should take care of it. But I can't. I need to solve those problems, the database isn't taking care of it for me. In fact Oracle cheerfully tells me that they won't take care of it, and if I want to take care of it for myself, then I need to use their analytic extensions.

So while I'm willing to read their promotional literature, I'm not about to get the relational religion.

  • Comment on Re: (OT) Why SQL Sucks (with a little Perl to fix it)

Replies are listed 'Best First'.
Re^2: (OT) Why SQL Sucks (with a little Perl to fix it)
by demerphq (Chancellor) on Dec 13, 2005 at 09:08 UTC

    It is that I am asked to do things with it that can't be expressed efficiently in a relational manner.

    I think it would be interesting to have a better description of such tasks. At the very least some our SQL deficient readers will get an idea of things that dont work well in relational databases.

    One such example that comes to mind is storing heirarchies. Relational semantics and directed acyclic graphs dont mix nicely. :-)

    ---
    $world=~s/war/peace/g

      The example that I am most commonly interested in is maintaining running totals. For instance given a table with payments and receipts, determine what our lowest account balance was.

      It isn't all that hard to write that in standard SQL. But the solutions that you get won't execute very quickly...

        Thanks, thats an interesting example. Just out of curiosity why do you calculate such totals from an accounts-receivable table? Wouldn't it be easier to have a seperate totals table and update it every time a new record was added? Either in a transction explicitly do two updates, or use a trigger on the accounts-receivable table that does it automatically....

        ---
        $world=~s/war/peace/g

        The canonical answer is to use a view on the accounts_receivable table. The problem is that your accounts_receivable table is too normalized for adequate performance. You have to create a denormalized view in order to get what you're looking for.

        My criteria for good software:
        1. Does it work?
        2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?