First, the tables were not improperly-designed. They had bad data in them. This is a very important distinction which goes to the heart of the matter.
A table with bad data in it is an improperly-designed table. A properly designed table cannot, by virtue of it being properly designed, have bad data. It might have incorrect data, but the data cannot be bad. I will address your further points in the order you presented them, but this is a critical understanding of RDBMSes that you don't seem to have. (It's rare to see it in someone who hasn't been a DBA, so don't feel bad.)
Another critical understanding that you're missing - the "relational" part in an RDBMS isn't for the table - it's for both the row and how rows in different tables relate. All the columns in the row are related to the primary key column(s) and columns in different tables relate rows together. The "set" part comes in that the keys are unique, not that the results of a query are unique. A table is a set of primary keys, along with the values that they index into. The Perl analogy would be a hash. The keys of a hash are a set, but the values are not. This is a good thing.
No, SQL is not relational. If it was, it would be called RQL for "Relational Query Language," not "Structured Query Language." SQL was designed for Business Analysts, not programmers. It's a 4GL - one of those "natural language programming languages" that were touted in the 80's. That it's a complete failure both as a programming language and as a tool to work with databases is only to have been expected.
... a relation (what we normally think of as a table) is an unordered set (in the mathematical sense) of tuples. ... SQL allows us to add duplicate tuples in tables. That's one of the reasons they're called "tables" and not "relations". Sure, I can put a unique constraint across all of the columns, but why should I?
You're conflating so many different concepts here that I almost don't know where to start.
As for Bob and Alice, there's a very good reason why SQL not only gives back that answer, but SHOULD give back that answer, and it comes down to the Principle of Least Surprise.* Let's say you have the query SELECT foo FROM my_table; That returns back 10 rows. Under the current system, that means that there's 10 rows in my_table and here are the values in the foo column for each one. Under your idea, that means there are 10 distinct foos in my_table, regardless of how many rows.
Now, what hapens when I say SELECT foo, bar FROM my_table;? Under the current system, this still returns 10 rows. Under your system, it could return any number of rows, so long each row it returns is a unique (foo,bar) combination. For people that don't understand anything about set theory, that's somewhat counter-intuitive.
Believe me - I do understand your consternation. The idea that an operation upon a set returns a collection is counter-intuitive to those who understand any set theory at all. But, I point back to my earlier point - the set isn't the rows in the table taken as a whole - it's the range of the function between the primary key and the dependent columns. As long as your query returns the primary key, it is a set. If your query doesn't, then it doesn't returns a set.
To elucidate, let's look very quickly at what a SELECT does, under the hood. The basic operation, as you've noted, is to take a set (the PK tuple), apply a set of restrictions on it (either the PK tuple itself or the values tuple), then return the requested values from all rows that satisfy those restrictions. As long as your table(s) being queried was a set to begin with, the results will also be a set. However, your view of the results may or may not be a set, depending on what you've chosen to see.
Let's take your specific example:
I'm going to rewrite as:SELECT cities.name FROM cities, customers WHERE customers.city_id = cities.city_id AND customers.credit_rating > 700
Assuming that customers and cities are both sets, then we have four sets involved in that query.SELECT (cities.name) FROM (customers JOIN cities USING (city_id)) WHERE (customers.credit_rating > 700)
*: Don't give me that shocked look - you had to have known that the PoLS would make an appearance in this discussion. If only so that I could tweak you about how your surprise was anticipated before you were born. :-)
In reply to Re^3: (OT) Why SQL Sucks (with a little Perl to fix it)
by dragonchild
in thread (OT) Why SQL Sucks (with a little Perl to fix it)
by Ovid
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |