Ovid: I've recently purchased the book DATABASE IN DEPTH by C. J. Date and I've been enjoying it very much. It's been quite an eye-opener to discover that most of my gripes about "databases" were actually related to SQL.
To that end, I've accidentally started a lively discussion about the "merits" of SQL at Why SQL Sucks (with a little Perl to fix it). dbdebunk.com is getting mentioned quite a bit (and not always in a favorable light, I might add). (Mine is the root post in the thread). Admittedly, I'm just learning about many of the issues involved. So despite years of working with databases, I may have some of the basics off a bit.
The reason I mention this is because that site is very high profile amongst Perl programmers and convincing the people at that site about the issues with SQL would go a long way to reaching a broad swath of the tech world.
Hugh Darwen: The trouble is, so much has been written about SQL's cardinal sins (three-valued logic, NULLs, and duplicate rows) that many of its other gross errors have not received the publicity they deserve. Which is a shame, because some SQL apologists manage to find ways of defending the cardinal sins, whereas some of the other gross errors are just plain indefensible ... as I tried to point out back in the late 1980s with my Adventures in Relationland series.
- Allowing two or more columns of the same table to have the same name
- Allowing a column to have no name at all
- FROM clause semantics (failure to support natural join implicitly)
In ISBL (invented before SQL) and BS12 (1980) JOIN was natural join. Matching was implicitly on common columns (by name), just as dictated by obvious application of predicate logic, and each common column appeared only once in the result. So you never get duplicate column names and the language doesn't even support dot-qualification, let alone need it.
But even more significant is the effect of natural join when it comes to user-defined relational operators. BS12 allowed a user-defined operator to have parameters of type TABLE (=relation) and return relations. What's more, a parameter of type TABLE didn't have to be of some specific table type. Thus, the body of the function could include, for example, JOIN ( p1, p2 ), where p1 and p2 are parameter names, without needing to know which columns (if any) are the common ones.
In fact, BS12's DIVIDE operator was a user-defined one, made available to all users. Because DIVIDE can be expressed entirely in terms of the primitive operators JOIN, projection, and difference. But that wouldn't be possible in SQL, even if SQL did support user-defined operators of this kind (and it still doesn't, properly, after 25 years of existence!!!), unless your implementation happens to support the NATURAL JOIN that I added to the international standard in 1992 but didn't get widely taken up.
Programmers like languages that they can extend themselves. SQL didn't become one until 1996 and even then it was only its scalar operators that could be added to by users.
I could go on.
C. J. Date: I'd like to thank Ovid for his kind remarks about my book DATABASE IN DEPTH (O'Reilly, 2005)—also to say that I agree with everything Hugh Darwen says in his response.
I'd just like to mention one additional issue. Hugh Darwen mentions "some of the other gross errors" found in One that's not sufficiently appreciated, in my opinion, is the fact that tables in SQL have a left-to-right column ordering. As I wrote in DATABASE IN DEPTH, you might think this state of affairs is hardly very significant; in fact, however, it has some serious consequences. Some of those consequences are discussed in detail in my paper A Sweet Disorder (Ed. Note: now a chapter in a book).
(Originally posted at dbdebunk.com 2/24/06)
No comments:
Post a Comment