Saturday, October 8, 2022

NOBODY UNDERSTANDS NORMALIZATION 1 (sms)



Note: In "Setting Matters Straight" posts I debunk online pronouncements that involve fundamentals which I first post on LinkedIn. The purpose is to induce practitioners to test their foundation knowledge against our debunking, where we explain what is correct and what is fallacious. For in-depth treatments check out the POSTS and our PAPERS, LINKS and BOOKS (or organize one of our on-site/online SEMINARS, which can be customized to specific needs). Questions and comments are welcome here and on LinkedIn.

What's right/wrong with this database picture?

“Normalization in relational databases is a design process that minimizes data redundancy and avoids update anomalies. Basically, you want each piece of information to be stored exactly once; if the information changes, you only have to update it in one place. The theory of normal forms gives rigorous meaning to these informal concepts. There are many normal forms. In this article, we’ll review the most basic:
First normal form (1NF)
Second normal form (2NF)
Third normal form (3NF)
There are normal forms higher than 3NF, but in practice you usually normalize your database to the third normal form or to the Boyce-Codd normal form, which we won’t cover here.”

                                                                     --Vertabelo.com

Saturday, September 17, 2022

NEW "DATA MODELS" 1 (t&n)



Note: "Then & Now" (T&N) is a new version of what used to be the "Oldies but Goodies" (OBG) series. To demonstrate the superiority of a sound theoretical foundation relative to the industry's fad-driven "cookbook" practices, as well as the evolution/progress of RDM, I am re-visiting my 2000-06 debunkings, bringing them up to my with my knowledge and understanding of today. This will enable you to judge how well my arguments have held up and appreciate the increasing gap between scientific progress and the industry’s stagnation, if not outright regress.

“Codd's aim was to free programmers from having to know the physical structure of data. Our aim is to free them in addition from having to know its logical structure.”

                                                 --Simon Williams, LazySoft

This series is a re-publication of several DBDebunk 2001 posts in response to Simon Wlliams' so-called "Associative Model of Data", academic claims of superiority over RDM ("The Associative Data Model Versus the Relational model") and predictions of the demise of the latter ("The decline and eventual demise of the Relational Model of Data").

Part 1 is the email exchange among myself (FP), Chris Date (CJD) and Lee Fesperman (LF) in reaction to Williams' claims that started the whole thing.

Monday, September 12, 2022

DATABASE DESIGN: THE STATE OF KNOWLEDGE IN THE INDUSTRY



Can you identify all the fallacies and misconceptions in the following online exchange? What is the elephant in the room?
Q: “I have done data normalization on dummy data and would like to know if I did it correctly. If it is done correctly, I would also like to ask two things below, because it is about 3NF.

1NF: This table should be 1NF. 

2NF: I selected composite key (userID and Doors) as they represent minimal candidate key and got three tables applying FD rule.

 

3NF: Applying the rule of transitive dependency on 1st table in 2NF, I got out 4 tables (showing only first two, because the last two remain unchanged).

Questions: Is this database normalisation correct? If not could you point me where I did mistake? If answer on first question is True: Should the last table in 3NF be transformed into two tables, given it is not in correct Third normal form. Two non-key atributes have FD keycode -> accessGroup.”

Sunday, August 28, 2022

NOBODY UNDERSTANDS DATABASE DESIGN 1 (sms)



Note: In "Setting Matters Straight" posts I debunk online pronouncements that involve fundamentals which I first post on LinkedIn. The purpose is to induce practitioners to test their foundation knowledge against our debunking, where we explain what is correct and what is fallacious. For in-depth treatments check out the POSTS and our PAPERS, LINKS and BOOKS (or organize one of our on-site/online SEMINARS, which can be customized to specific needs). Questions and comments are welcome here and on LinkedIn.

In a previous SMS post I debunked an attempt to express something important about database practice that was handicapped by lack of foundation knowledge. Here is another example.

“This Codd guy might have been onto something. Unfortunately, normalization is usually taught in a somewhat backwards, overly technical way. If you start with concepts, connections between them and details about them, you usually are already at a fairly high normal form without going through any formal normalization steps.”
--LinkedIn.com

Thursday, August 25, 2022

NOTHING TO DO WITH RELATIONAL (t&n)



Note: "Then & Now" (T&N) is a new version of what used to be the "Oldies but Goodies" (OBG) series. To demonstrate the superiority of a sound theoretical foundation relative to the industry's fad-driven "cookbook" practices, as well as the evolution/progress of RDM, I am re-visiting my 2000-06 debunkings, bringing them up to my with my knowledge and understanding of today. This will enable you to judge how well my arguments have held up and appreciate the increasing gap between scientific progress and the industry’s stagnation, if not outright regress.

This was an email exchange with a reader, first published as ON ON-THE-FLY THINKING in March 2005.

“From a quick one-page marketing article:
"Relational databases are one to two orders of magnitude too slow."
--Michael Stonebreaker, quoted in Data on the Fly, Forbes
The quote is directly from the company's owner who "...created two well-known relational database systems, Ingres and Postgres." Further from the reporter:
"Unlike traditional database programs, Streambase analyzes data without storing it to disk, performing queries on data as it flows."
Hmmm... didn't know that the Relational Model of Data specifically proscribed in-memory implementations.”

Saturday, August 20, 2022

DATABASE RELATIONS, DATABASE DESIGN & CORRECTNESS (sms)



Note: In "Setting Matters Straight" posts I debunk online Q&As that involve fundamentals which I first post on LinkedIn. The purpose is to induce practitioners to test their foundation knowledge against our debunking, where we explain what is correct and what is fallacious. For in-depth treatments check out the POSTS and our PAPERS, LINKS and BOOKS (or organize one of our on-site/online SEMINARS, which can be customized to specific needs). Questions and comments are welcome here and on LinkedIn.

“...The relational model organizes data through relations (aka tables). You then normalize it in one of six forms. By normalizing data you:
- Reduce redundancy
- Ensure consistency
- Optimize for atomic inserts, updates and deletes
The biggest drawback ... are keys that let you join different tables across multiple systems.”
                                                                      --LinkedIn.com

Sunday, August 14, 2022

THE VOCIFEROUS IGNORANCE HALL OF SHAME (t&n))



(First published in 2006)

Note: "Then & Now" (T&N) is a new version of the old "Oldies but Goodies" (OBG) series. To demonstrate the superiority of a sound theoretical foundation relative to the industry's fad-driven "cookbook" practices, as well as the evolution/progress of RDM, I am re-visiting my 2000-06 debunkings, bringing them up to my with my knowledge and understanding of today. This will enable you to judge how well my arguments have held up and appreciate the increasing gap between scientific progress and the industry’s stagnation, if not outright regress.

(Nothing to add on re-publication.)

There is massive ignorance in data management, but the vociferous kind (VI) is the purview of a special breed that is characterized by one or more of the following:

1. Lack of knowledge and understanding of, and appreciation for data fundamentals in general and relational concepts, principles and methods in particular.
2. Unwillingness to let it stand in the way of pronouncing extensively on the subject.
3. Inability and/or unwillingness to respond to evidence of ignorance and/or to reason.
4. Lack of interest — often admitted — in truth and correctness.
5. Focus on self promotion and appeasement of the industry by riding fads, or telling (uninformed) audiences what they want to hear.

The combination of 1 and 2 characterizes the Unskilled and Unaware of It,  2 is the vociferous part. Frankfurt defined 4 as bullshit. 5 is just self-aggrandizing.

View My Stats