Friday, December 2, 2022

NOBODY UNDERSTANDS FURTHER NORMALIZATION 4 (sms)



Note: In "Setting Matters Straight" posts I debunk online pronouncements that involve fundamentals which I first post on LinkedIn. The purpose is to induce practitioners to test their foundation knowledge against our debunking, where we explain what is correct and what is fallacious. For in-depth treatments check out the POSTS and our PAPERS, LINKS and BOOKS (or organize one of our on-site/online SEMINARS, which can be customized to specific needs). Questions and comments are welcome here and on LinkedIn.

In Part 3 we set the matter straight about normalization to 1NF. In this part we do it wit respect to further normalization to 5NF. Non-1NF relations (i.e., with relation-valued attributes) are no longer part of industry practice, so we focus on 2NF-5NF violations. The term further normalization originates with Codd, who initially thought 1NF was sufficient and 2NF-5NF were discovered later (hence, further = beyond 1NF). The industry lumps both under normalization, but the two are distinct (e.g., only further normalization involves redundancy).

What's right/wrong with the following?

“So, what is this theory of normal forms? It deals with the mathematical construct of relations (which are a little bit different from relational database tables). First, second, and third normal forms are the basic normal forms in database normalization. Normalization in relational databases is a design process that minimizes data redundancy and avoids update anomalies. Basically, you want each piece of information to be stored exactly once; if the information changes, you only have to update it in one place. The normalization process consists of modifying the design through different stages, going from an unnormalized set of relations (tables), to the first normal form, then to the second normal form, and then to the third normal form.”
--Vertabelo.com

Thursday, November 10, 2022

NEW "DATA MODELS" 4 (t&n)



Note: "Then & Now" (T&N) is a new version of what used to be the "Oldies but Goodies" (OBG) series. To demonstrate the superiority of a sound theoretical foundation relative to the industry's fad-driven "cookbook" practices, as well as the evolution/progress of RDM, I am re-visiting my 2000-06 debunkings, bringing them up to my with my knowledge and understanding of today. This will enable you to judge how well my arguments have held up and appreciate the increasing gap between scientific progress and the industry’s stagnation, if not outright regress.

This is a re-published series of several DBDebunk 2001 exchanges on Simon Wlliams' so-called "Associative Model of Data" (AMD), academic claims of its superiority over RDM ("The Associative Data Model Versus the Relational model") and predictions of the demise of the latter ("The decline and eventual demise of the Relational Model of Data").

Part 1 was an email exchange among myself (FP), Chris Date (CJD) and Lee Fesperman (LF) in reaction to Williams' claims that started the series. Part 2 was my response to a reader's email questioning our dismissal of Williams's claims. Part 3 was my email exchange with Williams where he provided his definition of a data model on which I conditioned any discussion with him and I debunked it. Part 4 is my response to a reader's comments on my previous posts in the series.

Thursday, November 3, 2022

NOBODY UNDERSTANDS NORMALIZATION 3 (sms)



09/19/23: For the latest on 1NF see: FIRST NORMAL FORM - A DEFINITIVE GUIDE

 

Note: In "Setting Matters Straight" posts I debunk online pronouncements that involve fundamentals which I first post on LinkedIn. The purpose is to induce practitioners to test their foundation knowledge against our debunking, where we explain what is correct and what is fallacious. For in-depth treatments check out the POSTS and our PAPERS, LINKS and BOOKS (or organize one of our on-site/online SEMINARS, which can be customized to specific needs). Questions and comments are welcome here and on LinkedIn.

(Continued from Part 2)

In this part we set matters straight about first normal form (1NF)

What's right/wrong about this database picture?

“A relation is in first normal form (1NF) if (and only if):
  • Each attribute contains only one value.
  • All attribute values are atomic, which means they can’t be broken down into anything smaller.
In practice, 1NF means that you should not have lists or other composite structures as attribute values. Below is an example of a relation that does not satisfy 1NF criteria:This relation is not in 1NF because the courses attribute has multiple values.

STUDENT-COURSES
==================================================
 STUDENT          COURSES
-=======================--------------------------
 Jane Smith       Databases, Mathematics
 John Lipinsky    English Literature, Databases
 Dave Beyer       English Literature, Mathematics
--------------------------------------------------

To transform this relation to the first normal form, we should store each course subject as a single value, so that each student-course assignment is a separate tuple.”

--Vertabelo.com

Saturday, October 29, 2022

NEW "DATA MODELS" 3 (t&n)



Note: "Then & Now" (T&N) is a new version of what used to be the "Oldies but Goodies" (OBG) series. To demonstrate the superiority of a sound theoretical foundation relative to the industry's fad-driven "cookbook" practices, as well as the evolution/progress of RDM, I am re-visiting my 2000-06 debunkings, bringing them up to my with my knowledge and understanding of today. This will enable you to judge how well my arguments have held up and appreciate the increasing gap between scientific progress and the industry’s stagnation, if not outright regress.

This is a re-published series of several DBDebunk 2001 exchanges on Simon Wlliams' so-called "Associative Model of Data" (AMD), academic claims of its superiority over RDM ("The Associative Data Model Versus the Relational model") and predictions of the demise of the latter ("The decline and eventual demise of the Relational Model of Data").

Part 1 was the email exchange among myself (FP), Chris Date (CJD) and Lee Fesperman (LF) in reaction to Williams' claims that started the series. Part 2 was my response to a reader's email questioning our dismissal of Williams's claims.  Part 3 is my email exchange with Williams: he provided his "definition" of a data model on which I conditioned any discussion with him and I proved my point by debunking it.

Sunday, October 23, 2022

NOBODY UNDERSTANDS NORMALIZATION 2 (sms)



Note: In "Setting Matters Straight" posts I debunk online pronouncements that involve fundamentals which I first post on LinkedIn. The purpose is to induce practitioners to test their foundation knowledge against our debunking, where we explain what is correct and what is fallacious. For in-depth treatments check out the POSTS and our PAPERS, LINKS and BOOKS (or organize one of our on-site/online SEMINARS, which can be customized to specific needs). Questions and comments are welcome here and on LinkedIn.

(Continued from Part 1)

What's right/wrong about this database picture?

“So, what is this theory of normal forms? It deals with the mathematical construct of relations (which are a little bit different from relational database tables). The normalization process consists of modifying the design through different stages, going from an unnormalized set of relations (tables), to the first normal form, then to the second normal form, and then to the third normal form.”
--Vertabelo.com

Misconceptions

  • All database relations are, mathematically, relations, but not all mathematical relations are database relations.
  • The tabular structure play practically no role in RDM.
  • In practice there is no normalization (to 1NF) and there should not be further normalization (to 5NF).
  • Further normalization does not go from 2NF sequentially through 3NF and 4NF to 5NF.

Sunday, October 16, 2022

NEW "DATA MODELS" 2 (t&n)



Note: "Then & Now" (t&n) is a new version of what used to be the "Oldies but Goodies" (obg) series. To demonstrate the superiority of a sound theoretical foundation relative to the industry's fad-driven "cookbook" practices, as well as the evolution/progress of RDM, I am re-visiting my 2000-06 debunkings, bringing them up to my with my knowledge and understanding of today. This will enable you to judge how well my arguments have held up and appreciate the increasing gap between scientific progress and the industry’s stagnation, if not outright regress.

This is a re-published series of several DBDebunk 2001 exchanges on Simon Wlliams' so-called "Associative Model of Data" (AMD), academic claims of its superiority over RDM ("The Associative Data Model Versus the Relational model") and predictions of the demise of the latter ("The decline and eventual demise of the Relational Model of Data").

Part 1 was the email exchange among myself (FP), Chris Date (CJD) and Lee Fesperman (LF) in reaction to Simon Williams' claims that started the series. Part 2 is my response to a reader's email questioning our dismissal of Williams's claims.  (The reader's comments are in quotes.)

Saturday, October 8, 2022

NOBODY UNDERSTANDS NORMALIZATION 1 (sms)



Note: In "Setting Matters Straight" posts I debunk online pronouncements that involve fundamentals which I first post on LinkedIn. The purpose is to induce practitioners to test their foundation knowledge against our debunking, where we explain what is correct and what is fallacious. For in-depth treatments check out the POSTS and our PAPERS, LINKS and BOOKS (or organize one of our on-site/online SEMINARS, which can be customized to specific needs). Questions and comments are welcome here and on LinkedIn.

What's right/wrong with this database picture?

“Normalization in relational databases is a design process that minimizes data redundancy and avoids update anomalies. Basically, you want each piece of information to be stored exactly once; if the information changes, you only have to update it in one place. The theory of normal forms gives rigorous meaning to these informal concepts. There are many normal forms. In this article, we’ll review the most basic:
First normal form (1NF)
Second normal form (2NF)
Third normal form (3NF)
There are normal forms higher than 3NF, but in practice you usually normalize your database to the third normal form or to the Boyce-Codd normal form, which we won’t cover here.”

                                                                     --Vertabelo.com
View My Stats