Note: In "Setting Matters Straight" posts I debunk online pronouncements that involve fundamentals which I first post on LinkedIn. The purpose is to induce practitioners to test their foundation knowledge against our debunking, where we explain what is correct and what is fallacious. For in-depth treatments check out the POSTS and our PAPERS, LINKS and BOOKS (or organize one of our on-site/online SEMINARS, which can be customized to specific needs). Questions and comments are welcome here and on LinkedIn.
In a previous SMS post I debunked an attempt to express something important about database practice that was handicapped by lack of foundation knowledge. Here is another example.
“This Codd guy might have been onto something. Unfortunately, normalization is usually taught in a somewhat backwards, overly technical way. If you start with concepts, connections between them and details about them, you usually are already at a fairly high normal form without going through any formal normalization steps.”--LinkedIn.com
------------------------------------------------------------------------------------------------------------------
SUPPORT THIS SITE
DBDebunk was maintained and kept free with the proceeds from my @AllAnalitics
column. The site was discontinued in 2018. The content here is not available
anywhere else, so if you deem it useful, particularly if you are a regular
reader, please help upkeep it by purchasing publications, or donating. On-site
seminars and consulting are available.Thank you.
LATEST POSTS
08/25 NOTHING TO DO WITH RELATIONAL (t&n)
08/20 DATABASE RELATIONS, DATABASE DESIGN & CORRECTNESS (sms)
08/14 THE VOCIFEROUS IGNORANCE HALL OF SHAME (t&n)
UPDATES
08/20 Added Logic and databases course to LINKS page.
LATEST PUBLICATIONS
(order from PAPERS and
BOOKS pages)
- 08/19 Logical Symmetric Access, Data
Sub-language, Kinds of Relations, Database Redundancy and Consistency,
paper #2 in the new UNDERSTANDING THE REAL RDM series.
- 02/18 The Key to Relational Keys: A New Understanding, a new edition
of paper #4 in the PRACTICAL DATABASE FOUNDATIONS series.
- 04/17 Interpretation and Representation of Database Relations, paper
#1 in the new UNDERSTANDING THE REAL RDM series.
- 10/16 THE DBDEBUNK GUIDE TO MISCONCEPTIONS ABOUT DATA FUNDAMENTALS, my
latest book (reviewed by Craig Mullins, Todd Everett, Toon Koppelaars, Davide
Mauri).
USING THIS SITE
- To work around Blogger limitations, the labels are mostly abbreviations or
acronyms of the terms listed on the FUNDAMENTALS page.
For detailed instructions on how to understand and use the labels in
conjunction with that page, see the ABOUT page.
The 2017 and 2016 posts, including earlier posts rewritten in 2017 were
relabeled accordingly. As other older posts are rewritten, they will also be
relabeled. For all other older posts use Blogger search.
- The links to my AllAnalytics columns no longer work. I re-published only the
2017 columns @dbdebunk, and within them links to sources external to
AllAnalytics may or may not work.
SOCIAL MEDIA
I deleted my Facebook account. You can follow me @DBDdebunk on Twitter:
will link to new posts to this site, as well as To Laugh or Cry? and What's Wrong
with This Picture? posts, and my exchanges on LinkedIn.
------------------------------------------------------------------------------------------------------------------
Misconceptions
If only normalization (a term misused) were taught "in a technical way" -- it is taught wrong!
- "Concepts, connections and details" is vague language.
- Normal forms are features of logical, not conceptual model.
- Given a proper conceptual model, proper design does not produce "fairly high normal form", but fully normalized (5NF) databases.
Fundamentals
Given a conceptual model of some segment of reality of interest as a multigroup -- a collection of related groups of entities, each of which has entities of a single type as members -- if a database is designed such that every such group is represented by a database relation, then all relations are in both 1NF (normalized) and in 5NF (fully normalized). McGoveran had the insight that, formally, this means adherence to three formal database design principles:
- Principle of Expressive Completeness (POEC);
- Principle of Representational Parsimony (PORP);
- Principle of Orthogonal Design (POOD).
In a 5NF relation the only dependencies that hold are functional dependencies (FD) of the non-key attributes on the PK (i.e., there is a 1:1 relationship between every PK value and each corresponding non-key value, but not vice-versa). This theoretical requirement of the RDM ensures correctness: design for semantic consistency and system-guaranteed logical validity, as well as other advantages. This is why we say that every database relation is in 5NF by definition.
Since non-1NF ("nested relations") design is, for all practical purposes, not practiced in the industry, the term normalization (to 1NF) is generally misused to mean further normalization (to 5NF) -- the two should not be confused and lumped together as 'normalization'.
Contrary to the industry practice, you don't start with some arbitrary 1NF design and pass it sequentially through each of the normal forms; rather, you design 5NF relations. Explicit further normalization is necessary only to repair non-5NF designs due to either poor conceptual modeling, or non-adherence to the three design principles, or both.
Notes:
- If you think about FDs, you'll see that, logically, they are an implication of every group having entities of a single type: every descriptive entity property is 'about' the identifying property (name) for all entities in the group (i.e. the PK value).
- The 5NF requirement is part of the current understanding of Codd's work (McGoveran interpretation), not the the traditional (Date) interpretation underlying industry practice.
Setting Matters Straight
We revise the above comment as follows:
Practically all treatments of relational database design are incorrect or misleading. If you (1) model reality as a multigroup consisting of related groups of entities of a single type (i.e., that share the same properties and within-group relationships) and the groups share inter-group relationships and (2) adhere to the three design principles, you produce a 5NF database without any further normalization.
Further reading
Database Design: What It Is and Isn't
Normalization and Further Normalization Part 1: Databases Representing ... What?
Normalization and Further Normalization Part 2: If You Need Them, You're Doing It Wrong
Normalization and Further Normalization Part 3 Understanding Database Design
Normalized, Fully Normalized, Non-Normalized, Denormalized: Clearing the Mess
DATABASE RELATIONS, DATABASE DESIGN & CORRECTNESS
No comments:
Post a Comment