Showing posts with label FOPL. Show all posts
Showing posts with label FOPL. Show all posts

Monday, February 5, 2024

METALOGICAL PROPERTIES Part 2: Assertion Predicate



In Part 1 we introduced in the conceptual model (CM) the metalogical designation property. It represents—in the absence of known shared defining properties of an entity type, the designation by a group's definer that an entity identifier (aka assigned name) or property value is a member of the group. Such a group is not a group of entities, but a group of name and property values. In the logical model (LM), it is formalized as a designation predicate (DP) and defines a domain.

In Part 2, we introduce the metalogical assertion property. It represents the assertion by an authorized database user that a specific entity, represented by a tuple, either does or does not correspond to an actual entity in the real world.

Friday, October 13, 2023

EVERYBODY THINK THEY KNOW FIRST NORMAL FORM, BUT NOBODY DOES



“I have read this article in an effort to boost my academic knowledge on data modeling a bit and still have no idea what this academic author wanted to say. Apparently First Normal Form (1NF) doesn't get enough respect and then proceeds to talk about Non-First Normal Form (NFNF). But what about First Normal Form (1NF) damnit.”

By sheer chance this was posted on LinkedIn just after I published my new paper The First Normal Form: A Definitive Guide.

PRACTICAL DATABASE FOUNDATIONS

FIRST NORMAL FORM

A DEFINITIVE GUIDE

(September 2023)

Fabian Pascal

 

Table of Contents

 Introduction

1.      The Normal Form

2.      The First Normal Form

3.      Domain Decomposability & Atomicity

4.      1NF & Tables

5.      SQL & 1NF

5.1.     Repeating Groups & Repeated Attributes

5.2.   Information Principle & SQL
 

Thursday, August 17, 2023

ENTITIES, PROPERTIES AND CODD'S SLEIGHT OF HAND



Note: This a revision of an earlier post

RDM is an application to database management of mathematical relation theory (MRT) consistent with simple set theory (SST) expressible in first order predicate logic (SST/FOPL) that is used to formalize symbolically conceptual models of reality as logical models for database representation.

In RDM a domain can "appear" in multiple relations: the domain represents an abstract property, attributes defined on it represent that property in contexts of specific entity groups that relations represent. For example, attribute SALARY in relation EMPLOYEES represents the property represented by domain MONEY in the context of entities of type Employee and attribute BUDGET in relation DEPARTMENTS represents it in the context of entities of type Department.

Monday, June 19, 2023

PREDICATE LOGIC, SEMANTICS AND RDM (sms)



Note: In "Setting Matters Straight" posts I debunk online pronouncements that involve fundamentals which I first post on LinkedIn. The purpose is to induce practitioners to test their foundation knowledge against our debunking, where we explain what is correct and what is fallacious. For in-depth treatments check out the POSTS and our PAPERS, LINKS and BOOKS (or organize one of our on-site/online SEMINARS, which can be customized to specific needs). Questions and comments are welcome here and on LinkedIn.

 

“As I have said many times, if the original relational model had been based on predicate logic and also the semantics and rules of definitions we'd all be better off now. It wasn't. Full stop.”
--Ronald Ross, LinkedIn.com
Assessing such arguments normally requires clarification of what exactly is meant by "the relational model". Ross does refer specifically to the "original" -- which we take to mean that introduced by Codd in 1969-70 -- but given the massive misuse and abuse in the industry, perceptions of it may well be corrupted (Nobody Understands the Relational Model Semantics, Relational Closure and Database Correctness).  Moreover, there are many predicate logic (PL) systems and many ways of categorizing them (1st vs n-th order being only one way) -- we assume Ross means RDM is based on none.

Sunday, May 28, 2023

INTENSION, EXTENSION AND R-TABLES (t&n)



Note: "Then & Now" (t&n) is a new version of what used to be the "Oldies but Goodies" (OBG) series. To demonstrate the superiority of a sound theoretical foundation relative to the industry's fad-driven "cookbook" practices, as well as the disregarded evolution/progress of RDM, I am re-visiting my old debunkings, bringing them up to the current state of knowledge. This will enable you to judge how well arguments have held up and realize the increasing gap between industry stagnation --  and scientific progress.

THEN: THE IMPORTANCE OF RELATIONAL TERMINOLOGY (t&n)

(email exchange with a reader originally published September 2002)

“Saw your latest and once again I think you have hit one of the many protruding nails on the head. Understanding one's data is so central and so crucial and yet so often ignored.

All this talk (not from you, I note) of silver bullets. Nothing new and I wonder if the paying customers and the big-ticket so-called technology strategy companies will ever wise up. Edward de Bono wrote of 'porridge words' that distract thought from the matter at hand. When used sparingly, they can facilitate new lines of thought but when, as they are in this field, they are used so casually and often they blur the real issues. All this technicalese of XML etcetera has this effect on me.

During one of the few times an employer allowed me to help people with logical design, I was having difficulty because the customer's IT staff knew very little English and had perhaps even less database background. I hit on the idea of explaining tables as relations and relations as sentences - sentences that must have the same 'size and shape'. Their faces seemed to light up and when they agreed that they had overloaded some of their tables, I was very pleased with myself. I felt vindicated a few weeks later when I read an article about predicates and propositions that Hugh Darwen had written in the now defunct DBPD magazine, put these thoughts much more precisely than I could, . Of course, the changes created new problems because the database product, like so many others, gave precious few ways to map the logical design to the physical one. But I regarded these as preferable problems since the staff was much more interested in the more concrete physical optimization techniques.

Without any disrespect to Dr. Codd (who I once met but was too awe-struck to ask any questions of), I have often thought that the language used by everybody in the field, with words such as "tables", nearly always brings connotations of physical arrangements to the mind of anybody who has done traditional programming. This seems unfortunate to me. Especially after I read Mr. McGoveran's proposals for results that might embody more than one table. (I wonder if these might not be part of the key for much better physical integration of databases with their visualization for users, not to mention smarter engines.)

I came across a site https://www.mcjones.org/System_R/ the other day, where a bunch of the System R people reminisced about its development on the occasion of, I think, the 25th anniversary of one of Codd's early papers. Presumably Mr. Date was absent from this gathering so that he could write his own most interesting history, which I remember reading five or six years ago. Anyway, I was struck again by how often their design decisions were either determined or distorted by physical considerations. And now, when many of the obstacles have been overcome courtesy of Moore's and other laws, some of those clever people seem regretful.

Also, please let me submit an historical, non-technical 'nit' to Mr. Date - I remember him writing that Codd did not coin the database term 'normalization of relations' as a result of R.M. Nixon's foreign policy excursion with China. But I also remember reading what I recall was an original interview with Dr. Codd in the DBMS magazine where he stated that this was the case. It's not really important, perhaps I'm just sensitive to it because I live in a country that established relations with modern China a year earlier!”

Sunday, April 30, 2023

RELATIONSHIPS AND THE RDM V2 Part 3: SEMANTIC CONSTRAINTS



Note: This is a multipart re-write of a previous series that, when completed, is intended to replace it.

In Part 1 we documented the differences between mathematical and database relations (see table in Part 1). We attributed the fallacy that the RDM can express only one type of relationship -- between relations using FKs -- to the industry being unaware of the adaptation of math relations for database management. We intimated that some of the additional features of database relations express relationships other than between relations.

In Part 2 we identified the intra-group c-relationships (and the corresponding within-relation l-relationships) in our approach to conceptual modeling:

  • Properties-entities relationships

- general dependencies

  • Properties Relationships
  • Entities Relationships

- entity uniqueness
- functional dependencies (FD)
- entity supertype-subtypes relationships

and used a simple conceptual model (CM) of six entity groups to illustrate them:

Customers (cID, cname, FICO, discount)
Products (pID, pname, price)
Salesmen (sID, sname, sales, salary, commission)
Orders (oID, pID, cID, sID, date, amount)
Order Items (oID, iID, pID, quantity)

Database design is the use of a data model (DM) (here, RDM) to formalize conceptual models (CM) -- including c-relationships -- as logical models (LM) for database representation, so it must be able to convert the business rules (BR) that express those relationships in specialized natural language at the conceptual level to formal constraints in a FOPL-based data sublanguage at the logical level.

Our intention is to demonstrate that the RDM can express all c-these relationships, but we face a difficulty.

Monday, January 2, 2023

 NEW "DATA MODELS" 5.2 (t&n)



Note: "Then & Now" (T&N) is a new version of what used to be the "Oldies but Goodies" (OBG) series. To demonstrate the superiority of a sound theoretical foundation relative to the industry's fad-driven "cookbook" practices, as well as the evolution/progress of RDM, I am re-visiting my 2000-06 debunkings, bringing them up to my with my knowledge and understanding of today. This will enable you to judge how well my arguments have held up and appreciate the increasing gap between scientific progress and the industry’s stagnation, if not outright regress.

This is a re-published series of several DBDebunk 2002 posts on Simon Wlliams' Lazy Software so-called "Associative Model of Data" (AMD), academic claims of its superiority over RDM ("The Associative Data Model Versus the Relational model") and predictions of the demise of the latter ("The decline and eventual demise of the Relational Model of Data").

  • Part 1 was an email exchange among myself (FP), Chris Date (CJD) and Lee Fesperman (LF) in reaction to Williams' claims that triggered the series.
  • Part 2 was my response to a reader's email questioning our dismissal of Williams's claims.
  • Part 3 was my email exchange with Williams where he provided his definition of a data model on which I conditioned any discussion with him and I debunked it.
  • Part 4 is my response to a reader's comments on my previous posts in the series.
  • Part 5.1 provided the background for my critique of Edward Hurley's report on Simon Williams's Lazy Software and his so-called "Associative Model of Data" (AMD).

Thursday, November 10, 2022

NEW "DATA MODELS" 4 (t&n)



Note: "Then & Now" (T&N) is a new version of what used to be the "Oldies but Goodies" (OBG) series. To demonstrate the superiority of a sound theoretical foundation relative to the industry's fad-driven "cookbook" practices, as well as the evolution/progress of RDM, I am re-visiting my 2000-06 debunkings, bringing them up to my with my knowledge and understanding of today. This will enable you to judge how well my arguments have held up and appreciate the increasing gap between scientific progress and the industry’s stagnation, if not outright regress.

This is a re-published series of several DBDebunk 2001 exchanges on Simon Wlliams' so-called "Associative Model of Data" (AMD), academic claims of its superiority over RDM ("The Associative Data Model Versus the Relational model") and predictions of the demise of the latter ("The decline and eventual demise of the Relational Model of Data").

Part 1 was an email exchange among myself (FP), Chris Date (CJD) and Lee Fesperman (LF) in reaction to Williams' claims that started the series. Part 2 was my response to a reader's email questioning our dismissal of Williams's claims. Part 3 was my email exchange with Williams where he provided his definition of a data model on which I conditioned any discussion with him and I debunked it. Part 4 is my response to a reader's comments on my previous posts in the series.

Saturday, October 29, 2022

NEW "DATA MODELS" 3 (t&n)



Note: "Then & Now" (T&N) is a new version of what used to be the "Oldies but Goodies" (OBG) series. To demonstrate the superiority of a sound theoretical foundation relative to the industry's fad-driven "cookbook" practices, as well as the evolution/progress of RDM, I am re-visiting my 2000-06 debunkings, bringing them up to my with my knowledge and understanding of today. This will enable you to judge how well my arguments have held up and appreciate the increasing gap between scientific progress and the industry’s stagnation, if not outright regress.

This is a re-published series of several DBDebunk 2001 exchanges on Simon Wlliams' so-called "Associative Model of Data" (AMD), academic claims of its superiority over RDM ("The Associative Data Model Versus the Relational model") and predictions of the demise of the latter ("The decline and eventual demise of the Relational Model of Data").

Part 1 was the email exchange among myself (FP), Chris Date (CJD) and Lee Fesperman (LF) in reaction to Williams' claims that started the series. Part 2 was my response to a reader's email questioning our dismissal of Williams's claims.  Part 3 is my email exchange with Williams: he provided his "definition" of a data model on which I conditioned any discussion with him and I proved my point by debunking it.

Thursday, August 4, 2022

DATABASE RELATIONS, TABLES AND SEMANTIC CONSISTENCY



by David McGoveran with Fabian Pascal

 

Note: In "Setting Matters Straight" posts I debunk online Q&As that involve fundamentals which I first post on LinkedIn. The purpose is to induce practitioners to test their foundation knowledge against our debunking, where we explain what is correct and what is fallacious. For in-depth treatments check out the POSTS and our PAPERS, LINKS and BOOKS (or organize one of our on-site/online SEMINARS, which can be customized to specific needs). Questions and comments are welcome here and on LinkedIn.

“In a RDBMS, a table is columned rows, as in you treat individual rows as an actual entity while the columns are its attributes. In an excel tab, you can create a column, but it doesn't have to have all the same data types in that column, nor does one row have to represent one entity. It's more free form ... All in all, RDB is relational because it's column based rows and constrained to that format, while non relational can have free form like an excel. When you have rows that are uniform (constrained to what the column should be), you create entities as tables, and link them through columns to keep track of the relationships.”
--Quora.com
I posted this on LinkedIn as one of my "To Laugh or Cry?" items which, unlike "Setting Matters Right" posts, are beyond debunking. But the exchange that followed made me realize that there is, nevertheless, pedagogical value to it: it expresses something important, but poorly due to author's lack of foundation knowledge.

Saturday, June 11, 2022

ORDER & RELATIONAL DATABASES (sms)



Note: In "Setting Matters Straight" I post on LinkedIn online Q&As that involve fundamentals under the header "What's Right and Wrong with this Database Picture" and then debunk them here. The purpose is to induce practitioners to test their foundation knowledge against our debunking, where we explain what is correct and what is fallacious. For in-depth treatments check out the POSTS and our PAPERS, LINKS and BOOKS (or organize one of our on-site/online SEMINARS, which can be customized to specific needs). Questions and comments are welcome here and on LinkedIn.

Q: “I'm not sure what this means: "The order of the rows and columns is immaterial to the DBMS?" -- could anyone explain?”

A: “It means two things:
The engine is under no obligation to insert new rows immediately following the previously inserted row(s)... During processing of selects, the optimizer is free to use any index it finds efficient to use or none at all... For this reason, if the order of returned data is important to your processing, then you must include an ORDER BY clause.”

Q: “How do you reorder fields in the database?”

A: “Depends on how you define "reorder". What view of your data are you trying to set the order. Are you in Table Design view? ... Are you looking at form? The answer is different depending on what you are referring to.”
--Quora.com

Saturday, May 21, 2022

NO RDBMS WITHOUT RELATIONAL DOMAINS (obg)



Note: To demonstrate the correctness and stability due to a sound theoretical foundation relative to the industry's fad-driven "cookbook" practices, I am re-publishing as "Oldies But Goodies" material from the old DBDebunk.com (2000-06), Judge for yourself how well my arguments hold up and whether the industry has progressed beyond the misconceptions those arguments were intended to dispel. I may revise, break into parts, and/or add comments and/or references. You can acquire foundation knowledge by checking out our POSTS, BOOKS, PAPERS, LINKS (or, even better, organize one of our on-site SEMINARS, which can be customized to specific needs).

The following is an email exchange with a reader and DBMS designer.

ON DATA TYPES AND WHAT A DBMS IS

(originally published in 2001)

Reader:
"I would like to hear your (or Date's) opinion on The Suneido Database … it seems to me self-contradictory. They aren't typed ... so how can they define operators, or even the idea of domains. They also say they include administrative commands, which as far as I understand isn't allowed in the THIRD MANIFESTO. While they do not claim to be an implementation of the Manifesto, their claims that their database language was created by CJ Date do not sound appropriate."

 "They don't know what [domains (distinct from programming data types)] are and what their function in the RDM is. That's common for all DBMS vendors, the claims of which should be always taken with more than a grain of salt."

Monday, April 25, 2022

RELATIONAL DATABASES & SET THEORY (sms)



Note: "Setting Matters Straight" is a new format: I post on LinkedIn an online Q&A involving data fundamentals that I subsequently debunk in a post here. This is to encourage readers to test their foundation knowledge against our debunking here, where we confirm what is correct and correct what is fallacious. For in-depth treatments check out the POSTS and our PAPERS, LINKS and BOOKS (or organize one of our on-site/online SEMINARS, which can be customized to specific needs). Questions and comments are welcome here and on LinkedIn.

Q: “To what extent is relational database theory related to set theory?”

A: “Relational database theory is indeed closely derived from set theory. Many operations in relational data are directly related to common operations one does with sets. In fact, SQL has keywords for them that should sound familiar to someone who has just taken a class in Discrete Mathematics:
  • UNION
  • INTERSECT
  • DIFFERENCE (called MINUS in Oracle)
Even the structure of a table is set-oriented. A table is a set of rows, and a row is a set of columns, and those columns must match the set of columns defined in the table's header.”

--Quora.com

Sunday, February 13, 2022

NO UNDERSTANDING WITHOUT FOUNDATION KNOWLEDGE PART 5: DEBUNKING AN ONLINE EXCHANGE 4 (obg)



Note: To demonstrate the correctness and stability due to a sound theoretical foundation relative to the industry's fad-driven "cookbook" practices, I am re-publishing as "Oldies But Goodies" material from the old DBDebunk.com (2000-06), Judge for yourself how well my arguments hold up and whether the industry has progressed beyond the misconceptions those arguments were intended to dispel. I may revise, break into parts, and/or add comments and/or references. You can acquire foundation knowledge by checking out our POSTS, BOOKS, PAPERS, LINKS (or, even better, organize one of our on-site SEMINARS, which can be customized to specific needs).

A 2001 review of my third book triggered an exchange on SlashDot. This six-part series comprises my debunking at the time of both the review and the exchange in the chronological (slightly out of the)  order of the original publication.
Part 1: Clarifications on a Review of My Book Part 1 @DBDebunk.com
Part 2: Slashing a SlashDot Exchange Part 1 @DBAzine.com
Part 3: Slashing a SlashDot Exchange Part 2 @DBAzine.com
Part 4: Slashing a SlashDot Exchange Part 3 @DBAzine.com
Part 5: Slashing a SlashDot Exchange Part 4 @DBAzine.com
Part 6: Clarifications on a Review of My Book Part 2 @DBDebunk.com

Slashing a Slashdot Exchange - Part 1

(first published @DBAzine.com in 2001)

I was recently contacted by a reporter for an interview. When I expressed my disappointment with the trade media’s tendency to regurgitate vendor marketing claims instead of  assessing them, he admitted "that is what happens about 98 percent of the time", but added "There are some outlets with a good piece from time to time that deal with serious architecture issues", mentioning SlashDot as one of them.

There is, of course, a Catch 22 here: to judge the seriousness of such outlets, foundation and substantive knowledge is necessary in the first place. And, alas, reporters possess even less of it than vendors and users (see, for example, The Ignorance Mechanism, On Trade Media’s "Balance"),
without which sources may appear serious even when they are nothing of the sort. As luck would have it, I ran into a good opportunity to prove this point for SlashDot. It so happened that shortly after my exchange with the journalist, Database Debunkings experienced a sudden ten-fold increase in traffic. Now, [given that my target audience is thinking practitioners,] were my material to suddenly become "hot", I would worry as to where I did go wrong. But the odds for that are rather slim and, fortunately, there was no need for concern: an email from a reader informed me that "there recently was an article posted to SlashDot.org which refers to Dbdebunk.com and Mr. Pascal/Date" and "There [were] some 443 comments to that posting." Such volume is practically always indicative of heat (hot air, to be more precise), rather than light. Ah, well, I thought, yet another source of weekly quotes (as if one was needed).

Saturday, January 8, 2022

NO UNDERSTANDING WITHOUT FOUNDATION KNOWLEDGE PART 2: DEBUNKING AN ONLINE EXCHANGE 1 (obg)



Note: To demonstrate the soundness and stability conferred by a sound theoretical foundation (relative to the industry's fad-driven "cookbook" practices), I am re-publishing as "Oldies But Goodies" material from the old (2000-06) DBDebunk.com, so that you can judge for yourself how well my arguments hold up and whether the industry has progressed beyond the misconceptions those arguments were intended to dispel. In re-publishing I may revise, break into or merge parts and/or add comments and/or references that I enclose in square brackets). 

A 2001 review of my third book triggered an exchange on SlashDot. This six-part series comprises my debunking at the time of both the review and the exchange in the chronological (slightly out of the)  order of the original publication.
Part 1: Clarifications on a Review of My Book Part 1 @DBDebunk.com
Part 2: Slashing a SlashDot Exchange Part 1 @DBAzine.com
Part 3: Slashing a SlashDot Exchange Part 2 @DBAzine.com
Part 4: Slashing a SlashDot Exchange Part 3 @DBAzine.com
Part 5: Slashing a SlashDot Exchange Part 4 @DBAzine.com
Part 6: Clarifications on a Review of My Book Part 2 @DBDebunk.com

Friday, December 17, 2021

NO UNDERSTANDING WITHOUT FOUNDATION KNOWLEDGE PART 2: DEBUNKING A BOOK REVIEW (obg)



Note: To demonstrate the correctness and stability offered by a sound theoretical foundation (relative to the industry's fad-driven "cookbook" practices), I am re-publishing as "Oldies But Goodies" material from the old (2000-06) DBDebunk.com, so that you can judge for yourself how well my arguments hold up and whether the industry has progressed beyond the misconceptions those arguments were intended to dispel. I may revise, break into parts, and/or add comments and/or references, which I enclose in square brackets).

A 2001 review of my third book triggered an exchange on SlashDot. This six-part series comprises my debunking at the time of both the review and the exchange in the chronological (slightly out of the)  order of the original publication.
Part 1: Clarifications on a Review of My Book Part 1 @DBDebunk.com
Part 2: Slashing a SlashDot Exchange Part 1 @DBAzine.com
Part 3: Slashing a SlashDot Exchange Part 2 @DBAzine.com
Part 4: Slashing a SlashDot Exchange Part 3 @DBAzine.com
Part 5: Slashing a SlashDot Exchange Part 4 @DBAzine.com
Part 6: Clarifications on a Review of My Book Part 2 @DBDebunk.com 

Clarifications of a Review of My Book Part 1

(originally posted 1/14/2001)
“Many of us ... do not think that harmony is the great goal, or unity or peacefulness, [and] actually quite like hard questions for their own sake, and enjoy ... the life of the mind. To the question of how to live, the answer is "by disagreement.” --Christopher Hitchens

Let me say, first and foremost, that as the subtitle of the book -- A REFERENCE FOR THE THINKING PRACTITIONER -- indicates, it is targeted at the minority of practitioners who think clearly, independently and critically. It should not be a surprise, then, that those not belonging to that (alas, very small) target audience don't see its practical value. As I said so many times, if my work gained mass appeal, I would wonder what I was doing wrong. This is the sad reality, whether we like it or not. In fact, to be consistent I will go one step further: I don't assume that positive reviews are any better than negative ones -- they are frequently grounded in as faulty reasoning and/or ignorance as the critiques.

Let me also make clear that I do not place all of the blame on  the individual database practitioners or users. Rather, problems are rooted in a systemic, much more profound societal and business culture that fails to instill and encourage foundation knowledge and independent, critical thinking, which not only does not reward, but actually punishes such. This is true to a degree in all societies, of course, but in the US the problem is much more acute (there can hardly be a better demonstration of the horrendous implications of this than how the election was covered, perceived and accepted by most of the press and public) [I wrote this prior to the last two elections -- I leave it to the reader to judge the steepness of the subsequent regress.]

Saturday, December 11, 2021

NOBODY UNDERSTANDS THE RELATIONAL MODEL: SEMANTICS, CLOSURE & CORRECTNESS PART 4



with David McGoveran

(Title inspired by Richard Feynman)

In Parts 1 and Part 2 we provided some clarifications following a discussion on LinkedIn about our contention that, conventional wisdom notwithstanding, database relations -- distinct from mathematical relations -- are by definition not just in 1NF, but also in 5NF, as a consequence of which the RA, as currently defined for 1NF closure, produces what the industry calls "update anomalies" and, thus, is not a proper algebra. In Part 3 we used that information to debunk some leftover misunderstandings in the discussion.

We conclude in Part 4 with comments on a private exchange that followed the public one on LinkedIn regarding the difference between the McGoveran (DMG) and Date and Darwen's (TTM)
interpretations of the RDM, which can be summarized as follows:

Friday, November 19, 2021

THE FATE OF FADS: XML DBMS (obg)



Note: To demonstrate the correctness and stability due to a sound theoretical foundation relative to the industry's fad-driven "cookbook" practices, I am re-publishing as "Oldies But Goodies" material from the old DBDebunk.com (2000-06), so that you can judge for yourself how well my arguments hold up and whether the industry has progressed beyond the misconceptions those arguments were intended to dispel. I may revise, break into parts, and/or add comments and/or references.

Remember XML DBMS? At one point it was the fad of the day, similar to today's NoSQL or the old new "knowledge graph" -- "the future that you ignored at the peril of being left behind". As I predicted, it went the way of all fads (ODBMS, Associative DBMS, you name them) together with their "data models" that were nothing of the sort. My prediction was grounded in the same sound foundations I rely on today -- unlike the industry we are progressing it -- that fads lack and which were and still are dismissed, evidence be damned.

Here's a typical example (comments on republication in square brackets).

Thursday, November 11, 2021

Nobody Understands the Relational Model: Semantics, Relational Closure and Database Correctness Part 2



 with David McGoveran 

(Title inspired by Richard Feynman)

In Part 1 we explained that all database relations are, mathematically, relations, but not all relations are database relations, which are in both 1NF and 5NF and we agreed with a statement in a LinkedIn discussion ending as follows: "Update anomalies are not as big of a problem as an algebra where relations aren't closed under join". Unfortunately, update anomalies, closure, and how relational operators were defined are all interrelated and represent an even "bigger problem". Update anomalies are not "bugs", let alone irrelevant, but actually a reflection of  that much bigger problem.

In this second part we delve into that problem.

Wednesday, October 27, 2021

Nobody Understands the Relational Model: Semantics, Relational Closure and Database Correctness Part 1



 with David McGoveran 

(Title inspired by Richard Feynman)

“As currently defined, relational algebra produces anomalies when applied to non-5NF relations. Since an algebra cannot have anomalies, they should have raised a red flag that RA was not defined quite right, especially defining "relation" as a 1NF table and claiming algebraic closure because 1NF was preserved. Being restricted to tabular representation as the "language" for relationships is like being restricted to arithmetic when doing higher mathematics like differential calculus -- you need more expressive power, not less! Defining RA operations in terms of table manipulations aided initial learning and implementations by making data management look simple and VISUAL. Unfortunately, it was never grasped how much was missing, let alone how much more "intelligent" the RA and the RDBMS needed to be made to fix the problems. And I can see that those oversights were, in part, probably due to having to spend so much time correcting the ignorance in the industry."
--David McGoveran
I recently posted the following Fundamental Truth of the Week on LinkedIn, together with links to more detailed discussions of 1NF and 5NF (see References):
“According to conventional understanding of the RDM (such as it is) [and I don't mean SQL], a relation is in at least first normal form (1NF) -- it has only attributes drawn from simple domains (i.e., no "nested relations") -- the formal way of saying that a relation represents at the logical level an entity group from the conceptual level that has only individual entities -- no groups thereof -- as members. 1NF is required for decidability of the data sublanguage.

However, correctness, namely (1) system-guaranteed logical validity (i.e., query results follow provably from the database) and (2) by-design semantic consistency (of query results with the conceptual model) requires that relations are in both 1NF and fifth normal form (5NF). Formally, the only dependencies that hold in a 5NF relation are functional dependencies of non-key attributes on the PK -- for each PK value there is exactly one value of every corresponding non-key attribute value. This is the formal way of saying that a relation represents facts about a group of entities of a single type.

Therefore we now contend that database relations are BY DEFINITION in both 1NF and 5NF, otherwise all bets are off.”
It triggered a discussion that raised some fundamental issues for which an online exchange is too limiting. This post offers further clarifications, including comments by David McGoveran, on whose interpretation of the RDM (LOGIC FOR SERIOUS DATABASE FOLKS, forthcoming) I rely on. The portions of my interlocutor in the discussion are in quotes.

View My Stats