Note: This a revision of an earlier post
RDM is an application to database management of mathematical relation theory (MRT) consistent with simple set theory (SST) expressible in first order predicate logic (SST/FOPL) that is used to formalize symbolically conceptual models of reality as logical models for database representation.
In RDM a domain can "appear" in multiple relations: the domain represents an abstract property, attributes defined on it represent that property in contexts of specific entity groups that relations represent. For example, attribute SALARY in relation EMPLOYEES represents the property represented by domain MONEY in the context of entities of type Employee and attribute BUDGET in relation DEPARTMENTS represents it in the context of entities of type Department.
Thursday, August 17, 2023
ENTITIES, PROPERTIES AND CODD'S SLEIGHT OF HAND
Monday, June 19, 2023
PREDICATE LOGIC, SEMANTICS AND RDM (sms)
Note: In "Setting Matters Straight" posts I debunk online pronouncements that involve fundamentals which I first post on LinkedIn. The purpose is to induce practitioners to test their foundation knowledge against our debunking, where we explain what is correct and what is fallacious. For in-depth treatments check out the POSTS and our PAPERS, LINKS and BOOKS (or organize one of our on-site/online SEMINARS, which can be customized to specific needs). Questions and comments are welcome here and on LinkedIn.
Assessing such arguments normally requires clarification of what exactly is meant by "the relational model". Ross does refer specifically to the "original" -- which we take to mean that introduced by Codd in 1969-70 -- but given the massive misuse and abuse in the industry, perceptions of it may well be corrupted (Nobody Understands the Relational Model Semantics, Relational Closure and Database Correctness). Moreover, there are many predicate logic (PL) systems and many ways of categorizing them (1st vs n-th order being only one way) -- we assume Ross means RDM is based on none.“As I have said many times, if the original relational model had been based on predicate logic and also the semantics and rules of definitions we'd all be better off now. It wasn't. Full stop.”--Ronald Ross, LinkedIn.com
Monday, January 2, 2023
NEW "DATA MODELS" 5.2 (t&n)
Note: "Then & Now" (T&N) is a new version of what used to be the "Oldies but Goodies" (OBG) series. To demonstrate the superiority of a sound theoretical foundation relative to the industry's fad-driven "cookbook" practices, as well as the evolution/progress of RDM, I am re-visiting my 2000-06 debunkings, bringing them up to my with my knowledge and understanding of today. This will enable you to judge how well my arguments have held up and appreciate the increasing gap between scientific progress and the industry’s stagnation, if not outright regress.
This is a re-published series of several DBDebunk 2002 posts on Simon Wlliams' Lazy Software so-called "Associative Model of Data" (AMD), academic claims of its superiority over RDM ("The Associative Data Model Versus the Relational model") and predictions of the demise of the latter ("The decline and eventual demise of the Relational Model of Data").
- Part 1 was an email exchange among myself (FP), Chris Date (CJD) and Lee Fesperman (LF) in reaction to Williams' claims that triggered the series.
- Part 2 was my response to a reader's email questioning our dismissal of Williams's claims.
- Part 3 was my email exchange with Williams where he provided his definition of a data model on which I conditioned any discussion with him and I debunked it.
- Part 4 is my response to a reader's comments on my previous posts in the series.
- Part 5.1 provided the background for my critique of Edward Hurley's report on Simon Williams's Lazy Software and his so-called "Associative Model of Data" (AMD).
Thursday, November 10, 2022
NEW "DATA MODELS" 4 (t&n)
Note: "Then & Now" (T&N) is a new version of what used to be the "Oldies but Goodies" (OBG) series. To demonstrate the superiority of a sound theoretical foundation relative to the industry's fad-driven "cookbook" practices, as well as the evolution/progress of RDM, I am re-visiting my 2000-06 debunkings, bringing them up to my with my knowledge and understanding of today. This will enable you to judge how well my arguments have held up and appreciate the increasing gap between scientific progress and the industry’s stagnation, if not outright regress.
This is a re-published series of several DBDebunk 2001 exchanges on Simon Wlliams' so-called "Associative Model of Data" (AMD), academic claims of its superiority over RDM ("The Associative Data Model Versus the Relational model") and predictions of the demise of the latter ("The decline and eventual demise of the Relational Model of Data").
Part 1 was an email exchange among myself (FP), Chris Date (CJD) and Lee Fesperman (LF) in reaction to Williams' claims that started the series. Part 2 was my response to a reader's email questioning our dismissal of Williams's claims. Part 3 was my email exchange with Williams where he provided his definition of a data model on which I conditioned any discussion with him and I debunked it. Part 4 is my response to a reader's comments on my previous posts in the series.
Saturday, October 29, 2022
NEW "DATA MODELS" 3 (t&n)
Note: "Then & Now" (T&N) is a new version of what used to be the "Oldies but Goodies" (OBG) series. To demonstrate the superiority of a sound theoretical foundation relative to the industry's fad-driven "cookbook" practices, as well as the evolution/progress of RDM, I am re-visiting my 2000-06 debunkings, bringing them up to my with my knowledge and understanding of today. This will enable you to judge how well my arguments have held up and appreciate the increasing gap between scientific progress and the industry’s stagnation, if not outright regress.
This is a re-published series of several DBDebunk 2001 exchanges on Simon Wlliams' so-called "Associative Model of Data" (AMD), academic claims of its superiority over RDM ("The Associative Data Model Versus the Relational model") and predictions of the demise of the latter ("The decline and eventual demise of the Relational Model of Data").
Part 1 was the email exchange among myself (FP), Chris Date (CJD) and Lee Fesperman (LF) in reaction to Williams' claims that started the series. Part 2 was my response to a reader's email questioning our dismissal of Williams's claims. Part 3 is my email exchange with Williams: he provided his "definition" of a data model on which I conditioned any discussion with him and I proved my point by debunking it.
Saturday, October 8, 2022
NOBODY UNDERSTANDS NORMALIZATION 1 (sms)
Note: In "Setting Matters Straight" posts I debunk online pronouncements that involve fundamentals which I first post on LinkedIn. The purpose is to induce practitioners to test their foundation knowledge against our debunking, where we explain what is correct and what is fallacious. For in-depth treatments check out the POSTS and our PAPERS, LINKS and BOOKS (or organize one of our on-site/online SEMINARS, which can be customized to specific needs). Questions and comments are welcome here and on LinkedIn.
What's right/wrong with this database picture?
“Normalization in relational databases is a design process that minimizes data redundancy and avoids update anomalies. Basically, you want each piece of information to be stored exactly once; if the information changes, you only have to update it in one place. The theory of normal forms gives rigorous meaning to these informal concepts. There are many normal forms. In this article, we’ll review the most basic:
First normal form (1NF)
Second normal form (2NF)
Third normal form (3NF)
There are normal forms higher than 3NF, but in practice you usually normalize your database to the third normal form or to the Boyce-Codd normal form, which we won’t cover here.”
--Vertabelo.com
Saturday, June 11, 2022
ORDER & RELATIONAL DATABASES (sms)
Note: In "Setting Matters Straight" I post on LinkedIn online Q&As that involve fundamentals under the header "What's Right and Wrong with this Database Picture" and then debunk them here. The purpose is to induce practitioners to test their foundation knowledge against our debunking, where we explain what is correct and what is fallacious. For in-depth treatments check out the POSTS and our PAPERS, LINKS and BOOKS (or organize one of our on-site/online SEMINARS, which can be customized to specific needs). Questions and comments are welcome here and on LinkedIn.
Q: “I'm not sure what this means: "The order of the rows and columns is immaterial to the DBMS?" -- could anyone explain?”
A: “It means two things:
The engine is under no obligation to insert new rows immediately following the previously inserted row(s)... During processing of selects, the optimizer is free to use any index it finds efficient to use or none at all... For this reason, if the order of returned data is important to your processing, then you must include an ORDER BY clause.”
Q: “How do you reorder fields in the database?”
A: “Depends on how you define "reorder". What view of your data are you trying to set the order. Are you in Table Design view? ... Are you looking at form? The answer is different depending on what you are referring to.”--Quora.com
Monday, April 25, 2022
RELATIONAL DATABASES & SET THEORY (sms)
Note: "Setting Matters Straight" is a new format: I post on LinkedIn an online Q&A involving data fundamentals that I subsequently debunk in a post here. This is to encourage readers to test their foundation knowledge against our debunking here, where we confirm what is correct and correct what is fallacious. For in-depth treatments check out the POSTS and our PAPERS, LINKS and BOOKS (or organize one of our on-site/online SEMINARS, which can be customized to specific needs). Questions and comments are welcome here and on LinkedIn.
Q: “To what extent is relational database theory related to set theory?”
A: “Relational database theory is indeed closely derived from set theory. Many operations in relational data are directly related to common operations one does with sets. In fact, SQL has keywords for them that should sound familiar to someone who has just taken a class in Discrete Mathematics:Even the structure of a table is set-oriented. A table is a set of rows, and a row is a set of columns, and those columns must match the set of columns defined in the table's header.”
- UNION
- INTERSECT
- DIFFERENCE (called MINUS in Oracle)
--Quora.com
Saturday, January 8, 2022
NO UNDERSTANDING WITHOUT FOUNDATION KNOWLEDGE PART 2: DEBUNKING AN ONLINE EXCHANGE 1 (obg)
Note: To demonstrate the soundness and stability conferred by a sound theoretical foundation (relative to the industry's fad-driven "cookbook" practices), I am re-publishing as "Oldies But Goodies" material from the old (2000-06) DBDebunk.com, so that you can judge for yourself how well my arguments hold up and whether the industry has progressed beyond the misconceptions those arguments were intended to dispel. In re-publishing I may revise, break into or merge parts and/or add comments and/or references that I enclose in square brackets).
Friday, December 17, 2021
NO UNDERSTANDING WITHOUT FOUNDATION KNOWLEDGE PART 2: DEBUNKING A BOOK REVIEW (obg)
Note: To demonstrate the correctness and stability offered by a sound theoretical foundation (relative to the industry's fad-driven "cookbook" practices), I am re-publishing as "Oldies But Goodies" material from the old (2000-06) DBDebunk.com, so that you can judge for yourself how well my arguments hold up and whether the industry has progressed beyond the misconceptions those arguments were intended to dispel. I may revise, break into parts, and/or add comments and/or references, which I enclose in square brackets).
Clarifications of a Review of My Book Part 1
“Many of us ... do not think that harmony is the great goal, or unity or peacefulness, [and] actually quite like hard questions for their own sake, and enjoy ... the life of the mind. To the question of how to live, the answer is "by disagreement.” --Christopher Hitchens
Let me say, first and foremost, that as the subtitle of the book -- A REFERENCE FOR THE THINKING PRACTITIONER -- indicates, it is targeted at the minority of practitioners who think clearly, independently and critically. It should not be a surprise, then, that those not belonging to that (alas, very small) target audience don't see its practical value. As I said so many times, if my work gained mass appeal, I would wonder what I was doing wrong. This is the sad reality, whether we like it or not. In fact, to be consistent I will go one step further: I don't assume that positive reviews are any better than negative ones -- they are frequently grounded in as faulty reasoning and/or ignorance as the critiques.
Let me also make clear that I do not place all of the blame on the individual database practitioners or users. Rather, problems are rooted in a systemic, much more profound societal and business culture that fails to instill and encourage foundation knowledge and independent, critical thinking, which not only does not reward, but actually punishes such. This is true to a degree in all societies, of course, but in the US the problem is much more acute (there can hardly be a better demonstration of the horrendous implications of this than how the election was covered, perceived and accepted by most of the press and public) [I wrote this prior to the last two elections -- I leave it to the reader to judge the steepness of the subsequent regress.]
Saturday, December 11, 2021
NOBODY UNDERSTANDS THE RELATIONAL MODEL: SEMANTICS, CLOSURE & CORRECTNESS PART 4
with David McGoveran
(Title inspired by Richard Feynman)
In Parts 1 and Part 2 we provided some clarifications following a discussion on LinkedIn about our contention that, conventional wisdom notwithstanding, database relations -- distinct from mathematical relations -- are by definition not just in 1NF, but also in 5NF, as a consequence of which the RA, as currently defined for 1NF closure, produces what the industry calls "update anomalies" and, thus, is not a proper algebra. In Part 3 we used that information to debunk some leftover misunderstandings in the discussion.
We conclude in Part 4 with comments on a private exchange that followed the public one on LinkedIn regarding the difference between the McGoveran (DMG) and Date and Darwen's (TTM)
interpretations of the RDM, which can be summarized as follows:
Friday, November 19, 2021
THE FATE OF FADS: XML DBMS (obg)
Note: To demonstrate the correctness and stability due to a sound theoretical foundation relative to the industry's fad-driven "cookbook" practices, I am re-publishing as "Oldies But Goodies" material from the old DBDebunk.com (2000-06), so that you can judge for yourself how well my arguments hold up and whether the industry has progressed beyond the misconceptions those arguments were intended to dispel. I may revise, break into parts, and/or add comments and/or references.
Remember XML DBMS? At one point it was the fad of the day, similar to today's NoSQL or the old new "knowledge graph" -- "the future that you ignored at the peril of being left behind". As I predicted, it went the way of all fads (ODBMS, Associative DBMS, you name them) together with their "data models" that were nothing of the sort. My prediction was grounded in the same sound foundations I rely on today -- unlike the industry we are progressing it -- that fads lack and which were and still are dismissed, evidence be damned.
Here's a typical example (comments on republication in square brackets).
Thursday, November 11, 2021
Nobody Understands the Relational Model: Semantics, Relational Closure and Database Correctness Part 2
with David McGoveran
(Title inspired by Richard Feynman)
In Part 1 we explained that all database relations are, mathematically, relations, but not all relations are database relations, which are in both 1NF and 5NF and we agreed with a statement in a LinkedIn discussion ending as follows: "Update anomalies are not as big of a problem as an algebra where relations aren't closed under join". Unfortunately, update anomalies, closure, and how relational operators were defined are all interrelated and represent an even "bigger problem". Update anomalies are not "bugs", let alone irrelevant, but actually a reflection of that much bigger problem.
In this second part we delve into that problem.
Wednesday, October 27, 2021
Nobody Understands the Relational Model: Semantics, Relational Closure and Database Correctness Part 1
with David McGoveran
(Title inspired by Richard Feynman)
I recently posted the following Fundamental Truth of the Week on LinkedIn, together with links to more detailed discussions of 1NF and 5NF (see References):“As currently defined, relational algebra produces anomalies when applied to non-5NF relations. Since an algebra cannot have anomalies, they should have raised a red flag that RA was not defined quite right, especially defining "relation" as a 1NF table and claiming algebraic closure because 1NF was preserved. Being restricted to tabular representation as the "language" for relationships is like being restricted to arithmetic when doing higher mathematics like differential calculus -- you need more expressive power, not less! Defining RA operations in terms of table manipulations aided initial learning and implementations by making data management look simple and VISUAL. Unfortunately, it was never grasped how much was missing, let alone how much more "intelligent" the RA and the RDBMS needed to be made to fix the problems. And I can see that those oversights were, in part, probably due to having to spend so much time correcting the ignorance in the industry."--David McGoveran
“According to conventional understanding of the RDM (such as it is) [and I don't mean SQL], a relation is in at least first normal form (1NF) -- it has only attributes drawn from simple domains (i.e., no "nested relations") -- the formal way of saying that a relation represents at the logical level an entity group from the conceptual level that has only individual entities -- no groups thereof -- as members. 1NF is required for decidability of the data sublanguage.It triggered a discussion that raised some fundamental issues for which an online exchange is too limiting. This post offers further clarifications, including comments by David McGoveran, on whose interpretation of the RDM (LOGIC FOR SERIOUS DATABASE FOLKS, forthcoming) I rely on. The portions of my interlocutor in the discussion are in quotes.
However, correctness, namely (1) system-guaranteed logical validity (i.e., query results follow provably from the database) and (2) by-design semantic consistency (of query results with the conceptual model) requires that relations are in both 1NF and fifth normal form (5NF). Formally, the only dependencies that hold in a 5NF relation are functional dependencies of non-key attributes on the PK -- for each PK value there is exactly one value of every corresponding non-key attribute value. This is the formal way of saying that a relation represents facts about a group of entities of a single type.
Therefore we now contend that database relations are BY DEFINITION in both 1NF and 5NF, otherwise all bets are off.”
Sunday, September 19, 2021
TYFK: Calculated Attributes -- Redundancy, Full Normalization and Relational Theory
Note: Each "Test Your Foundation Knowledge" post presents one or more misconceptions about data fundamentals. To test your knowledge, first try to detect them, then proceed to read our debunking, reflecting the current understanding of the RDM, distinct from whatever has passed for it in the industry to date. If there isn't a match, you can review references -- reflecting the current understanding of the RDM, distinct from whatever has passed for it in the industry to date -- which explain and correct the misconceptions. You can acquire further knowledge by checking out our POSTS, BOOKS, PAPERS, LINKS (or, better, organize one of our on-site SEMINARS, which can be customized to specific needs).
“If you have shopping cart, you probably have some field "TOTAL" somewhere that stores the final amount due for the customer. It so happens that such a thing violates relational theory...”
“Having a "TOTAL" field in your "order" table *might* violate relational theory, but if you make it so that only a trigger can update it based on what's in your "order_item" table, then I think it's fine. You still get data integrity and that is what matters.”
“I still fail to see what you mean by the "calculated TOTALS field" (attribute, really) violates the Relational Model.”
“The result of having the field ... is what is called a DELETE ANOMALY.”
“Most denormalizing means adding columns to tables that provide values you would otherwise have to calculate as needed.”
“There are four practical problems with a fully normalized database, three of which I have listed before. I will list them all here for completeness:
* No calculated values. Calculated values are a fact of life for all applications, but a normalized database lacks them. The burden of providing calculated values must be taken up by somebody somehow. Denormalization is one approach to this, though there are others.”--Database Programmer blog
Do calculated attributes (not fields!) violate relational theory and must be "normalized" out of them? Determining that requires foundation knowledge that is scarce in the industry, which has a poor and outdated understanding of the RDM.“...I'm now working with IT to normalize part of the database to remove calculated fields...:
`lineitems`.`extended total` = `lineitems`.`units` * `biditems`.`price`.
`jobs`.`jobvalue` = the sum of related `lineitems`.`extended total` records
`orders`.`ordervalue` = the sum of related `jobs`.`jobvalue` records.”--mySQL.com
Thursday, August 5, 2021
TYFK: Facts, Properties, Relationships, Domains, Relations, Tuples
Note: Each "Test Your Foundation Knowledge" post presents one or more misconceptions about data fundamentals. To test your knowledge, first try to detect them, then proceed to read our debunking, reflecting the current understanding of the RDM, distinct from whatever has passed for it in the industry to date. If there isn't a match, you can review references -- reflecting the current understanding of the RDM, distinct from whatever has passed for it in the industry to date -- which explain and correct the misconceptions. You can acquire further knowledge by checking out our POSTS, BOOKS, PAPERS, LINKS (or, better, organize one of our on-site SEMINARS, which can be customized to specific needs).
A statement from a 1986 book that "Data are facts represented by values -- numbers, character strings, or symbols -- which carry meaning in a certain context" triggered the following response on Linkedin:“...In contrast, Date and Darwen (2000) say:To which the book author responded:Thus, the declarative sentence "Fred is in the kitchen." is a fact that links the domains Person[s] and Place[s] with the predicate "is in". The complete relation might be made up of three facts:
- Domains are the things that we can talk about.
- Relations are the truths we utter about those things.
This seems to be more precise than the 1986 statement.”
- Fred is in the kitchen.
- Mary is in the garden.
- Arthur is in the garden.
“...back then we did not have the refinement, clarity, nor precision from people like Sjir Nijssen and Terry Halpin regarding facts, or elementary fact sentences, which today you and I know are the bedrock of data modeling. Facts are expressed in sentences (with domains and predicates).”
Unfortunately none of this is sufficiently clear and precise to prevent confusion and it inhibits understanding of the RDM.
Friday, March 19, 2021
Data Sublanguages vs. Programming Languages
Revised 3/20/21
I recently came across a review of Edsger Dijkstra's work, where the following comment on a book he co-authored (referred to as D&S) raised my debunking antennae:
“... in general computer people seem to have a penchant for whipping up homebrew logics ... D&S is not the only example ... See E.F. Codd’s Relational Calculus, an obvious mess.”
--Maarten van Emden, A Bridge too Far: E.W. Dijkstra and Logic
Having recently argued that "Codd was wrong" and "You're teaching [his] gospel" Betray Lack of Foundation Knowledge, my suspicion should hardly surprise. Besides, criticism of Dijkstra is a very tall order in itself, particularly in the context of disputes among logicians). As a reader asked, "What’s so obviously messy in Codd’s Relational Calculus?". Answer:
Saturday, January 23, 2021
"Codd was wrong" and "You're teaching the gospel" Betray Lack of Foundation Knowledge
Note: I have documented and debunked these misconceptions so many times that I will no longer reference them -- the reader motivated to gain genuine understanding should use the (1) blog labels (2) Blogger search (3) POSTS page to locate the relevant posts.
I have long claimed that a core problem in the industry is the vast majority of practitioners who use relational terminology, do not know/understand what it means, yet are convinced they do -- the less the understanding, the greater the convinction. A recent LinkedIn exchange provided -- as if it were needed -- yet another example. It was triggered by my comment:
“How many know today that a relation is by definition in 5NF, otherwise it's not a relation, the relational algebra has "anomalies" and all bets are off? IMO, none! If you need to "do" normalization, you did not design correctly, which means you don't understand the RDM.”that prompted the following reaction:
“Is that really true? You construct a table and fill it full of garbage. It may not even be in 1NF, but is it not still a "relation" of columns, even if it's not a relation of rows or attributes? Codd had no real conception of syntax as separate from semantics, I don't think relational theory has a clear position on this. This is where Kimball and dimensional systems differ from Codd's relational, it made some effort (not a lot) to distinguish syntactic and semantic elements.”--Joshua Stern
Saturday, October 17, 2020
Understanding Codd's 12 Rules for RDBMS
In response to an online publication of a book appendix regurgitating Codd's 12 famous rules (some of which were, typically, incorrect[1]) I posted earlier a clarification of the rules. This is a revision thereof for better consistency with the new understanding of the RDM based on McGoveran's re-interpretation, extension and formalization[2] of Codd's work.
Saturday, September 26, 2020
TYFK: Nothing to Do With Relational
Each "Test Your Foundation Knowledge" post presents one or more misconceptions about data fundamentals. To test your knowledge, first try to detect them, then proceed to read our debunking, which is based on the current understanding of the RDM, distinct from whatever has passed for it in the industry to date. If there isn't a match, you can acquire the knowledge by checking out our POSTS, BOOKS, PAPERS, LINKS (or, better, organize one of our on-site SEMINARS, which can be customized to specific needs).
“The relational calculus is good in describing sets. But it´s bad at describing relations between data in different sets. Explicit identities (primary keys) need to be introduced and normalization is needed to avoid update inconsistencies due to duplication of data. To say it somewhat bluntly: The problem with the relational calculus and RDBMS etc. is the focus on data. It´s seems to be so important to store the data, that connecting the data moves to the background. That might be close to how we store filled in paper forms. But it´s so unlike how the mind works. There is no data stored in your brain. If you look at the fridge in your kitchen, there is no tiny fridge created in your brain so you can take the memory of your fridge with you, when you leave your kitchen.” --Weblogs.asp.net
The lack of foundation knowledge exposed by the above paragraph is so complete that its claims are practically upside down and backwards.
Fundamentals
As
we have demonstrated, in mathematical set theory a relation (set) is a
subset of a cross-product of domains (sets). In other words, it is a set
that is a relationship among sets. Being abstract (i.e., having no
real world meaning), the values of mathematical relations can be
arbitrary.
The RDM is an application of simple set theory
expressible in first order predicate logic (SST/FOPL) to database
management: a relational database represents a conceptual model of some
reality, namely (facts about) a multigroup in the real world -- a
collection of related entity groups -- each database relation
representing one such group; a database is also a set of related relations. The values in database relations (i.e., the
data) are, thus, not arbitrary, but must be consistent with the conceptual
model: relations and the database as a whole are semantically
constrainted to be so consistent: (1) individual properties of entities
and (2) collective properties of (a) groups (i.e., relationships among
entities within groups), and (b) the multigroup (i.e., relationships
among groups).
A primary key (PK) represents names given in the
real world to entities of a given type, and the corresponding PK
constraint (uniqueness) enforces consistency of a relation with the
distinguishability of those entities in the real world, the facts about which it represents. These are not
RDM artifacts, but rather part of the adaptation of SST/FOPL to database management.
For the primary advantage of
the RDM -- guaranteed correctness of query results (i.e., inferences
made from the database) -- to materialize, logical database design must
adhere to three core principles which, jointly, imply fully normalized
relations (5NF). In fact, in RDM relations are in 5NF by definition,
otherwise they are not relations -- relational algebra (RA) operations lose information and
all bets are off.
The RA is the manipulative
component of the RDM -- a collection of primitive and derived set
operations on relations that describe
relationships among relations. For example, the join operation r1 JOIN
r2 describes a relationship between r1 and r2 relation, the result itself a relation. Note that since every result of a RA operation on even one relation is always a relation and still describes a relationship -- between the "input" and "output" relations.
A data model -- and, industry claims notwithstanding,
the only one satisfying Codd's definition that has been formalized is the
RDM -- is by nature focused on data. However, the RDM supports
physical independence (PI) and, thus, not concerned with how data is
physically stored and accessed. The notion of "files stored in paper
form" is an example of the common and entrenched logical-physical
confusion (LPC) due to failure to understand the distinction between a
logical relation and its tabular visualization on a physical medium,
induced/reinforced by the industry's "direct image" implementation of
SQL DBMSs.
Conclusion
We rephrase the above paragraph as follows:
“The relational algebra describes relationships among relations (sets). Primary keys are one of the adaptations of the SST/FOPL for database management: a PK constraint -- uniqueness -- represents formally in the database a within-group relationship among all its entities.
Mandatory adherence to three core design principles jointly imply full normalization, which is necessary to guarantees correctness of query results. True RDBMSs:
- Implement the RA for logical data retrieval independent of how the data is physically stored and accessed. SQL DBMSs notwithstanding, vendors are free to store data whichever way they want as long as they don't expose it to users in applications.
- Enforce relational constraints that are formal database representations of relationships in the conceptual model represented by the database.”
The "brain" stuff is sheer nonsense.