Friday, September 4, 2020

OBG: Relationships and Relations




Note: To demonstrate the stability afforded by a sound foundation relative to the industry's fad-driven cookbook practices, I am re-publishing under "Oldies But Goodies" material from the old DBDebunk.com (2000-06), so that you can judge for yourself how well my debunkings hold up, and whether the industry has progressed beyond the misconceptions they were intended to dispel. I may break long pieces into more pointed parts, and add comments and references to further reading.

From "Little Relationship to Relational" originally posted on March 29, 2001.

“Given the depth and complexity of Codd's thought, not to mention the arcane terms in which he often expressed himself, it is not difficult to grasp why so many of his key points have been widely misunderstood. Even programmers still often misconstrue the technical term "relational". The relational in relational theory refers to relations and not relationships. A relation is a special set of similar objects commonly modeled as entities or as database tables. Relationships may exist between these relations and if your relations are entities you could easily represent the whole thing using a Relational Entity Relationship approach. To elucidate a simple practical example, if you had a company table and an employee table and each company record could have many employee records associated with it, you would have two relations and one relationship. The relations would be the sets of similar objects found in the Employee and Company tables and the relationship would be the association between them. In this case one company to many employees.”
Codd's thought was very deep indeed--new implications are still being derived from his original ideas--and one major objective of relational technology, now almost forgotten, is simplicity. There is little that is complex in relational technology and, in fact, it is the most simple approach possible. Any other general approach is more complex.

It is true that Codd, as a mathematician, did not present his ideas in a way comprehensible to the average practitioner. But it is also true that he had to use different terminology in order to distinguish his precise concepts from the fuzzy, problematic terms already used in the industry. It is also true that, as I argued in the first editorial launching this site, practitioners are so steeped in complex implementation details and devoid of education in fundamentals, that they have a hard time understanding simple logical concepts. It is rather ironic that the author of the article himself reveals some misunderstanding of his own. To clarify:

  • formally a relation is a set of tuples, representing propositions about the real world.
  • informally, a relational table can be viewed as representing an "entity type", with rows representing "entities" of that type.
But note carefully that:
  • "entity" has no precise, formal definition
  • "relationship" can and should be regarded as a special case of "entity"

Comments on re-publication: 
  • A relation is a relationship among domains that is constrained semantically to represent in the database real world relationships within and among entity groups. 
  • We no longer use R-table as a substitute for relation -- it is a visualization of a relation on some physical medium that plays no role in RDM. Note that constraints are not visible in a R-table.
  • A relationship can be (1) among entities within an entity group, in which case it is a collective property of the group and is represented by a constraint or (2) between groups, in which case it is represented by an associative relation.


Further Reading

The Interpretation and Representation of Database Relations (Codd 1969-70)

Logical Symmetric Access, Data Sublanguage, Kinds of Relations, Redundancy and Consistency (Codd 1969-70)

What Relations Really Are and Why They Are Important

Understanding Relations series

Levels of Representation: Conceptual Modeling, Logical Database Design and Physical Implementation

Understanding Conceptual vs. Data Modeling series

Conceptual Modeling Is Not Data Modeling

Relationships and the RDM series

Relations & Relationships

Relationships, Rules, Relations and Constraints


What Is A Database Relationship




Friday, August 28, 2020

TYFK: Denormalization Does Not Have Fundamentals



Each "Test Your Foundation Knowledge" post presents one or more misconceptions about data fundamentals. To test your knowledge, first try to detect them, then proceed to read our debunking, which is based on the current understanding of the RDM, distinct from whatever has passed for it in the industry to date. If there isn't a match, you can acquire the knowledge by checking out our POSTS, BOOKS, PAPERS, LINKS (or, better, organize one of our on-site SEMINARS, which can be customized to specific needs).
 
  ““Main Question: How do we trade-off while doing denormalization? 
  • Sub-question 1: the standard to implement
- Do we always have to denormalize a model? For what kind of project must we use denormalization techniques while others may not?
- Since denormalization has its gains and losses, how well should we denormalize a data model? Perhaps, the more complete we denormalize, the more complex, uncertain and poor the situation will be.
  • Sub-question 2: the characteristics of normalization
-Does denormalization have several levels/forms the same as that of normalization? For instance: 1DNF, 2DNF...
- Given we can denormalize a data model, it may never be restored to the original one because to do normalization, one can have many ways while to build a data model, you can have multiple choices in determining entities, attributes, etc.””

In Part 1 we discuss the relevant fundamentals in which we will ground the debunking in Part 2.

Thursday, August 20, 2020

TYFK: Relations, Tables, Domains and Normalization



Each "Test Your Foundation Knowledge" post presents one or more misconceptions about data fundamentals. To test your knowledge, first try to detect them, then proceed to read our debunking, which is based on the current understanding of the RDM, distinct from whatever has passed for it in the industry to date. If there isn't a match, you can acquire the knowledge by checking out our POSTS, BOOKS, PAPERS, LINKS (or, better, organize one of our on-site SEMINARS, which can be customized to specific needs).

“The most popular data model in DBMS is the Relational Model. It is more scientific a model than others. This model is based on first-order predicate logic and defines a table as an n-ary relation. The main highlights of this model are:

  • Data is stored in tables called relations.
  • Relations can be normalized. In normalized relations, values saved are atomic values.
  • Each row in a relation contains a unique value.
  • Each column in a relation contains values from a same domain.”

Friday, August 7, 2020

OBG: Data Models and Physical Independence




Note: To appreciate the stability of a sound foundation vs the industry's fad-driven cookbook practice, I am re-publishing some of the articles and reader exchanges from the old DBDebunk.com (2000-06), giving you the opportunity to judge for yourself how well my claims/arguments hold up and whether the industry has progressed at all. I am adding comments on re-publication where necessary. Long pieces are broken into smaller parts for fast reading.

From "Little Relationship to Relational" originally posted on March 29, 2001.

 
“E.F. ("Ted") Codd conceived of his relational model for databases while working at IBM in 1969. Codd's approach took a clue from first-order predicate logic, the basis of a large number of other mathematical systems and presented itself [sic] in terms of set theory, leaving the physical definition of the data undefined and implementation dependent. In June of 1970, Codd laid down much of his extensive groundwork for the model in his article, "A Relational Model of Data for Large Shared Data Banks" published in the Communications of the ACM, a highly regarded professional journal published by the Association for Computing Machinery. Buoyed by an intense reaction against the ad hoc data models offered by the physically oriented mainframe databases, Codd's rigid separation of the logical model, with its rigorous mathematical underpinnings, from the less elegant realities of hardware engineering was revolutionary in its day. Codd and his relational ideas blazed across the academic computing landscape over the next few years.”

Monday, July 20, 2020

OBG: Data Independence and "Physical Denormalization"




Note: I am re-publishing some of the articles and reader exchanges from the old DBDebunk (2000-06). How well do they hold up -- have industry knowledge and practice progressed? Judge for yourself and appreciate the difference between a sound foundation and the fad-driven cookbook approach.


January 2, 2001

  "... one of the "4 great lies" is "I denormalize for performance." You state that normalization is a logical concept and, since performance is a physical concept, denormalization for performance reasons is impossible (i.e., it doesn't make sense). What term would you use to describe changing the physical database design to be different from the logical design to enhance performance? Because normalization is a logical concept, you imply that this is not called denormalization."

Sunday, June 28, 2020

TYFK: Misconceptions About the Relational Model



“The most popular data model in DBMS is the Relational Model. It is more scientific a model than others. This model is based on first-order predicate logic and defines a table as an n-ary relation. The main highlights of this model are:
  • Data is stored in tables called relations.
  • Relations can be normalized, [in which case] values saved are atomic values.
  • Each row in a relation contains a unique value.
  • Each column in a relation contains values from a same domain.”

Each "Test Your Foundation Knowledge" post presents one or more misconceptions about data fundamentals. To test your knowledge, first try to detect them, then proceed to read our debunking, which is based on the current understanding of the RDM, distinct from whatever has passed for it in the industry to date. If there isn't a match, you can acquire the knowledge by checking out our POSTS, BOOKS, PAPERS, LINKS (or, better, organize one of our on-site SEMINARS, which can be customized to specific needs).
View My Stats