“Fabian - With respect, maybe it's time to 'shake the formal foundations' of data management, especially given the rising costs and increasing segregation of silos.”
“John, if I were to say what I really think, I would be accused of insulting, so I won't. You don't need to respect me, but you better respect formal foundations. Since they are what gives SOUNDNESS to data management practice, what you are really saying is that you don't care about soundness -- do you really intend to take this position? I would not be surprised, because the industry has long "shook" the formal foundations and lack of soundness is precisely what characterizes it. But because there is no longer proper education, practitioners are totally unaware of the relationship between formal foundations and soundness, everything is ad-hoc and arbitrary, yet they fail to recognize the consequences.”[1]
--LinkedIn.com
Thus went an exchange with John Gorman on LinkedIn, in which he posed several questions (which I answered in last week's post[2]). The subject was the importance of not confusing levels of representation and, more specifically, of avoiding conceptual-logical conflation (CLC)[3].
Somebody posted a link to my answers on LinkedIn and, in a comment on it, John linked to a Richard Feynman YouTube lecture on "the general differences between the interests and customs of the mathematicians and the physicists". I responded that my very point is that, just as physics is not the mathematics used to describe it (a central issue in quantum mechanics), conceptual modeling is not data modeling: the latter is the representation of the former in the database, and the two are distinct[2]. This brought to mind some older columns I published on the All Analytics website, which no longer exists, so this series is a revision thereof.
Mathematical relations are abstractions (i.e., devoid of any real world meaning), and, thus, can contain arbitrary data, and we can arbitrarily apply any operation of the relational algebra (RA) to them. For example, given the two relations A and B:

A
100   26150   ...
110   38170   ...
120   37950   ...
130   33800   ...
140   35420   ...
150   30280   ...
160   27250   ...
290   15340   ...
310   15900   ...

B
...   100   06-19-1980   ...
...   110   05-16-1958   ...
...   120   12-05-1963   ...
...   130   07-28-1971   ...
...   140   12-15-1976   ...
...   150   02-12-1972   ...
...   160   10-11-1977   ...
...   290   05-30-1980   ...
...   310   09-12-1964   ...

some subset of the Cartesian product of A with the projection of B on the second attribute (the product being all the possible combinations of each tuple of A with every tuple of the projection of B) yields a relation, the attributes of which are those of A and the second attribute of B, and the tuples of which are a subset of the tuples of the cross product. In mathematics the result is meaningless with respect to the real world.
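The projection and Cartesian product described above can be sketched in Python, treating a relation as a set of tuples. This is only an illustration under stated assumptions: the helper names `project` and `cartesian_product` are hypothetical, only the first three tuples of A and B are used, and the elided ("...") attributes are left out, so the date sits at position 1 of the reduced B.

```python
from itertools import product

# Relations as sets of tuples. Only the attributes shown in the text are
# modeled; the elided "..." attributes are omitted (an assumption).
A = {(100, 26150), (110, 38170), (120, 37950)}
B = {(100, "06-19-1980"), (110, "05-16-1958"), (120, "12-05-1963")}


def project(relation, *positions):
    """Keep only the attributes at the given positions; duplicates collapse."""
    return {tuple(t[i] for i in positions) for t in relation}


def cartesian_product(r, s):
    """Concatenate every tuple of r with every tuple of s."""
    return {a + b for a, b in product(r, s)}


# Projection of B on its date attribute (position 1 in this reduced form).
B_dates = project(B, 1)

# Every tuple of A combined with every date: 3 x 3 = 9 tuples.
AxB = cartesian_product(A, B_dates)
```

Nothing in the code prevents the operation: as pure set manipulation it is perfectly well defined, whatever the values happen to be.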
But the RDM is applied relation theory: simple set theory (SST) expressible in first-order predicate logic (FOPL), adjusted for applicability to database management. Database relations preserve the mathematical properties but, unlike mathematical relations, are not abstract. They represent in the database sets of facts about real world entities identified during conceptual modeling:
- Tuples of base relations represent axioms about entities (facts assumed to be true);
- Tuples of RA-derived relations represent theorems (i.e., logical conclusions inferred from the axioms);
- A DBMS and database constitute a logical inference (i.e., deduction) engine that derives theorems from axioms.
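The axiom/theorem reading of the three points above can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the relation name `COMPENSATIONS`, the predicate "Employee EMP# earns salary SALARY", the 35000 threshold, and the helpers `restrict` and `project`.

```python
# Base relation: each tuple is taken as an axiom under the assumed
# predicate "Employee EMP# earns salary SALARY".
COMPENSATIONS = {(100, 26150), (110, 38170), (120, 37950), (130, 33800)}


def restrict(relation, condition):
    """Restriction: keep the tuples for which the condition holds."""
    return {t for t in relation if condition(t)}


def project(relation, *positions):
    """Projection: keep the attributes at the given positions."""
    return {tuple(t[i] for i in positions) for t in relation}


# Derived relation: each tuple is a theorem that follows from the axioms,
# read as "Employee EMP# earns more than 35000" (hypothetical threshold).
high_earners = project(restrict(COMPENSATIONS, lambda t: t[1] > 35000), 0)
```

The derived tuples assert nothing that is not already implied by the base tuples, which is the sense in which the RA operations perform deduction.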
The data must be consistent with the conceptual model of reality intended by the modeler, which means that neither (1) the data in, nor (2) the RA operations applicable to, database relations can be arbitrary -- both are constrained by conceptual modeling and the mathematics of SST. If A and B were database relations representing facts about employee compensations and project assignments:
COMPENSATIONS {EMP#,SALARY,...}
ASSIGNMENTS {...,EMP#,START_DATE,...}
the result of the above Cartesian product (combining each salary with every start date) wouldn't have a "sensible meaning", as a reader put it (i.e., the operation would not correspond to a meaningful query). As another commented, "Most of the real work in any query is planning out what you are asking, how you are asking it, and the meanings assigned." This is another way of saying that users must understand the semantics (meaning) of the data, as specified in the conceptual model by the modeler/database designer (who must model in accordance with user perceptions of the world), in order to query the database meaningfully.
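To make the contrast concrete, here is a hedged Python sketch that sets the salary-by-start-date product beside a join on EMP#. The join is my own choice of an operation that plausibly does correspond to a meaningful query ("which salary and start date belong to the same employee"); it is an assumption for illustration, not something taken from the text. The function names are hypothetical, and only three tuples and the named attributes are modeled.

```python
# (EMP#, SALARY) and (EMP#, START_DATE); elided attributes are omitted.
COMPENSATIONS = {(100, 26150), (110, 38170), (120, 37950)}
ASSIGNMENTS = {(100, "06-19-1980"), (110, "05-16-1958"), (120, "12-05-1963")}


def salary_date_product(comp, assign):
    """Pair every compensation tuple with every start date, ignoring EMP#."""
    dates = {(d,) for _, d in assign}
    return {c + d for c in comp for d in dates}


def join_on_emp(comp, assign):
    """Pair salary and start date only where the EMP# values match."""
    return {(e, s, d) for e, s in comp for e2, d in assign if e == e2}


crossed = salary_date_product(COMPENSATIONS, ASSIGNMENTS)
joined = join_on_emp(COMPENSATIONS, ASSIGNMENTS)
```

The product pairs employee 100's salary with the start dates of employees 110 and 120, a combination that represents no fact about any entity, while the join keeps only the tuples in which salary and start date describe the same employee.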
While it may be clear in this simple example that the operation makes no sense, this is often not the case in practice, as we shall demonstrate in Part 2.
References
[1] Software Wasteland: How the Application-Centric Mindset is Hobbling our Enterprises.
[2] Pascal, F., Conceptual Modeling Is Not Data Modeling.
[3] Pascal, F., The Conceptual-Logical Conflation and the Logical-Physical Confusion.
[4] Pascal, F., What Relations Really Are and Why They Are Important.
[5] Pascal, F., What Meaning Means: Business Rules, Predicates, Integrity Constraints and Database Consistency.
[6] Pascal, F., Levels of Representation: Conceptual Modeling, Logical Design and Physical Implementation.