Re-write 10/16/18
“E/RM is a data model -- So says Date, Chen, etc. So says the majority of
current industry experts ... With very strong references to Codd (who he
worked with), Date elegantly explains the differences between RM and
E/RM -- but clearly believes both are data models (even allowing for the
charitable comment). If we take a RDB as the ultimate target
implementation of data, and an E/RM (or extended) can correctly design
all the artifacts that are implemented, this means it is modeling the
data. Granted, an E/RM does not explicitly model some of the
non-structural aspects of the original Codd definition.”
“Out of interest, is there a common Relational Modeling tool, that is not also
an E/RM tool and models the full Codd definition? There are also several
other methods of modeling data -- E/RM is more a mechanism to represent
the data. If E/RMs are used by IT professionals across the world to
direct the design and build of the majority of applications guided by
standard methodologies, is the view of this argument that these were all
build wrongly? Regardless of success? Is the inferred conclusion that
only the RM models data, and ERM, [or] any other techniques do not? [If
so] that is a little limiting.”
Objects, Properties, and Ontological Commitment
We are culturally and linguistically conditioned to conceptualize the
world as objects with properties. Objects in a universe that share
common properties are of the same type and form a class, which
distinguishes them from objects that lack those properties. Applying a
class definition to the universe selects from it the group of objects
of that type.
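As a minimal sketch of this selection, assume a class definition is simply a set of defining properties and that an object belongs to the class when it has all of them. The names `Thing`, `select_class`, and the sample properties are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thing:
    """An object in the general (not OO) sense: a name plus its properties."""
    name: str
    properties: frozenset


def select_class(universe, defining_properties):
    """Return the objects in the universe that have every defining property."""
    required = frozenset(defining_properties)
    return {obj for obj in universe if required <= obj.properties}


# Hypothetical universe of three objects.
universe = {
    Thing("Alice", frozenset({"has_name", "has_salary"})),
    Thing("Bob", frozenset({"has_name", "has_salary"})),
    Thing("Widget", frozenset({"has_name", "has_weight"})),
}

# Applying the class definition selects the objects of that type.
employees = select_class(universe, {"has_name", "has_salary"})
names = sorted(obj.name for obj in employees)
```

Here the class definition acts as a membership test over the universe: objects that lack any defining property are simply not selected.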
Philosophical ontology is the study
of being, existence, reality, as well as the basic categories of being,
and their relationships -- what entities exist or may be said to exist,
and how they may be grouped, related, and subdivided according to
similarities and differences.
Note:
'Object' is used in the general, not OO sense. Ontology, as used
herein, should not be confused with "computer science ontology", whereby
the term ontology was usurped, and is understood by programmers as
meaning a conceptual graph of directed semantic relationships among
objects (and only sometimes among object types).
Conceptual modeling (1) identifies types of objects of interest, and (2) formulates business rules (BR) that specify their properties and relationships and, as such, makes an ontological commitment.
Any approach to conceptual modeling must consider the ontological
commitment upon which it is based, which has major implications for the data model used to formalize conceptual models as logical models for computable database representation -- it must be consistent with that commitment.
Unfortunately, due to lack of foundation knowledge in the industry[1],
practitioners -- both vendors and users -- are largely unaware of
ontological underpinnings and oblivious to their implications for
database technology and practice, one reason why both have not only
stagnated, but regressed in the last five decades. In this multipart
series we explain the important distinction between conceptual modeling
and data modeling (aka logical database design), which requires a formal
data model. The E/RM is not a data model: it can be used for conceptual
modeling, but what it models is reality, not data. We also outline a new
conceptual modeling approach that makes a different ontological
commitment and requires adjustments to the RDM, both necessary for
genuine progress.
1. Database Truth of the Week
"Logical (aka. syntactic) validity: Every logical inference (application of any number of rules of inference) from syntactically valid and well-formed premises yields well-formed and syntactically valid consequents (i.e., the rules of inference preserve syntactic validity and well-formedness.) This property is independent of any interpretations of the symbols. A query result is said to be logically valid iff it is derived by any sequence of RA operations on one or more relations.
"Semantic correctness: Every interpretation of the symbols (meaning and truth value assignment) that makes the axioms true, makes the theorems true. When we extend a logical model with semantics (specific to the subject matter and its "business" rules) via constraints, those constraints become axioms that must be true. A query result is said to be semantically correct iff for every assignment of meaning to relations that are the RA operands under which their tuples represent true facts, the tuples of the result relations also represent true facts.
If relations are not in 5NF, query results aren’t guaranteed to be semantically correct. Any update anomaly can appear in a result -- a join might deliver extra tuples that are anomalous. A database design that permits update anomalies does not preserve semantic correctness!" --David McGoveran
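One way to see how a join can deliver an extra tuple is the familiar textbook supplier-part-project illustration of a join dependency. The sketch below is not McGoveran's own example. It assumes a ternary relation that is supposed to equal the join of its three binary projections, and the sample tuples and function names are hypothetical.

```python
def project(relation, indexes):
    """Project a set of tuples onto the given attribute positions."""
    return {tuple(t[i] for i in indexes) for t in relation}


def join_three(sp, pj, js):
    """Natural join of the three binary projections back to (s, p, j)."""
    return {
        (s, p, j)
        for (s, p) in sp
        for (p2, j) in pj
        if p == p2 and (j, s) in js
    }


def rejoin(spj):
    """Join the SP, PJ and JS projections of a (supplier, part, project) relation."""
    return join_three(
        project(spj, (0, 1)),  # (supplier, part)
        project(spj, (1, 2)),  # (part, project)
        project(spj, (2, 0)),  # (project, supplier)
    )


# Hypothetical relation that satisfies the join dependency.
SPJ = {
    ("s1", "p1", "j2"),
    ("s1", "p2", "j1"),
    ("s2", "p1", "j1"),
    ("s1", "p1", "j1"),
}

# Deleting one tuple from the single, unnormalized relation is accepted
# by a design that does not enforce the dependency.
after_delete = SPJ - {("s1", "p1", "j1")}

# Tuples produced by the join that the stored relation does not contain.
extra = rejoin(after_delete) - after_delete
```

With the full relation the join reproduces it exactly. After the deletion, the join returns the deleted tuple again, so the stored relation and the query result disagree about which facts are true.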
2. What's Wrong With This Database Picture?
"If we take a RDB as the ultimate target implementation of data, and an ERM (or extended) can correctly design all the artifacts that are implemented, this means it is modelling the data. Granted, an ERM does not explicitly model some of the non-structural aspects of the original Codd definition.
"ERM is a data model -- So says Date, Chen, etc. So says the majority of current industry experts. Refer to Date 6th edition p347. With very strong references to Codd (who he worked with), Date elegantly explains the differences between RM and ERM – but clearly believes both are data models (even allowing for the charitable comment).
Out of interest, is there a common Relational Modelling tool, that is not also an ERM tool and models the full Codd definition? There are also several other methods of modeling data -- ERM is more a mechanism to represent the data. If ERMs are used by IT professionals across the world to direct the design and build of the majority of applications guided by standard methodologies, is the view of this argument that these were all build wrongly? Regardless of success? Is the inferred conclusion that only the RM models data, and ERM, [or] any other techniques do not? [If so] that is a little limiting." --LinkedIn.com
One of the clearest indications of poor foundation knowledge in data
management practice is misuse and abuse of terminology. Many data
professionals are inducted into the industry without a formal education,
via programming and software tools, and use terms indiscriminately, as
jargon, without understanding them. This has produced weak DBMS
implementations and poorly designed databases that put the correctness
of databased analytics at risk.
Revised 10/31/17.
Note: Posts starting with this one will be consistent with the TERMINOLOGY page. Fundamental terms -- the grasp of which is necessary for data management practice -- will be in bold. When you encounter one you don't understand, find out what it means; chances are it's being misused or abused. Once the page is finalized, labels and, time permitting, old posts may also be revised accordingly.
Reference [9] is an important rewrite and a recommended prerequisite
that you should read before this post.
Here's what's wrong with the picture of three weeks ago, namely:
"The term database design can be used to describe many different parts of the design of an overall database system. Principally, and most correctly, it can be thought of as the logical design of the base data structures used to store the data. In the relational model these are the tables and views. In an object database the entities and relationships map directly to object classes and named relationships. However, the term database design could also be used to apply to the overall process of designing, not just the base data structures, but also the forms and queries used as part of the overall database application within the database management system(DBMS).
The process of doing database design generally consists of a number of steps which will be carried out by the database designer. Usually, the designer must:
- Determine the data to be stored in the database.
- Determine the relationships between the different data elements.
- Superimpose a logical structure upon the data on the basis of these relationships.
Within the relational model the final step above can generally be broken down into two further steps, that of determining the grouping of information within the system, generally determining what are the basic objects about which information is being stored, and then determining the relationships between these groups of information, or objects." --Halil Lacevic, What is a Relational Database?, Quora.com
Many problems in database practice are due to failure to grasp what a data model is and the important distinctions between DBMS functions on the one hand and application functions on the other.
The three design steps above are vague and somewhat confused, and they obscure more than they enlighten. They do not reflect the fact that database design is the formalization of a conceptual model of reality as relations constrained to be consistent with the business rules the model consists of.
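To make "relations constrained to be consistent with the business rules" concrete, here is a minimal sketch that assumes a relation is a set of tuples and each business rule is a named predicate every tuple must satisfy. The employee relation, the two rules, and the function names are hypothetical; a real DBMS would declare and enforce such constraints itself rather than leave them to application code.

```python
# Hypothetical relation with attributes (employee_id, department, salary).
EMPLOYEES = {
    (1, "Sales", 50000),
    (2, "R&D", 65000),
}

DEPARTMENTS = {"Sales", "R&D"}

# Hypothetical business rules, formalized as constraints on tuples.
RULES = {
    "salary is positive": lambda t: t[2] > 0,
    "department exists": lambda t: t[1] in DEPARTMENTS,
}


def violations(relation, rules):
    """List (rule name, tuple) pairs for every tuple that breaks a rule."""
    return [
        (name, t)
        for t in sorted(relation)
        for name, rule in rules.items()
        if not rule(t)
    ]


def insert(relation, new_tuple, rules):
    """Return the relation with new_tuple added, or raise if a rule is broken."""
    candidate = relation | {new_tuple}
    broken = violations(candidate, rules)
    if broken:
        raise ValueError(f"constraint violation: {broken}")
    return candidate


updated = insert(EMPLOYEES, (3, "Sales", 42000), RULES)
```

An insert that names a nonexistent department, such as `insert(EMPLOYEES, (4, "Legal", 42000), RULES)`, raises `ValueError` instead of producing a relation that is inconsistent with the rules.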
1. Database Truth of the Week
"The original normal form and the later First Normal Form (1) are distinct. In the early 1969 RDM there was only "the normal form" of relations [a term Codd borrowed from FOPL]. It was based on the initial version of the join operation, which was different than today's join. Had 1NF and further normalization to at least 2NF been introduced then, the normal form would have made no sense, as there would then have been multiple normal forms, which make sense only with the post-1970 join definition currently in use. Thus, there is no way to answer "what is the difference between the original normal form and 1NF?" without taking into account the definition of join, and -- if defined as we now do -- no way to understand the original normal form, except to say that in the context of the original join definition it would correspond to today's Fifth Normal Form (5NF). This is why a relation is really in 5NF by definition, not in 1NF as per current understanding." --David McGoveran
1. Database Truth of the Week
“A DBMS using the RDM for all its functionality would be very limited. The RDM only requires that the declarative data sub-language employed by users for data manipulation has power not more expressive than first order predicate logic (FOPL), which implies acceptance of certain limitations on what users can do directly in the language, in return for:
- Language declarativity and decidability;
- Semantic correctness and system-guaranteed logical validity;
- Physical and logical independence;
- Simplicity.”
--David McGoveran