Showing posts with label Truth. Show all posts
Showing posts with label Truth. Show all posts

Monday, February 5, 2024

METALOGICAL PROPERTIES Part 2: Assertion Predicate



In Part 1 we introduced in the conceptual model (CM) the metalogical designation property. It represents—in the absence of known shared defining properties of an entity type, the designation by a group's definer that an entity identifier (aka assigned name) or property value is a member of the group. Such a group is not a group of entities, but a group of name and property values. In the logical model (LM), it is formalized as a designation predicate (DP) and defines a domain.

In Part 2, we introduce the metalogical assertion property. It represents the assertion by an authorized database user that a specific entity, represented by a tuple, either does or does not correspond to an actual entity in the real world.

Friday, November 5, 2021

OBG: Database Consistency and Physical Truth



Note: To demonstrate the correctness and stability due to a sound theoretical foundation relative to the industry's fad-driven "cookbook" practices, I am re-publishing as "Oldies But Goodies" material from the old DBDebunk.com (2000-06), so that you can judge for yourself how well my arguments hold up and whether the industry has progressed beyond the misconceptions those arguments were intended to dispel. I may slightly revise, break into parts, and/or add comments and/or references.

This is an email exchange with a reader responding to my third book.
(Originally posted on 06/21/2001)

“I'm presently reading your book PRACTICAL ISSUES IN DATABASE MANAGEMENT and there are a couple of points that I find a little confusing. I'll start first by saying that I have no formal database oriented education, and I'm attempting to familiarize myself with some of the underlying theories and practices, so that I can further my personal education and career prospects (but aren't we all!). My questions may sound a little bit ignorant, but that would be because I am! (Please note ignorant, not stupid!) I'll quote you directly from the book for this (possibly I'm taking you out of context or missing something important)

Chapter 3, A Matter of Identity: Keys, pg. 75: "Databases represent assertions of fact - propositions - about entities of interest in the real world. The representation must be correct - only true propositions (facts) must be represented."

Now, correct me if I'm wrong with a basic assumption here, but isn't a database simply a model of a "real world" data collection? I would've thought that the intention of a database would be to model real life effectively (and accurately) enough to provide useful data for interpretation. Now obviously this is not an easy process with complex data types, but would it even be possible to have a 100% true proposition with only atomic data types? (i.e. can a simplified model contain only facts?) In my understanding of modeling, any model that fits real life closely enough to be a good statistical representation is a usable model. e.g. Newton's Laws are accurate enough when applied on a local scale, but we need to use Einstein's model of space-time across larger scales. Wouldn't recording only "facts" (which I would presume you mean to be statements that are provable in the objective sense i.e. no interpretation, only investigation or calculation) possibly eliminate the utility of some aspects of the database? Or do we account for the interpretative aspect in the metadata or in some other way?

Essentially, I can see what you're saying, but not necessarily how you've reached the conclusion. Admittedly in an ideal world we should be able to record only facts in a database, but this is not an ideal world. As an example, in surveys we see such questions as "Are you happy with this product?" followed by a rating system of 1-5, or 'completely unhappy to completely happy'. This is an artificial enforcement of a quantitative measure on a qualitative property. How do we account for the fact that this is interpreted data and not calculated or measured?

My questions may have little relevance to database theory in general, but the concept fascinates me!”

Saturday, January 19, 2019

Data and Meaning Part 4: Query and Result Correctness




As we have seen in Parts 1, 2, and 3, the RDM is a formal theory adapted and applied to database management: database relations (1) preserve the formal properties of mathematical relations, but also (2) have interpretations -- carry a real world meaning assigned by a conceptual model: facts about entities, entity groups, and multigroups (i.e., their properties, some of which are relationships, specified by business rules (BR)). A relation is formally in 5NF and constrained for semantic consistency (i.e., to represent facts about an entity group).
“When we create specific domains, relations, and attributes we are constraining (restricting) an abstract logical system to a specific interpretation (meaning). Seen the other way around, an interpretation of the logical system is a representation of a specific segment of the world, and that is exactly the purpose of database design. For example, an attribute name created by the designer is assigned meaning intended by the modeler as representing an entity property, which is the very meaning of semantics. That is why full normalization cannot be achieved or assessed without reference to some conceptual model -- what attribute names mean, and how they are related to each other (i.e., their dependencies), and so on.” --David McGoveran
Yet requesting and giving design advice without a conceptual model is routine in the industry[1]. What is more, most practitioners are oblivious to the implications for correctness of queries and results[2].

Saturday, November 10, 2018

Conceptual Modeling Is Not Data Modeling



“Ok, now that we have those two (Parts 3 and 4 of your series) 'on the table' so to speak, perhaps you would address these questions...
1. Would it be safe to say that facts expressed in a Conceptual model should be verifiable in reality?

2. Are the following facts logically equivalent or are they different:

a) The car with license number 62-JZK-6 has the color aquamarine blue
b) De auto met kenteken 62-JZK-6 heeft de kleur aquamarijnblauw

3. If a previously true fact is found in reality to be verifiably false, would that mean the Conceptual model is wrong or the Logical model, or reality?”

“I'm going to add another:

4. How does RDM handle temporal changes to the 'truth' of statement 2a) when:

a) The owner of the car paints it black.
b) The owner of the license plate legally transfers it to a truck.
c) The owner of the car replaces every single part except the chassis.”

John O'Gorman asked me these questions in a LinkedIn exchange[1] in response to my comments in another exchange on modeling[2], where I alerted to the confusion of levels of representation common in the industry, particularly conceptual-logical conflation(CLC)[3]: calling conceptual modeling data modeling both reflects and induces it.

Online exchanges are not a proper vehicle for learning, particularly foundation knowledge. Which is why I publish free blog posts, and papers and books, to which to refer interested serious data professionals. It just so happened that my just posted four-part series covers the subject at hand[4], so I referred to it, as well as other writings (the answers are already there if one cares to read them). I will not discuss the whole exchanges -- read them and judge for yourself -- but I promised to answer the questions here, where I can do them justice.

John raises primarily conceptual, not data model issues -- the latter are subservient to decisions in the former -- but then asks "how does RDM handle..." From experience, I recognize implicit doubts that the RDM can. As far as we know there is no formal data model[5] that is a superior alternative to the RDM with respect to "handling" conceptual issues (in fact, there is no other formal data model -- i.e., that satisfies Codd's definition -- period).


Since most of the issues involved are covered by McGoveran's work in progress[6] (in which my multi-part series is rooted), to ensure consistency with it I passed the questions by him. As he too pointed out, "Answers that work in all situations require highly complicated discussions and lots of time, and trying to teach someone without proper experience and educational background would be very cumbersome, or an oversimplication via online exchanges." 


Here's what's possible within the constraints of a blog post -- the serious reader is referred to our writings.

Tuesday, September 11, 2018

RE-WRITE



See: https://www.dbdebunk.com/2018/09/designation-property-and-assertion.html

Friday, September 1, 2017

Don't Confuse/Conflate Database Consistency with Truth



Disregard for foundation knowledge and failure to learn from past mistakes by even data professionals deemed experts inhibit progress in data management and bring back problems already resolved that should be of foremost concern to data analysts. Consider the following:

"Above all else, we count on databases to reflect the truth consistently, or at least to reflect the table data perfectly. The database cannot be blamed when an application (or the end users of an application) place inaccurate data in its tables, but a database must accurately report the data it holds. Therefore, bugs are not all created equal; there are bugs, and there are wrong-rows bugs, bugs that silently misrepresent the data that the tables hold. Even the craziest, most obscure corner case that potentially misrepresents your data should rightly bring a loud chorus: "The emperor has no clothes!" We depend on the database, above all, not to lie."

View My Stats