I am working on entirely new papers (not re-writes) in the PRACTICAL DATABASE FOUNDATIONS series. I have already published two:
- THE FIRST NORMAL FORM - A DEFINITIVE GUIDE
- PRIMARY KEYS - A NEW UNDERSTANDING
available for ordering from the PAPERS page, and two more:
- RELATIONAL DATABASE DOMAINS: A DEFINITIVE GUIDE
- DATABASE RELATIONS: A DEFINITIVE GUIDE
are in progress and forthcoming, respectively.
In the process I am coming across industry common and entrenched "pearls" that I am using for my "Setting Matters Straight" (SMS) and "To Laugh or Cry" (TLC) posts on Linkedin. I do those posts to enable the few thinking database professionals left realize how scarce foundation knowledge is, and to illustrate fallacies that abound in the industry, of which they are unaware, and which the papers are intended to dispel.
Time permitting, I may expose and dispell some of those fallacies, treated in more depth in the papers, such that those thinking professionals can test their knowledge and decide whether the papers are a worthy educational investment.
Here's one:
“There seams to be some confusion between what a Primary Key is, and what an Index is and how they are used. The Primary Key is a logical object. By that I mean that is simply defines a set of properties on one column or a set of columns to require that the columns which make up the primary key are unique and that none of them are null. Because they are unique and not null, these values (or value if your primary key is a single column) can then be used to identify a single row in the table every time. In most if not all database platforms the Primary Key will have an index created on it. An index on the other hand doesn’t define niqueness. An index is used to more quickly find rows in the table based on the values which are part of the index. When you create an index within the database, you are creating a physical object which is being saved to disk.”
Can you identify the fallacies before you proceed?
------------------------------------------------------------------------------------------------------------------
DBDebunk was maintained and kept free with the proceeds from my @AllAnalitics column. The site was discontinued in 2018. The content here is not available anywhere else, so if you deem it useful, particularly if you are a regular reader, please help upkeep it by purchasing publications, or donating. On-site seminars and consulting are available.Thank you.
HOW TO USE THIS SITE
- To work around Blogger limitations, the labels are mostly abbreviations or
acronyms of the terms listed on the SEARCH page.
For detailed instructions on how to understand and use the labels in
conjunction with that page, see the ABOUT page. The 2017 and 2016 posts, incl uding earlier
posts rewritten in 2017 were relabeled accordingly. As other older posts are
rewritten, they will also be relabeled. For all other older posts use Blogger
search.
- The links to my AllAnalytics columns no longer work. I re-published only the
2017 columns @dbdebunk, and within them links to sources external to
AllAnalytics may or may not work.
SOCIAL
MEDIA
I deleted my Facebook account. You can follow me @DBDdebunk on X.
------------------------------------------------------------------------------------------------------------------
- Yes, there is confusion.
- Yes, PKs are logical objects.
- Columns play no role in the RDM, attributes do. Columns are just visualizations of attributes when database relations are presented in tabular format.
- A PK is an attribute that represents in the database a property in context--either an assigned entity name, or a set of identifying entity properties--and is constrained to have unique values for consistency with that representation.
- PKs cannot have missing values; NULL is a SQL-specific representation of missing values that violates RDM.
- Like columns, rows and tables play no role in RDM, tuples and relations do. Rows are just visualizations of tuples when relations are presented as tables.
- Uniqueness is a property of logical PKs representing a relationship among all entities in a group.
- Because PKs are commonly used for logical access to data, they are natural candidates for indexing to maximize performance of physical access. This has created a dependency of PK logical uniqueness on a specific physical implementation that induces confusion of the two.
- Some products expose this physical feature in the data sublanguage, violating RDM's physical data independence--insulation of applications and users from implementation.
No comments:
Post a Comment