Tuesday, April 26, 2016

Re-write



My April column @All Analytics.

A R-table with attributes defined only on simple domains takes a less convoluted form -- a normal form -- devoid of nesting. If R-tables are in the preferred normal form i.e., components meaningful to applications (here, employee attributes) are simple domains in their own right and a true RDBMS enforces value atomicity -- first order logic is sufficient. This imposes some limitations on the expressive power of data languages, but they are declarative and PDI and simplicity are preserved. A true RDBMS enforces atomicity via a data language that does not allow applications to access attribute components not explicitly defined on their own domains.

Read it all. (Please comment there, not here)

FYI: I have revised all three parts of the series on 1NF -- mainly refinements and clarifications.









Tuesday, April 19, 2016

First Normal Form in Theory and Practice Part 3



09/19/23: For the latest on this subject see: FIRST NORMAL FORM - A DEFINITIVE GUIDE


Note: This is a 11/23/17 revision of Part 3 of a three-part series that replaced all of my previous posts on the subject (pages of which redirect here), in order to further tighten integration with the formalization and interpretation [1] of McGoveran's formalization and interpretation [1] of Codd's true RDM.

(Continued from Part 2)

 
"Is this table in 1NF?" is a common question in database practice. On the other hand, "What problems are solved by splitting street addresses into individual columns?", or  
What's the best way to store an array in a relational database does not seem to evoke associations with 1NF. This reveals poor foundation knowledge.


Part 1 introduced the poor understanding of 1NF and Part 2 provided a correct definition and explanation. Part 3 explains how 1NF can be enforced by the data sublanguage, which SQL does not.

Sunday, April 17, 2016

This Week



1. What's wrong with this picture?

NoSQL database management systems give us the opportunity to store our data according to more than one data storage model, but our entity-relationship data modeling notations are stuck in SQL land. Is there any need to model schema-less databases, and is it even possible? --Theodore Hills, The Hybrid Data Model, Dataversity.net

Sunday, April 10, 2016

First Normal Form in Theory and Practice Part 2



09/19/23: For the latest on this subject see: FIRST NORMAL FORM - A DEFINITIVE GUIDE


Note: This is a 11/23/17 revision of Part 2 of a three-part series that replaced all of my previous posts on the subject (pages of which redirect here), in order to further tighten integration with the McGoveran formalization and interpretation [1] of Codd's true RDM.
(Cont'd from Part 1)
 

Part 1 raised the issue of poor understanding of Codd's concept of a simple domain with atomic values underlying 1NF. In Part 2 I clarify Codd's definition of 1NF and its correct interpretation.

Friday, March 25, 2016

Not Worth Repeating: Duplicates



My March post @All Analytics.

Frequent hits @dbdebunk.com are driven by the question “Are keys mandatory?” Puzzlingly, many data professionals do not seem to understand why duplicates should be prohibited. This should worry analysts. But  “Stating the same fact more than once, does not make it truer, only redundant,” as E. F. Codd used to say. The absence of an identifier means that individual entities are not meaningful, so this representation contradicts the real world. Contradictions produce problems. First, a DBMS is incapable of “visually” discerning a data entry duplication error from "valid" duplicates, which means high risk of inconsistent databases and wrong counts and other query results.

Read it all. (Please comment there, not here)







 






Sunday, March 20, 2016

This Week



1. QUOTE OF THE WEEK
Q: I am trying to remove duplicate rows from a database containing 4446 tables and when finished, merge data from that database to an existing database with the same structure. Is there an easy way to remove duplicate rows from all those 4446 tables?

A1: Please tell us "what is a 4446 table?" And, please give examples of duplicate records.

A2: Recreate your database with unique primary keys. 2) Copy records from original database to recreated database, one table at a time. The duplicate records will drop on the floor. 3) Fix the thousands of bugs in your application code that created the duplicate rows in the first place.

A3: I'm creating temporary tables + I add unique index to that table and insert data into and then I rename it to the original one.

--stackoverflow.com

View My Stats