Sunday, April 17, 2016

This Week



1. What's wrong with this picture?

NoSQL database management systems give us the opportunity to store our data according to more than one data storage model, but our entity-relationship data modeling notations are stuck in SQL land. Is there any need to model schema-less databases, and is it even possible? --Theodore Hills, The Hybrid Data Model, Dataversity.net

Sunday, April 10, 2016

First Normal Form in Theory and Practice Part 2



09/19/23: For the latest on this subject see: FIRST NORMAL FORM - A DEFINITIVE GUIDE


Note: This is a 11/23/17 revision of Part 2 of a three-part series that replaced all of my previous posts on the subject (pages of which redirect here), in order to further tighten integration with the McGoveran formalization and interpretation [1] of Codd's true RDM.
(Cont'd from Part 1)
 

Part 1 raised the issue of poor understanding of Codd's concept of a simple domain with atomic values underlying 1NF. In Part 2 I clarify Codd's definition of 1NF and its correct interpretation.

Friday, March 25, 2016

Not Worth Repeating: Duplicates



My March post @All Analytics.

Frequent hits @dbdebunk.com are driven by the question “Are keys mandatory?” Puzzlingly, many data professionals do not seem to understand why duplicates should be prohibited. This should worry analysts. But  “Stating the same fact more than once, does not make it truer, only redundant,” as E. F. Codd used to say. The absence of an identifier means that individual entities are not meaningful, so this representation contradicts the real world. Contradictions produce problems. First, a DBMS is incapable of “visually” discerning a data entry duplication error from "valid" duplicates, which means high risk of inconsistent databases and wrong counts and other query results.

Read it all. (Please comment there, not here)







 






Sunday, March 20, 2016

This Week



1. QUOTE OF THE WEEK
Q: I am trying to remove duplicate rows from a database containing 4446 tables and when finished, merge data from that database to an existing database with the same structure. Is there an easy way to remove duplicate rows from all those 4446 tables?

A1: Please tell us "what is a 4446 table?" And, please give examples of duplicate records.

A2: Recreate your database with unique primary keys. 2) Copy records from original database to recreated database, one table at a time. The duplicate records will drop on the floor. 3) Fix the thousands of bugs in your application code that created the duplicate rows in the first place.

A3: I'm creating temporary tables + I add unique index to that table and insert data into and then I rename it to the original one.

--stackoverflow.com

Wednesday, March 9, 2016

The Fourth V -- Veracity



My February post @All Analytics.

"A fact is represented in the database not because it is categorically true, or categorically false, but only because somebody has asserted it, or has removed that assertion. We rely on people (and only on people!) for whatever a database system treats as true or false. When we authorize someone (or some application) to update the database, we accept their veracity, judgment, personal integrity, intelligence, comprehension, etc. insofar as the database is to have any utility. Nothing more can (or should) be said about truth and falsity." -- David McGoveran 

Read it all. (Please comment there, not here)



 




Sunday, March 6, 2016

This Week



1. Quote of the Week
[With] a declarative language and you have no real control over the execution plan. Heck, Oracle 12c can use multiple different execution plans for the same query depending on bind parameters, statistics not matching reality, and the phase of the moon.

No declarative language will ever be a first-class programming citizen in my eyes. Too much magic. Too non-linear. Too hard to debug for those reasons. They are great for when you have to do really simple things but once you step off the well worn path, you are in the thicket.
--reddit.com

View My Stats