My June post @All Analytics:
The data management industry operates like the fashion industry. Its most persistent characteristic is migration from fad to fad. Every few years -- the number keeps getting smaller -- some "new" problem is discovered, for which the solution is so magical that it is extended to everything, everywhere, whether it is applicable or not. But many of these problems are old and fundamental, and some of the “solutions” bring them back rather than solve them. ...
Read it all. (Please comment there, not here)
Thursday, July 7, 2016
Saturday, July 2, 2016
This Week
1. What's wrong with this picture?
AT: Well, I think I am a bit confused now. In my personal understanding, a relation is defined as a set of tuples. Then ... "in the relational model every relation represents a relationship". And then a quote from Chen: "each tuple of entities ... is a relationship". If I use the first and the second statements - I can say that a relationship is a set of tuples. The third statement says that a relationship is a tuple. So far, is a relationship a set of an element of a set? (Or may be a set of sets?)
GE: I argue that there is essentially no difference between relationships between entity (type tables) and between an entity and its attributes. They both represent relationships between two populations of things. Something is an attribute by virtue of there being a relationship. If relationships are represented by foreign keys and the entity tables must be in 1NF, as in the relational model, then all relationships must be at most Many-to-One (a very unnecessary limitation when modeling some user domain).
TF: The relational model was a mathematical construct, derived from set theory. Hence that particular terminology. The entity-relationship model is essentially a directed graph model, where relationships are prominent residents. Not so in the relational model (despite the name), where relationships (between relations, mind you) are not visible and in the SQL implementations is reduced to constraints. Relationships are about structure, which is as important as meaning (the semantics of the terms used in the universe being modeled).
2. Quote of the Week
"In Relational Theory sometimes the relationships, where we do our Joins are much more important than the attributes on an Entity." (quoted in LinkedIn.com exchange)
Sunday, June 19, 2016
This Week
1. What's wrong with this picture?
This week's picture is the one Martijn Evers painted of the state of knowledge about keys in Kinds of Keys: On the Nature of Key Classifications, which I have already commented on. As a result of discussions I've been having with David McGoveran in the context of our forthcoming books (his LOGIC FOR SERIOUS DATABASE FOLKS, my DBDEBUNK GUIDE TO FUNDAMENTAL DATA MANAGEMENT MISCONCEPTIONS), I've decided to rewrite my comments, On Kinds of Keys. I refer the reader to Martijn's article for a refresher; my rewrite will be posted next week.
2. Quote of the Week
There are no rules of normalization for non-relational databases. Effectively, you start out by denormalizing everything. Which means you're designing the data organization to serve specific queries. So follow the same principle in NoSQL databases as you would for denormalizing a relational database: design your queries first, then the structure of the database is derived from the queries. --Bill Karwin, What is a good way to design a NoSQL database
Sunday, June 12, 2016
Levels of Representation: Conceptual Modeling, Logical Design and Physical Implementation
From last week:
What's wrong with this picture? (Kinds of Data Models, LinkedIn.com)
David Hay: "Part of the ... confusion as to what exactly was meant by “data modeling”--conceptual, logical or physical--is that most data modeling activities seem to focus on achieving good relational database designs ... my approach is the portrayal of the underlying structure of an enterprise’s data--without regard for any technology that might be used to manage it ... a “conceptual data model” ... that represents the business."
Nothing raises uncertainty whether to laugh or cry better than attempts to dispel confusion which suffer from the very confusion they purport to dispel.
Sunday, June 5, 2016
This Week
1. What's wrong with this picture?
David Hay: Part of the ... confusion as to what exactly was meant by “data modeling”--conceptual, logical or physical--is that most data modeling activities seem to focus on achieving good relational database designs ... my approach is the portrayal of the underlying structure of an enterprise’s data--without regard for any technology that might be used to manage it ... a “conceptual data model” ... that represents the business.
Nigel Higgs: ... many folks do not get the difference between the Barker entity relationship style of modeling and the relational style of modeling ... [because] the modeling conventions are very similar and the former [is always] a precursor to RDBMS design.
Clifford Heath: Any terminology for models must project three aspects of intention: (a) audience, (b) level of detail and (c) purpose. These three variables are sufficient to discriminate all the main kinds of models in use. The traditional terms of "conceptual/logical/physical" are manifestly inadequate.
Remy Fannader: Models are meant to describe sets of instances (objects or behaviors).
--Kinds of Data Models, LinkedIn.com
2. Quote of the Week
The first consideration that needs to be made when selecting a database is the characteristics of the data you are looking to leverage. If the data has a simple tabular structure, like an accounting spreadsheet, then the relational model could be adequate. Data such as geo-spatial, engineering parts, or molecular modeling, on the other hand, tends to be very complex. It may have multiple levels of nesting and the complete data model can be complicated. Such data has, in the past, been modeled into relational tables, but has not fit into that two-dimensional row-column structure naturally. --Jnan Dash, RDBMS vs. NoSQL: How do you pick?
Sunday, May 29, 2016
Wednesday, May 25, 2016
Why Data Scientists Must Understand Normalization
My May post @All Analytics:
We are constantly told that data scientists must be “jacks of many skills”, but one of the most important skills is rarely included in the list. Very few databases are properly designed. Many SQL databases are denormalized inadvertently, or intentionally (and erroneously) "for performance". They require special constraints to control data redundancy and prevent inconsistencies, which are practically never enforced. Analysts cannot, therefore, take database consistency for granted. Furthermore, to issue sensible queries and ensure correct results and interpretation thereof, analysts must know not only the types of fact represented in the database, but also whether and how the database designer has chosen to bundle -- nest or merge -- those facts, and how to disentangle them for analysis.
Read it all. (Please comment there, not here)
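To make the point about uncontrolled redundancy concrete, here is a minimal sketch of the kind of check an analyst might run before trusting a denormalized table. It is not taken from the post: the table, its column order and the helper `fd_violations` are hypothetical, chosen only to illustrate a functional dependency that no constraint enforces.

```python
from collections import defaultdict

# Hypothetical denormalized rows: (order_id, customer_id, customer_city).
# customer_city depends on customer_id alone, so it is repeated on every
# order of the same customer and can drift out of sync.
ORDERS = [
    (1, "C1", "Boston"),
    (2, "C1", "Boston"),
    (3, "C2", "Denver"),
    (4, "C1", "Austin"),  # contradicts the earlier C1 rows
]


def fd_violations(rows, determinant, dependent):
    """Return {determinant value: sorted dependent values} for every
    determinant value that maps to more than one dependent value, i.e.
    wherever the dependency determinant -> dependent fails to hold."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[determinant]].add(row[dependent])
    return {key: sorted(vals) for key, vals in seen.items() if len(vals) > 1}


# Columns are addressed by position: 1 = customer_id, 2 = customer_city.
print(fd_violations(ORDERS, determinant=1, dependent=2))
# → {'C1': ['Austin', 'Boston']}
```

In a fully normalized design the city would be recorded once per customer, so this contradiction could not arise; in the denormalized table it has to be ruled out by an explicit constraint or detected after the fact, as above.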