Wednesday, February 26, 2014

Anatomy of a Data Management Project: Distribution Independence



The term "distributed" is thrown around a lot these days. Hype notwithstanding, just as with analytics and data science, distribution in data management is nothing new. In fact, SQL vendors (IBM, Sybase, Ingres, Oracle) -- frequently criticized today for non-scalability -- tackled distribution decades ago. The non-relational systems preceding SQL were not amenable to it, and SQL is the closest to the relational model the industry allows you to get.

Sunday, February 23, 2014

Thinking Logically: SQL, NoSQL and the Relational Model




I very much doubt that somebody who does not think logically can be a fully competent database professional. From a LinkedIn exchange, an attempt to "address some of my points by calling them out":
JL: You said: "... much of the underlying motivation of NoSQL stuff is anti-relational ..." Okay, so? Data management in graph/document/columnar dbs is not possible because they are "anti-relational"?
The point I made was that, name notwithstanding, NoSQL vendors/proponents are not just anti-SQL, they are actually anti-relational, an important difference.
  • What exactly does this have to do with whether graph/documents/columnar database systems "are possible or not"?
  • Columnar DBMS's can be relational and are not considered NoSQL products.
  • At issue is not the possibility of graph/document systems, but what they are appropriate for.

Sunday, February 16, 2014

Weekly Update




1. Quote of the Week
A view is a logical table based on one or more tables or another view. View can be thought as a virtual table which takes the output of a query and stores it ... A View can be based on a table or another view. --dwhlaureate.blogspot.in

2. To Laugh or Cry?
Will physical modelling and normalization make sense in next decade?

3. Online
Database design patterns

4. Elsewhere

I was reading the following
How Edward Snowden went from loyal NSA contractor to whistleblower
which is interesting in itself, when I came across this:
In mid-2006, Snowden landed a job in IT at the CIA. He was rapidly learning that his exceptional IT skills opened all kinds of interesting government doors. "First off, the degree thing is crap, at least domestically. If you 'really' have 10 years of solid, provable IT experience… you CAN get a very well-paying IT job," he wrote online in July 2006.
Should be familiar to my readers.


5. And now for something completely different  
Who says Congress is not representative?

Martin Kramer is, IMO, possibly the most astute analyst of the Middle East:





Monday, February 10, 2014

"Denormalization for Performance": Don't Blame the Relational Model



 REVISED: 10/18/16

Many common misconceptions are excellent indicators of poor grasp, if any, of the relational data model (RDM). One of the most entrenched is the notion of "denormalization for performance".

I will not get into the first four of the 5 Claims About SQL, Explained. I do not disagree with the facts, except to point out that the problems are not due to the relational nature of SQL and its implementations. Quite the opposite: they are due to their poor relational fidelity. I will focus on the fifth, "Should everything be normalized?"

Sunday, February 2, 2014

Weekly Update (Revised 2/3)




1. Quote of the Week
Observe the trend of NoSQL growth, revenue will trail along irrespective. In the industry, only with respect to non-OLTP applications, RDBMS is in "keep the lights on" state by necessity, it is either awaiting obsolescence/end of life, or replacement with NoSQL solution; no longer a "workhorse" - this was the discussion point. --LinkedIn.com

2. To Laugh or Cry?
Why semantic models like RDFOWL, TMDM, are not sufficient for the web of linked data

3. Online
Why don’t RDBMS products support sub-typing?

4. Elsewhere
Cisco unveils 'fog computing' to bridge clouds and the Internet of Things
Could not have thought of a better name!


5. And now for something completely different 

Friday, January 24, 2014

Causality, Uncertainty & Actionability in Analytics



In analytics, it's much easier to "fish" for correlations and try to explain them post-hoc than to develop mutually exclusive hypotheses up-front and test empirically which one holds. Only the second approach is scientific, though, hence my skepticism about the hype of business analytics as "data science."

Thursday, January 16, 2014

Weekly Update




1.  Quote of the Week
I do not understand your first point. Even a database designed with only 1NF can have integrity if other methods are used to ensure that integrity. [Higher n]ormal forms can guarantee the absence of various integrity issues, but the lack of a normal form does not guarantee the presence of the integrity issue. I remember writing pages of code to do just that in the early days of RDBMS products before normal forms (i.e., referential integrity) was strictly enforced by the DBMS. --LinkedIn.com
Note: My (first) point was that the minimal relational mandate is 1NF, but that full normalization (5NF) is desirable for practical reasons.
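One way to read "practical reasons" is sketched below; it is an illustration under stated assumptions, not something taken from the exchange. The table and column names (shipment_flat, supplier, shipment) are hypothetical, and SQLite stands in for any SQL DBMS. A table that is in 1NF but repeats a supplier's city in every shipment row accepts an update that leaves the data contradicting itself, unless extra constraints or application code prevent it. The fully normalized design records the city once, so the same update cannot produce the contradiction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical 1NF table that is not fully normalized:
# city depends on supplier alone, yet is repeated for every part shipped.
cur.execute(
    "CREATE TABLE shipment_flat ("
    " supplier TEXT, city TEXT, part TEXT,"
    " PRIMARY KEY (supplier, part))"
)
cur.executemany(
    "INSERT INTO shipment_flat VALUES (?, ?, ?)",
    [("S1", "London", "P1"), ("S1", "London", "P2")],
)

# The DBMS accepts an update of one row only; S1 now has two cities.
cur.execute(
    "UPDATE shipment_flat SET city = 'Paris'"
    " WHERE supplier = 'S1' AND part = 'P1'"
)
flat_cities = sorted(
    row[0]
    for row in cur.execute(
        "SELECT DISTINCT city FROM shipment_flat WHERE supplier = 'S1'"
    )
)

# Normalized design: the city is recorded once per supplier.
cur.execute("CREATE TABLE supplier (supplier TEXT PRIMARY KEY, city TEXT)")
cur.execute(
    "CREATE TABLE shipment ("
    " supplier TEXT REFERENCES supplier (supplier), part TEXT,"
    " PRIMARY KEY (supplier, part))"
)
cur.execute("INSERT INTO supplier VALUES ('S1', 'London')")
cur.executemany(
    "INSERT INTO shipment VALUES (?, ?)", [("S1", "P1"), ("S1", "P2")]
)

# The same change is a single-row update; no shipment can disagree with it.
cur.execute("UPDATE supplier SET city = 'Paris' WHERE supplier = 'S1'")
norm_cities = sorted(
    row[0]
    for row in cur.execute(
        "SELECT DISTINCT s.city FROM shipment sh"
        " JOIN supplier s ON s.supplier = sh.supplier"
        " WHERE sh.supplier = 'S1'"
    )
)

print("flat:", flat_cities)
print("normalized:", norm_cities)
conn.close()
```

The flat design can be kept consistent, as the quote says, but only by writing and maintaining the additional integrity logic by hand.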


2. To Laugh or Cry?
What is difference between storing data in traditional and modern way in database?

3. Online
What is Surrogate Key, why it is used, is it Primary Key?
Apropos my just published paper on keys.


4. Data management is important, after all.
Point-of-sale malware infecting Target found hiding in plain sight
(Note the last sentence in the article).

So is database design
Poor Data Management Blinded Chase to Madoff Fraud

5. And now for something completely different
Looks like a pattern.


