Friday, January 24, 2014

Causality, Uncertainty & Actionability in Analytics



In analytics, it's much easier to "fish" for correlations and try to explain them post-hoc than to develop mutually exclusive hypotheses up-front and test empirically which one holds.Only the second approach is scientific, though, hence my skepticism about the hype of business analytics as "data science." 

Thursday, January 16, 2014

Weekly Update




1.  Quote of the Week
I do not understand your first point. Even a database designed with only 1NF can have integrity if other methods are used to ensure that integrity. [Higher n]ormal forms can guarantee the absence of various integrity issues, but the lack of a normal form does not guarantee the presence of the integrity issue. I remember writing pages of code to do just that in the early days of RDBMS products before normal forms (i.e., referential integrity) was strictly enforced by the DBMS. --LinkedIn.com
Note: My (first) point was that the minimal relational mandate is 1NF, but that full normalization (5NF) is desirable for practical reasons.


2. To Laugh or Cry?
What is difference between storing data in traditional and modern way in database?

3. Online
What is Surrogate Key, why it is used, is it Primary Key?
Apropos my just published paper on keys.


4. Data management is important, after all.
Point-of-sale malware infecting Target found hiding in plain sight
(Note the last sentence in the article).

So is database design
Poor Data Management Blinded Chase to Madoff Fraud

5. And now for something completely different
Looks like a pattern.



Wednesday, January 15, 2014

New: Paper #4 Published



Pls see PAPERS page for the current version of this paper, when it becomes available.



Sunday, January 12, 2014

Data Fundamentals and Education




Conveying the theoretical foundation of database management to practitioners without losing either precision/rigor, or the audience is a non-trivial endeavor, the skill for, or interest in which very few possess. More often than not one or the other are lost.

In Relations and Relationships I referred to the following LinkedIn exchange:
AT: ... William Kent and his book "Data and Reality" [1978] ... is devoted to "attributes", and (in my words) William confesses that he can not distinct [sic] between "relationships" and "attributes". Thus, the later might be completely redundant. 
GE: This confusion of entities vs. attributes has been with us a long time. Several years ago [a student] ... discovered a paper ... that discussed this dilemma. He proposed calling the thing, which we could not determine whether it was an entity or an attribute, an "entribute." Beautiful, eh? That was in 1939.

Sunday, January 5, 2014

Weekly Update




1.  Quote of the Week
As you might have gathered, I think NoSQL is technology to be taken very seriously. If you have an application problem that maps well to a NoSQL data model - such as aggregates or graphs - then you can avoid the nastiness of mapping completely. Indeed this is often a reason I've heard teams go with a NoSQL solution. This is, I think, a viable route to go - hence my interest in increasing our understanding of NoSQL systems. But even so it only works when the fit between the application model and the NoSQL data model is good ... And of course there are many situations where you're stuck with a relational model anyway. Maybe it's a corporate standard that you can't jump over, maybe you can't persuade your colleagues to accept the risks of an immature technology. In this case you can't avoid the mapping problem.
--martinfowler.com

2. To Laugh or Cry?
Which database type to use (one big database or many smaller)?

3. Online

Check out the last update's exchange--more comments have been posted since:
RELATIONAL DATABASE

4. Elsewhere
Set In Stone

5. And now for something completely different
Multivitamins Doctors Say Stop Taking Them
Half of U.S. adults take vitamins, supplements routinely
Draw your own conclusion.




View My Stats