Sunday, September 22, 2013

Site Update




1. Schedule reminder
September 23rd, 10:00am, San Francisco, CA
The CWA, Missing Data and the Last NULL in the Coffin
Presentation, Oaktable Conference, Oracle OpenWorld
October 8, Milan, Italy
Denormalization for Performance: A Costly Illusion
Public presentation, UGISS SQLSaturday
October 9-10, 2013, Milan, Italy
Business Modeling for Database Design
Private seminar sponsored by Microsoft and organized by SolidQ
Contact: Davide Mauri, SolidQ

2. Quote of the Week
I am constructing a new website ... using node.js. Its aim is to have many subscriber (people who offer help and people who need help) it should be scalable in different language. I have to decide wich is the more suitable db. I am thinking about to have two db (mongodb and postgress) for site languages and people account, people should vote other people ability. As db experts could you give me some suggestions? What would think could be a good db choice?
--LinkedIn.com

3. To Laugh or Cry?
Can anyone guide about using DB2

4. Online

5. There were several posts on this site about Meijer's article, the letter to the editor supporting it, and the reactions of David McGoveran and C. J. Date to both. But I missed the one by my fellow relationlander Erwin Smout: A letter by Carl Hewitt. At one point he writes:
At any rate, I'm still left wondering what mr. Hewitt's problem is here.
I don't know why he wonders -- it is pretty obvious to me.


6. The frequency of fads has been increasing and the time between them decreasing. Today pushing a "new thing" starts before the last fad is exhausted: The Next Wave of Data Management


7. And now for something completely different: How the US Crushed Youth Resistance




Sunday, September 8, 2013

Site Update





1. Schedule reminder
September 23rd, 10:00am, San Francisco, CA
The CWA, Missing Data and the Last NULL in the Coffin
Presentation, Oaktable Conference, Oracle OpenWorld
October 8, Milan, Italy
Denormalization for Performance: A Costly Illusion
Public presentation, UGISS SQLSaturday
October 9-10, 2013, Milan, Italy
Business Modeling for Database Design
Private seminar sponsored by Microsoft and organized by SolidQ
Contact: Davide Mauri, SolidQ

2. Quote of the Week
Q: One of the main resistences of RDBMS users to pass to a NoSQL product are related to the complexity of the model: Ok, NoSQL products are super for BigData and BigScale but what about the model?

A: Actually graphs are the way we (people) think and organization data in our head, as computer people it is on[e] of the most popular way[s] we are taught to think about data, so this should be natural.
--slideshare.net

3. To Laugh or Cry?
"Splunk for Big Data"

4. My comment at Robert Young's blog
No Mas!! No Mas!!

5.
Something I argued long before they did.
Think Big Data Is All Hype? You're Not Alone

6. And now for something completely different.
High-tech toilets vulnerable to hackers
No comment.



Sunday, August 25, 2013

Site Update




1. Schedule update
September 23rd, 10:00am, San Francisco, CA
The CWA, Missing Data and the Last NULL in the Coffin
Presentation, Oaktable Conference, Oracle OpenWorld
October 8, Milan, Italy
Denormalization for Performance: A Costly Illusion
Public presentation, UGISS SQLSaturday
October 9-10, 2013, Milan, Italy
Business Modeling for Database Design
Private seminar sponsored by Microsoft and organized by SolidQ
Contact: Davide Mauri, SolidQ

2. Quote of the Week
How many software programs are mathematically provable. And yet everybody still writes software and for the most part it works. Relational theory and SQL was very important for establishing a standard across vendors to a point. And yet switching relational database vendors is still very expensive proposition because the standards don't address the features that users need and use everyday that are not part of the standard. At the heart of the system the relational model can still be enforced. But a product lives and dies not on whether it is mathematically provable but it's features set, efficiency and cost to develop in.
--LinkedIn.com

3. To Laugh or Cry?
Please help with my data model design
If this was student homework, it is an excellent example of how database management should not be learned and a validation of the substitution of the "cookbook approach" for education. Ironically, it's in the forum's "Relational theory" section. Had theory been taught, such questions would not have been asked.


4. Two online exchanges I participated in
Predictable--it was just a matter of time. My latest post at All Analytics is quite apropos: Real Data Science: General Theories of Data.
In this context, consider In Silicon Valley, age can be a curse.


5. And now for something completely different

Not entirely unrelated:
Facebook boosts connections, not happiness study
The Curse of Self-Service (h/t Davide Mauri)




Sunday, August 11, 2013

Site Update




A while ago my friend Stephen Henley published his opinion on Missing Data, which questioned the thoughts--not well formed and definitive at the time--of C. J. Date, Hugh Darwen and myself on the subject. Since then Date has proposed a default values scheme which he has subsequently renounced; Darwen has published How To Handle Missing Information Without Using NULL; and I have proposed a relational solution in the recently revised paper #3, The Final NULL in the Coffin.

In this context, I dedicate this update (except the last item) to NULL. Whatever differences may exist among the above-mentioned relational proponents, we do agree that NULL is certainly not a solution to the problems of missing data.

Time permitting, I may post some belated comments on Henley's piece.


1. QUOTE OF THE WEEK
If SQL is based on relational algebra which is based on set theory where the concept of null set (empty set) is an axiom of the theory. In this theory empty set is not the same thing as nothing. A point that confuses many people.

Relational algebra is based on 3VL predicates, that is, the answer to any predicate can have three states true, false or unknown. Unknown is caused by the use of a operator on an the absence of a value (null). Within relational algebra null is not to be treated as a value but merely a marker of unknown (absence of a value).

None of this is rocket science and I suggest doesn't result in bad implications. I suggest the so called "bad implications" are only introduced as people use null as a patch for problems for example the division by zero. indeterminate state, open ended ranges, data states to name a few. That is, the issue is not the concept of null but its abuse as a patch for other issues. 
--LinkedIn.com

2. TO LAUGH OR CRY?

Why shouldn't we allow NULLs?, stackexchange.com


3. An ONLINE exchange I participated in.

NULL Handling in Databases, LinkedIn.com


4. And now for something completely different.

An astonishing act of statistical chutzpah
Why Great Teachers Are Fleeing the Profession
The ABCs of MOOCs

What does this say about the educational system?




Tuesday, July 30, 2013

The Final NULL in the Coffin: A Relational Solution to Missing Data




Order via the PAPERS page


NEW! THE FINAL NULL IN THE COFFIN: A RELATIONAL SOLUTION TO MISSING DATA NEW!

v.3 (August 2013)

The relational data model is based on the two-valued logic (2VL) of the real world: every proposition about the real world is unequivocally true or false. But our knowledge of the real world is usually imperfect—some data is missing—which means that we don't always know whether propositions are true or not; 2VL no longer applies and data integrity and database query results are no longer guaranteed to be enforceable and provably logically correct with respect to the real world.

Missing data has possibly been the thorniest aspect of database management: without a logically sound yet practical solution, data professionals and users are left between a rock and a hard place. They must either (a) rely on SQL's arbitrary and flawed implementations of three-valued logic (3VL) based on NULLs and risk results that are easy to misinterpret or erroneous in ways that are hard to discern, or (b) undertake in applications a prohibitively complex, error-prone and unreliable burden that belongs in the DBMS.
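To make option (a) concrete, here is a minimal sketch of how NULL-based 3VL can produce query results that are easy to misinterpret. It is an illustration only, not taken from the paper: the `employee` table, its rows, and the function name `null_query_demo` are hypothetical, and SQLite (through Python's standard `sqlite3` module) stands in for any SQL DBMS.

```python
import sqlite3


def null_query_demo():
    """Count rows under predicates that silently drop NULL-bearing rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical data: Bob's department is missing and is recorded as NULL.
    conn.executescript(
        """
        CREATE TABLE employee (name TEXT PRIMARY KEY, dept TEXT);
        INSERT INTO employee VALUES ('Ann', 'Sales'), ('Bob', NULL), ('Cy', 'HR');
        """
    )

    def count(where):
        # Helper for this sketch only; 'where' is a fixed literal, not user input.
        sql = f"SELECT COUNT(*) FROM employee WHERE {where}"
        return conn.execute(sql).fetchone()[0]

    results = {
        "total": conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0],
        # The next two predicates look complementary, but Bob satisfies neither:
        # comparing NULL with anything evaluates to unknown, and WHERE keeps
        # only rows for which the predicate is true.
        "in_sales": count("dept = 'Sales'"),
        "not_in_sales": count("dept <> 'Sales'"),
        # A single NULL in the list makes NOT IN unknown for every non-matching row.
        "not_in_list_with_null": count("name NOT IN ('Ann', NULL)"),
    }
    conn.close()
    return results


if __name__ == "__main__":
    for label, value in null_query_demo().items():
        print(f"{label}: {value}")
```

The two department counts add up to fewer rows than the table holds, and the NOT IN query returns no rows at all, yet nothing in either result tells the user that missing data is the cause.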

This paper illustrates some of the drawbacks of the many-valued logic (nVL, n > 2) approach to missing data and SQL’s NULL scheme and proposes a solution within the 2VL/relational framework that:
  • Guarantees data integrity and logically correct query results;
  • Avoids the complications and problems of nVL and NULLs;
  • Requires no changes to the relational model;
  • Is largely transparent to users;
  • Keeps users better apprised of the existence and effects of missing data.
The proposed solution requires research into its implications for data manipulation and integrity enforcement before it is implemented, but we believe it is theoretically sound and implementable in a truly relational DBMS (TRDBMS) using technologies that, unlike SQL, support full physical data independence, e.g., the TransRelational™ Model (TRM).


Table of Contents
  • Introduction
  • "Inapplicable Data": Nothing's Missing
  • Missing Data: Into the Unknown
  • SQL’s 3VL: NULL
  • Known Unknowns: Metadata
  • A 2VL Relational Solution
  • The Practicality of Theory
  • 2VL vs. NULL in the Real World
  • Relation Proliferation
  • The TransRelational™ Model
  • Conclusion
  • Some Misconceptions Debunked
  • References




Sunday, July 28, 2013

Site Update




1.
Some housekeeping. Posting to the blog and to multiple static pages is a bit of a hassle. I am also facing some work on my seminars and papers. Until further notice:
  • There will be one post/week--alternating articles and Site Updates (I may skip the latter on certain weeks, if absolutely necessary);
  • Quotes and links to LAUGH/CRY? and FP ONLINE will be posted directly into Site Update posts (like below); the respective static Pages will be updated at the end of each month.
A tool that automates posts and updates in one shot would have helped. I looked into it, but for various reasons (including Google's Blogger updates), nothing is available (if you know of any, preferably from experience, please recommend).

2.
Quote of the Week:
...the relational model has no relationships since Codd decreed that all relationships must be represented by foreign keys, which are exactly the same as "attributes" ... Consider if we had a bunch of tables, each containing the thing A. Now what is the population of A? It cannot be found in any one of the tables. It is actually the union of all the populations of A plus more if we allow A to exist (i.e., be of interest to us) but does not appear in any of the tables. That would be the case of a master reference list of "codes" for which we would then build a separate table. But even that is insufficient. We would also have to define and enforce referential integrity everywhere an A appeared. All of this is handled explicitly and correctly in ORM -- we model objects (each one appears only once in a data model diagram) and relationships. There are no attributes. As I said before, an attribute is an object playing a role in a relationship with another object.
--LinkedIn.com
3.
To Laugh or Cry?
What’s the Best Way for Structured Data Computing in Java?
4.
FP Online:
Let's innovate....database
5.
Good advice:
Designing a Database: 7 Things You don't Want To Do
But why does it bother me?

6.
And now for something completely different.
NSA claims inability to search agency's own emails
Clueless doctor sleeps through math class, reinvents calculus…and names it after herself. At least the doctor re-invented something in a different field. Data professionals do it all the time in their own field.
You can't make these things up.


