Tuesday, May 26, 2015

R-table Constraints and Data Science



My May post @All Analytics.

I’ve often expressed skepticism here and elsewhere about “data science” as currently used and hyped. Science is about development, testing, and application of theories. Data science is about general theories of data. For example, "relational theory" is the application of logic and set theory to database management to guarantee provably logically correct data analysis results, yet it is absent from the list of desirable skills for “data scientists”.

Read it all. (Please comment there, not here)



 








New Versions of All 6 Papers




I have just posted descriptions of the new versions of all six papers in the PRACTICAL DATABASE FOUNDATIONS series:

#1: Business Modeling for Database Design
#2: The Costly Illusion: Normalization, Integrity and Performance
#3: The Last NULL in the Coffin: A Relational Solution to Missing Data
#4: The Key to Keys: A Matter of Identity
#5: Truly Relational: What It Really Means
#6: Domains: The Database Glue

The changes are significant and include a few error corrections.

Since these are new versions, not revisions, the following applies:

  • Those who ordered in 2015 get free copies.
  • Those who ordered in 2014 get a 50% discount.
Please email me with proof of purchase.

For more details and how to order, see the PAPERS page.









Sunday, May 17, 2015

Weekly Update



1. Quote of the Week
He started his SQL Server career when he debuted as an accidental DBA in 2005.  Seeing Reporting Services 2005 demoed for the first time sealed the deal, and it has been all data ever since, leaving the worlds of networking and systems admin behind. After being a full-time dev/operational DBA with everything since SQL 2000, he is now back to BI, as a Senior BI Engineer/Consultant. --Online Bio
2. To Laugh or Cry?

3. Online Debunkings


4. Interesting Elsewhere

Obfuscated SQL Contest Winners!
H/t Todd Everett.  

5. And now for something completely different

Saturday, May 9, 2015

On OO Relational "Extensions"



In a LinkedIn thread that followed my Comments On Stonebraker Interview, Erwin Smout mentioned David Maier's 1991 critique of the 1990 Third Generation Data Base System Manifesto (3GM), of which Stonebraker was one of the authors. I was aware of the 3GM, of course, but had not read it because it was not favorably reviewed at the time. I considered The Third Manifesto by Date and Darwen more significant, in part because it was authored by relational experts and in part because it was backed by a proposed fully computational language with a fully relational component. But when Erwin mentioned Maier's piece, I asked him if he had a copy, and he found a scanned PDF online.

Having not read the 3GM, I am not in a position to comment on Maier's critique thereof, but I would like to comment on the general topics in his Preliminaries that attracted my attention.

Saturday, May 2, 2015

Weekly Update



Housekeeping: I have added the following:

1. Quote of the Week

I am new to this domain. Please guide me to choose which database to choose among the NOSQL databases. Also which OS the database supports and how to add data to the database(which language). The requirement is to store pictures and alpha numeric s in database. A web server would be designed to extract data from the database and display in web application. The important requirement is scalability so I explored and found that NoSQL database will best fit the requirement. --LinkedIn.com
CJ Date calls this "I don't know how to do my job and am looking for somebody to do it for me."

2. To Laugh or Cry?

Docbase, Graphbase, Colubase, Triplestore ,which better fo RDF triples
3. Online Debunkings
4. Interesting Elsewhere
5. And now for something completely different

Sunday, April 26, 2015

Comments on Stonebraker Interview




Revised: 12/2017

Interviewed about his Turing Award, Michael Stonebraker is "modest" about his jointly-with-others contribution:

"... the Ingres database [sic] brought Codd’s lofty relational ideas into the realm of ordinary individuals ... turned [them] into constructs that could be manipulated by ordinary people ... it was argued at the time that RDBMS couldn’t perform, but we showed it could be efficient."
and gives most of the credit to "Ted" Codd:
"What Ted proposed was radical ... a complete change from how things were being done in database [sic] ... he turned the problem of data management into one of relations. That dramatically simplified things ... The conventional wisdom was that you should build for the particulars of how the data is stored. He saw that made no sense ... he [moved] the actual manipulation of data away from assembly language programming of the time to higher levels of abstraction that would later become structured query language, or SQL ... He brought principles of encapsulation and abstraction to programming databases, like with a high-level-language in programming."
Quite. Except that Ted was vehemently critical of SQL as a botched concretization of the RDM, one which, as it turned out, ensured that his ideas would never be truly and fully implemented. (One of those ideas, incidentally, was a relational declarative data sublanguage that would replace programming for the data management functions of a DBMS.) On the one hand, SQL, whatever its flaws, was much superior to the database technologies that preceded it; on the other, it has been forever identified with the RDM, to the point where the chance for true RDBMSs was lost. The assembly language statement is also not quite accurate: COBOL, FORTRAN and special-purpose languages were used at the time, while assembly language was used for writing access methods at the I/O level, but even that wasn't pure.

Saturday, April 18, 2015

Weekly Update



1. Quote of the Week
To clarify my point further, although M doesn't care about how it's implemented, the implementation has a strong influence on the logical structures that it's trying to implement. In a normalized or demoralized [sic] debate, a fully normalized physical schema is always good, when implemented on an infinite performance hardware. --LinkedIn.com

2. To Laugh or Cry?
I recently attended a presentation on Azure DocumentDB, Microsoft's NoSQL cloud product. I made the following notes:

  • Polyglot persistence: Wasn't this what the RDM was supposed to replace?
  • Hierarchy: Didn't we get rid of HDM decades ago?
  • NoSQL: No SQL, but a "SQL-like" language (SQL is barely relational, and now a SQL-like language is used for documents?)
  • No integrity, data independence: Nothing learned from the past.
  • Cloud: At least mainframes were under each company's control.
Progress.

3. Online Debunkings

Comments on "Michael Stonebraker Explains Oracle’s Obsolescence, Facebook’s Enormous Challenge"
4. Interesting Elsewhere
Unskilled and Unaware of It
5. And now for something completely different
