My October column @AllAnalytics.
We have seen that the usefulness of the relational data model (RDM) lies in its dual theoretical foundation, first order predicate logic (FOPL) and set theory: the mathematics guarantees provable logical correctness of query results regardless of what meaning is assigned to the database R-tables. But a logically correct result is not necessarily a meaningful result. As a reader commented, "If a is Salary and b is Start date, the [Cartesian] product a*b still doesn't have a sensible meaning." It is critical for the analyst to understand the distinction.
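The reader's point can be illustrated with a minimal Python sketch. The sample values, the `employees` relation, and all variable names are hypothetical and do not come from the column; treating a pair as "meaningful" only when some employee tuple backs it is one possible reading of the distinction, not a definition given in the text.

```python
from itertools import product

# Hypothetical sample data: two attributes taken in isolation.
salaries = {50000, 62000}
start_dates = {"2014-03-01", "2015-07-15"}

# Set theory guarantees the Cartesian product is well defined...
pairs = set(product(salaries, start_dates))

# ...and that its cardinality is |a| * |b|: a logically correct result.
assert len(pairs) == len(salaries) * len(start_dates)

# A hypothetical relation whose tuples each state a fact about one employee.
employees = {
    ("E1", 50000, "2014-03-01"),
    ("E2", 62000, "2015-07-15"),
}

# Pairs that some employee tuple actually asserts.
facts = {(salary, start) for _, salary, start in employees}

# Pairs in the product that correspond to no stated fact.
unsupported = pairs - facts
```

Every element of `pairs` is a correct member of the product, yet only the pairs in `facts` correspond to something asserted about an employee. Nothing in the mathematics flags the pairs in `unsupported`; that judgment is left to the analyst.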
Read it all. (Please comment there, not here)
Wednesday, October 21, 2015
Sunday, October 18, 2015
Sunday, October 11, 2015
Weekly Update (UPDATED)
Housekeeping: Added the following to the LINKS page:
- Why is it so hard to get SQL performance right the first time
- Why Are There No Relational DBMS's
- SchemeUnit and SchemeQL: Two Little Languages
1. Quote of the Week
"The real definition of Big Data?? Simple: Whatever does not fit in Excel!"
Me: What is--precisely, pls!--the threshold from small to big data? And how do the structural, manipulative and integrity aspects change over the threshold?
HM: Big data := the smallest set of data for which the sensitivity of your transfer function is minimal, and such that the cardinality of this set is too large to implement said transfer function on a single physical machine.
Transfer function := though not strictly a function, as in the Lambda calculus, it is more of an operator which ingests data of any size and produces a monetizeable product or service.
Sensitivity := the degree to which a perturbation on the input into a transfer function affects the result produced by said function.
Me: Ugh! I'm sure that's exactly what all the overnight data scientists, including those who invented Big Data, had in mind.
--LinkedIn.com
Sunday, October 4, 2015
Database Education: Oughts and OughtNot's
From an online exchange in response to A Tiny Intro to Database Systems:
C: As a non-CS grad coming fresh to databases, I found both the entity-relationship, and the object-oriented models confusing. Then I read Date [1] and Codd's [2] books and papers on the relational model, the one from the 1970s that is basically set and type theory applied to data, and found that to be a lot clearer and a more powerful abstraction to deal with your data model. As a non-"full time developer" it amazes me the number of "experienced" developers who are not aware of the relational model and who do not know what a foreign key is, or why referential integrity might be important.
For example, your Relational Model introduction has a discussion of various data types. But arguably, whether your integer is implemented as BIGINT or TINYINT is an implementation decision which should be separate from the model discussion (dixit Date). In other words, that attribute has a type of integer and how that integer is stored is a separate issue, and your RDBMS ought to abstract it away (as, I think, Postgres is pretty good with, and MySQL quite annoying). The beauty of the latest RDBMS developments, particularly in PostgreSQL world, is that the implementation has gotten so good that you don't need to really worry about it like you used to just a decade ago, at least in 95% of use cases.
I think one can teach SQL (and the relational model) to a non-developer in about 2 hours, because it is so declarative and intuitive. One day I'll go write that tutorial, as many clients need it sorely.
Sunday, September 27, 2015
Weekly Update
UPDATE: I have posted, via David McGoveran, an update to last week's post on Codd's 12 rules.
Reactions to my presentation "The Real Science: Tables -- So What?" to the Silicon Valley SQL Server User Group.
With regard to Language Redundancy and DBMS Performance: A SQL Story:
- Find the Supplier number for those suppliers who supply every part
- Why is it so hard to get SQL performance right the first time
1. Quote of the Week
... the challenges inherent in the SQL RDBMS [sic] approach ... the constrained schema (or schema-first) approach of SQL RDBMS engines imposes semantic infidelity rather than fidelity on all applications and services that depend on this RDBMS type, solely ... SQL RDBMS engines (as per what I've outlined above) do impose a "one size fits all" constraint on DBMS driven apps and services that manifests as the "data variety issue" outlined by the "Big Data" meme.
--LinkedIn.com
Tuesday, September 22, 2015
The Real Data Science: Tables -- So What?
My September post @AllAnalytics.
We have seen that if each database table is designed to represent a set of facts about a single class of attribute-sharing entities and to preserve the mathematical properties of relations, databases are easier to understand, and query results are guaranteed to be provably correct and easier to interpret. Let's see how and why with the help of an example.
Read it all. (Please comment there, not here)
Sunday, September 20, 2015