Sunday, October 4, 2015

Database Education: Oughts and OughtNot's



From an online exchange in response to A Tiny Intro to Database Systems:
C: As a non-CS grad coming fresh to databases, I found both the entity-relationship, and the object-oriented models confusing. Then I read Date [1] and Codd's [2] books and papers on the relational model, the one from the 1970s that is basically set and type theory applied to data, and found that to be a lot clearer and a more powerful abstraction to deal with your data model. As a non-"full time developer" it amazes me the number of "experienced" developers who are not aware of the relational model and who do not know what a foreign key is, or why referential integrity might be important.

For example, your Relational Model introduction has a discussion of various data types. But arguably, whether your integer is implemented as BIGINT or TINYINT is an implementation decision which should be separate from the model discussion (dixit Date). In other words, that attribute has a type of integer and how that integer is stored is a separate issue, and your RDBMS ought to abstract it away (as, I think, Postgres is pretty good with, and MySQL quite annoying). The beauty of the latest RDBMS developments, particularly in PostgreSQL world, is that the implementation has gotten so good that you don't need to really worry about it like you used to just a decade ago, at least in 95% of use cases.

I think one can teach SQL (and the relational model) to a non-developer in about 2 hours, because it is so declarative and intuitive. One day I'll go write that tutorial, as many clients need it sorely.
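
To make the commenter's point about foreign keys and referential integrity concrete, here is a minimal sketch using Python's standard `sqlite3` module. The `department` and `employee` tables, their columns, and the function name are hypothetical, invented only for illustration; SQLite stands in for whatever SQL DBMS you use, and it enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3


def demo_referential_integrity():
    """Return (employee_row_count, orphan_rejected) for a tiny hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite checks foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE department ("
        " dept_id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE employee ("
        " emp_id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        # Foreign key: every employee must reference an existing department.
        " dept_id INTEGER NOT NULL REFERENCES department (dept_id))"
    )
    conn.execute("INSERT INTO department VALUES (1, 'Research')")
    conn.execute("INSERT INTO employee VALUES (10, 'Ada', 1)")

    orphan_rejected = False
    try:
        # Department 99 does not exist, so the DBMS should refuse this row.
        conn.execute("INSERT INTO employee VALUES (11, 'Bob', 99)")
    except sqlite3.IntegrityError:
        orphan_rejected = True

    count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
    conn.close()
    return count, orphan_rejected
```

Calling `demo_referential_integrity()` returns `(1, True)`: the valid row is stored and the orphan row is rejected by the DBMS itself, with no application code having to check for it.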

Sunday, September 27, 2015

Weekly Update



UPDATE: I have posted, via David McGoveran, an update to last week's post on Codd's 12 rules.

Reactions to my presentation "The Real Data Science: Tables -- So What?" to the Silicon Valley SQL Server User Group.

With regard to Language Redundancy and DBMS Performance: A SQL Story:

1. Quote of the Week

... the challenges inherent in the SQL RDBMS [sic] approach ... the constrained schema (or schema-first) approach of SQL RDBMS engines imposes semantic infidelity rather than fidelity on all applications and services that depend on this RDBMS type, solely ... SQL RDBMS engines (as per what I've outlined above) do impose a "one size fits all" constraint on DBMS driven apps and services that manifests as the "data variety issue" outlined by the "Big Data" meme.
--LinkedIn.com

Tuesday, September 22, 2015

The Real Data Science: Tables -- So What?



My September post @AllAnalytics. 

We have seen that if each database table is designed to represent (facts about) a single class of attribute-sharing entities and to preserve the mathematical properties of relations, databases are easier to understand, and query results are provably correct and easier to interpret. Let's see how and why with the help of an example.

Read it all. (Please comment there, not here)



 



Sunday, September 13, 2015

Weekly Update



The Real Data Science: Tables--So What?

My Presentation to Silicon Valley SQL Server User Group
 

6:30 PM, Tuesday, September 15, 2015

Microsoft
1065 La Avenida, Building 1
Mountain View, CA


Free and open to the public (+ pizza)
For details and RSVP see Meetup.


1. Quote of the Week
You see, in Cassandra 1.x, the data model is centered around what Cassandra calls “column families”. A column family contains rows, which are identified by a row key. The row key is what you need to fetch the data from the row. The row can then have one or more columns, each of which has a name, value, and timestamp. (A value is also called a “cell”). Cassandra’s data model flexibility comes from the following facts:
* column names are defined per-row
* rows can be “wide” — that is, have hundreds, thousands, or even millions of columns
* columns can be sorted, and ranges of ordered columns can be selected efficiently using “slices”.
--http://blog.parsely.com/post/1928/cass/
Compare this to the RDM.
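
As an aid to that comparison, here is a toy in-memory sketch of the structure the quote describes. It is not Cassandra's API: the `ColumnFamily` class, its method names, and the sample keys are all hypothetical, and keeping the cell with the newest timestamp is my assumption about what the timestamp is for.

```python
from bisect import bisect_left, bisect_right


class ColumnFamily:
    """Toy model of the quoted description; not Cassandra's API."""

    def __init__(self):
        # row key -> {column name -> (value, timestamp)}
        self._rows = {}

    def insert(self, row_key, column, value, timestamp):
        row = self._rows.setdefault(row_key, {})
        current = row.get(column)
        # Assumption: the cell with the newest timestamp wins.
        if current is None or timestamp >= current[1]:
            row[column] = (value, timestamp)

    def slice(self, row_key, start, finish):
        """Return (name, value) pairs with start <= name <= finish, in name order."""
        row = self._rows.get(row_key, {})
        names = sorted(row)
        lo = bisect_left(names, start)
        hi = bisect_right(names, finish)
        return [(name, row[name][0]) for name in names[lo:hi]]


cf = ColumnFamily()
# Column names are defined per row: "user:1" and "user:2" share none.
cf.insert("user:1", "2015-09-01", "login", 1)
cf.insert("user:1", "2015-09-03", "purchase", 2)
cf.insert("user:1", "2015-09-02", "search", 3)
cf.insert("user:2", "email", "x@example.com", 4)

# A "slice" selects a range of ordered columns within one row.
events = cf.slice("user:1", "2015-09-02", "2015-09-03")
```

In a relation, by contrast, every tuple has exactly the attributes named in the heading, and neither tuples nor attributes carry any ordering; here each row has its own set of column names, and the meaning of a slice depends on how those names sort.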

2. To Laugh or Cry?


3. Online Debunkings


4. Elsewhere


5. And now for something completely different


Sunday, August 30, 2015

Weekly Update





1. Quote of the Week

[Do] formalized languages need the definition of data types? Up to now I have not read strong arguments against my statement that for interpretation and operation on data the use of character strings is sufficient when
  • All data are expressed as character strings that are explicitly based in language communities, whereas the character strings denote concepts that are represented by UID's;
  • The denoted concepts are defined by their supertype concepts (among others);
  • Collections of allowed qualitative concepts (that are denoted by string values or value ranges) are defined to enable the specification of constraints;
--LinkedIn.com

2. To Laugh or Cry?


3. Online Debunkings

4. Elsewhere
Which technologies emerge from the abyss
Why Big Data gets it Wrong

5. Housekeeping

Added to LINKS page:

  • Query Optimization
  • Relational Algebra
  • LEAP RDBMS
  • Relational
  • Relational Algebra Translator
  • System for Translating Relational Algebra Scripts into Microsoft SQL Server SQL Scripts
  • System for Translating Relational Algebra Scripts into Oracle SQL Scripts

6. And now for something completely different


