Tuesday, September 22, 2015

The Real Data Science: Tables -- So What?



My September post @AllAnalytics. 

We have seen that if each database table is designed to represent (facts about) a single class of attribute-sharing entities and to preserve the mathematical properties of relations, databases are easier to understand, and query results are provably correct and easier to interpret. Let's see how and why with the help of an example.

Read it all. (Please comment there, not here)



 



Sunday, September 13, 2015

Weekly Update



The Real Data Science: Tables--So What?

My Presentation to Silicon Valley SQL Server User Group
 

6:30 PM, Tuesday, September 15, 2015

Microsoft
1065 La Avenida, Building 1
Mountain View, CA


Free and open to the public (+ pizza)
For details and RSVP see Meetup.


1. Quote of the Week
You see, in Cassandra 1.x, the data model is centered around what Cassandra calls “column families”. A column family contains rows, which are identified by a row key. The row key is what you need to fetch the data from the row. The row can then have one or more columns, each of which has a name, value, and timestamp. (A value is also called a “cell”). Cassandra’s data model flexibility comes from the following facts:
* column names are defined per-row
* rows can be “wide” — that is, have hundreds, thousands, or even millions of columns
* columns can be sorted, and ranges of ordered columns can be selected efficiently using “slices”.
--http://blog.parsely.com/post/1928/cass/
Compare this to the RDM.
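To make the comparison concrete, here is a minimal Python sketch of the structure the quote describes, next to a relation. It is an illustration under stated assumptions, not Cassandra's API: the names page_views, column_slice, HEADING and restrict, and all the data, are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Hypothetical column family: row key -> {column name: (value, timestamp)}.
# Column names are defined per row, so rows need not share a heading.
page_views = {
    "url-1": {"2015-09-01": (12, 1), "2015-09-02": (7, 2), "2015-09-03": (9, 3)},
    "url-2": {"referrer": ("example.org", 4)},
}

def column_slice(row, start, end):
    """Return (name, value) pairs of one row whose names fall in [start, end], in name order."""
    names = sorted(row)
    lo, hi = bisect_left(names, start), bisect_right(names, end)
    return [(name, row[name][0]) for name in names[lo:hi]]

# A relation, by contrast: one fixed heading and a set of tuples that all conform to it.
HEADING = ("url", "day", "views")
page_view_relation = {
    ("url-1", "2015-09-01", 12),
    ("url-1", "2015-09-02", 7),
    ("url-1", "2015-09-03", 9),
}

def restrict(relation, predicate):
    """Relational restriction: the result is again a set of tuples over the same heading."""
    return {t for t in relation if predicate(dict(zip(HEADING, t)))}

sliced = column_slice(page_views["url-1"], "2015-09-02", "2015-09-03")
restricted = restrict(page_view_relation, lambda r: r["day"] >= "2015-09-02")
```

In the column family, what a column name means can differ from row to row, so the application has to know how to interpret each row. In the relation, every tuple asserts the same kind of fact, and the result of an operation is itself a relation.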

2. To Laugh or Cry?


3. Online Debunkings


4. Elsewhere


5. And now for something completely different


Sunday, August 30, 2015

Weekly Update



The Real Data Science: Tables--So What?

My Presentation to Silicon Valley SQL Server User Group
 

6:30 PM, Tuesday, September 15, 2015

Microsoft
1065 La Avenida Building 1
Mountain View, CA


Free and open to the public (+ pizza)
For details and RSVP see Meetup


1. Quote of the Week

[Do] formalized languages need the definition of data types? Up to now I have not read strong arguments against my statement that for interpretation and operation on data the use of character strings is sufficient when
  • All data are expressed as character strings that are explicitly based in language communities, whereas the character strings denote concepts that are represented by UID's;
  • The denoted concepts are defined by their supertype concepts (among others);
  • Collections of allowed qualitative concepts (that are denoted by string values or value ranges) are defined to enable the specification of constraints;
--LinkedIn.com
2. To Laugh or Cry?


3. Online Debunkings

4. Elsewhere
Which technologies emerge from the abyss
Why Big Data gets it Wrong

5. Housekeeping
Added to LINKS page:

  • Query Optimization
  • Relational Algebra
  • LEAP RDBMS
  • Relational
  • Relational Algebra Translator
  • System for Translating Relational Algebra Scripts into Microsoft SQL Server SQL Scripts
  • System for Translating Relational Algebra Scripts into Oracle SQL Scripts
6. And now for something completely different



Saturday, August 22, 2015

Silicon Valley SQL Server User Group Presentation




The Real Data Science: Tables -- So What? 


During the hyping of fads such as "Data Science", all you hear about is the "huge opportunities for enterprises to gain hitherto unimagined insights" and very little about the potential to tell enterprises really big lies, which can arise from 100% correct data in poorly designed databases. That's because what passes for "Data Science" is not science, let alone science of data.

Most data professionals know that relational databases consist of tables, but so what? Provably correct query results are guaranteed by the real data science--the RDM--if and only if tables are well-designed and properly constrained R-tables and the DBMS truly and fully supports the RDM. Unfortunately, more often than not tables are neither, and SQL DBMSs don't. As a result, databases are harder to understand, queries don't always make sense, and results are hard to interpret or outright wrong.

You will learn: 
  • The Real Data Science
  • Relations and databases
  • 5NF R-tables
  • "Table arithmetic"
  • RDM and SQL

6:30 PM, Tuesday, September 15, 2015

Microsoft
1065 La Avenida Building 1
Mountain View, CA
(map)


For details see Meetup.



Sunday, August 16, 2015

Weekly Update



1. Quote of the Week
Later, as use of RDBMS became more widespread, the complexity associated with design of a RDBMS was also well documented ... The associative database model is claimed to offer advantages over RDBMS ... “two fundamental data structures” as “‘Items’ and a set of ‘Links’ that connect them together ... Items, which have “a unique identifier, a name and a type” and Links, which have “a unique identifier, together with the unique identifiers of three other things, that represent the source, verb and target of a fact that is recorded about the source in the database ... “each of the three things identified by the source, verb and target may each be either a link or an item.”
--Homan, J. V. and Kovacs, P. J., A Comparison of the Relational Database Model and the Associative Database Model, Issues in Information Systems, Vol. X, No. 1, 2009.
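For readers who want to see what the quoted "Items" and "Links" amount to, here is a minimal Python sketch. It assumes only what the quote states; the class and field names, the describe helper and the sample facts are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Item:
    uid: int
    name: str
    kind: str  # the quote's "type"

@dataclass(frozen=True)
class Link:
    uid: int
    source: int  # uid of an Item or a Link
    verb: int    # uid of an Item or a Link
    target: int  # uid of an Item or a Link

# Hypothetical sample data: one fact, plus a fact about that fact.
items = {i.uid: i for i in (
    Item(1, "Mary", "person"),
    Item(2, "lives in", "verb"),
    Item(3, "London", "city"),
    Item(4, "since", "verb"),
    Item(5, "2015", "year"),
)}
links = {l.uid: l for l in (
    Link(10, source=1, verb=2, target=3),
    Link(11, source=10, verb=4, target=5),  # the source here is itself a link
)}

def describe(uid):
    """Resolve a uid to text, recursing when a source, verb or target is a link."""
    if uid in items:
        return items[uid].name
    link = links[uid]
    return "(" + " ".join(describe(u) for u in (link.source, link.verb, link.target)) + ")"

text = describe(11)
```

Nothing in these two structures declares which combinations of source, verb and target are meaningful, so any such constraint would have to be enforced by application code.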
2. To Laugh or Cry?
A Comparison of the Relational Database Model and the Associative Database Model
3. Online Debunkings

4. Interesting Elsewhere
Why Domain Expertise is More Important than Algorithms
5. And now for something completely different
