Sunday, February 17, 2013

Forward to the Past: Application-Managed Data Not a Distributed DBMS Make



Sima Ilic: This may be a somewhat unusual request, but I'd be interested to hear your opinion on Google's evolution of distributed database use/development: from Megastore to Bigtable to Spanner.

I know that there may be only a handful of companies that need (or have resources to use/develop) such things: Google, Amazon, Facebook? Unfortunately, people talk about it like it's the end of relational DBMS (which is plain nonsense) or the next thing that everybody should be looking at or using (the only word that comes to mind is false, but it's not strong enough for marketing/sales people).

Let me tell you what prompted the question.

Thursday, February 14, 2013

New Normalization Paper Available



Please see the PAPERS page for the new version, when it becomes available.

Site Update



1.
I will give the keynote address at the Northern California Oracle UG spring conference on Wednesday, May 22 at the CarrAmerica conference center in Pleasanton. I will also present one of my "To Laugh or Cry?" sessions. Full details forthcoming on the SCHEDULE page.

2.
The decision whether to link to a debunked article or not is a difficult one. On my old site I did not. On this site I've reversed the policy, but I have not been comfortable with it. Given that currently googling a title is a very simple and efficient way to find the item, I decided to list the title, but opt for selective linking based on the following criteria:
  • the overall substance must be over a certain threshold
  • not all the content is quoted in my debunking
The new policy was applied first to the last post.

3.
The Quote of the Week was posted on the QUOTES page.

4.
I came across Much Ado About Nothing, which does a good job of demonstrating why I consider Hugh Darwen's proposed solution less practical than mine in The Final NULL in the Coffin. In particular, my solution relies entirely on the DBMS, not on users.

5.
A link to a To Laugh or Cry? item was posted on the LAUGH/CRY? page.

6.
Links to several exchanges I participated in were posted on the FP ONLINE page.

7.
I am extremely wary of the so very liberal use of the term Data Architect, the inflation of positions so titled and of the professionals who present themselves as such. Do you know of any architect who designs the building, does the engineering blueprints and serves as building contractor?


Sunday, February 10, 2013

Those Who Don't Know the Past ...



It has long been my contention that a core problem of the database management field is poor foundation knowledge, in which I include familiarity with its history. Consider The Rise and Fall of the Third Normal Form. The title signals a rich debunking target. John D. Cook writes:
The ideas for relational databases were worked out in the 1970’s and the first commercial implementations appeared around 1980. By the 1990’s relational databases were the dominant way to store data. There were some non-relational databases in use, but these were not popular. Hierarchical databases seemed quaint, clinging to pre-relational approaches that had been deemed inferior by the march of progress. Object databases just seemed weird.

Thursday, February 7, 2013

Site Update



1.
Links to exchanges I participated in were posted on the FP ONLINE page. In one of them I deplored the major effort invested in mindlessly migrating from fad to fad, rather than on sound productive work. One example:

How to move configurable xml data types and data to Oracle database

2.
A new To Laugh or Cry? item was posted on the LAUGH/CRY? page. It has some relevance to each of the other items mentioned in this update.

3.
The Quote of the Week was posted on the QUOTES page. It is a comment on Iggy Fernandez's blog post, a link to which I posted last week on the FP ONLINE page and which I recommend reading.

4.
The author of Hipsters hacking on PostgreSQL writes that OTOH PostgreSQL was designed to be "a relational counterpart to Oracle and DB2", but it is increasingly being used "not because it's the easiest database to learn and use. It's not ... [or] because it's cool. It's not ... but because it gets stuff done."

I don't know how relational and easy to learn and use it is, but the real issue is whether there are any non-relational products that are easier to learn and use and get the same things done and, if not, why not.

It does not seem to occur to anybody that this might have something to do with whatever relational fidelity its SQL implementation has. And I wonder if Stonebraker, who has lately been pronouncing relational technology obsolete and not up to current needs, and has developed several non-relational products not much heard of, is aware of the irony of his old product's success.

5.
On more than one occasion I have criticized the academic substitution of industry fads for scientific research. Instead of leading the industry with science, academics rush to jump on every industry buzzword, a problem which Dijkstra deplored much more intelligently than I can.

Want more evidence? The previous item is one example. Here's another:

Scholarly articles for formal representation of NoSQL

And yet another, better one (detect any irony in the Bio?)

Harnessing Flexible Data in the Cloud

This has a historical precedent: the hierarchic and CODASYL (network) DBMSs were first inferred from existing practices, and attempts were then made post hoc to give them a theoretical basis. This effort was subsequently abandoned when it proved too difficult and the result overwhelmingly complex and unusable. Few of today's IT professionals, academics and vendors are aware of this, which is why they are doomed to replicate the past.

6.
In my last update I posted in error on the FP ONLINE page a link to Martijn Evers' blog post instead of the LinkedIn thread that contained my comment on it.

So here is the link to Martijn's post Metadata as a perspective on data and my comment:
I am uncomfortable with the proliferation of concepts and terms at the informal conceptual level that confuse levels of representation, are vague and inconsistent and complexify unnecessarily.

Reality is complex enough without us piling up on it methodological, conceptual and tool complexity--everything should be as simple and parsimonious as possible (but not simpler!). The relational model achieved exactly that at the logical level. The only way to take advantage of it is to reciprocate at the conceptual level. Unnecessary conceptual complexification spills into the logical level and defeats the purpose.


Thursday, January 31, 2013

Site Update



1.
If the industry spent 10% of the time and effort it dedicates to dealing with NULLs on understanding and implementing a theoretically sound missing data solution, it would put a stop to the endless stream of exchanges like the one posted on the LAUGH/CRY? page, from which nothing is ever learned.

2.
A link to an online discussion I participated in was posted on the FP ONLINE page.

3.
I have long deplored the "magic wand" fad-to-fad modus operandi of the IT industry: (a) an ad-hoc "new paradigm" that accumulates over time prohibitive problems prompts (b) an ad-hoc "new paradigm" to solve them that accumulates over time prohibitive problems which prompts (c) an ad-hoc... You get the idea. This is a systemic and business culture problem that no individual or organization alone can solve.

Consider the new Quote posted on the QUOTES page in this context.

4.
In all the excitement about the Cloud little thought is given to its nefarious implications. Here's just one:
Genetic information stored anonymously in databases doesn't always stay that way, a new study revealed, prompting a debate on how much privacy participants in scientific research can expect in the Internet era.
--Researchers Identify Anonymous DNA Donors, Wall Street Journal.
The Cloud is a form of outsourcing. Letting others do the work seems attractive if one focuses on expediency and upfront dreams of savings in money and effort and ignores the negatives associated with long-term loss of control. In the context of poor foundation knowledge and ad-hoc tools and practices, loss of control will exacerbate those negatives and defeat the initial purpose.
The Boeing Debacle: Seven Lessons Every CEO Must Learn, Forbes.

5.
Poor foundation knowledge can be addressed only by education. [O]nline courses [may be] inevitable, but that is likely to exacerbate the trend of substituting true education--intellectual development--with occupational training, rather than stop and/or reverse it.

