Tuesday, January 31, 2017

Outsmarting the DBMS: Analysts Should Beware



Revised 5/4/2020.


Last month I alerted you to the failure of data professionals to appreciate how important it is, for a variety of critical reasons, to rely on the DBMS rather than on application code for integrity enforcement and data manipulation. The following is an example of the consequences:
"If you have multiple boolean fields in a record, consider combining them into a single Integer field. For instance in a User record create a single UserType field instead of 6 separate field for IsTrainee, IsManager, IsTrainer, IsHR, IsSupplier, IsSupport. By assigning 1,2,4,8 and 16, 32 as "yes" values for these then we can say that a value of 3 in this UserType field tell us that they are both Trainee and a Manager; 36 that they are the Trainer, and they are responsible for Support. The advantage of combining these into one field is that is another type can be added (e.g., IsFirstAider=64) without adding a field."
Note: "File," "record," and "field" are physical implementation concepts. The corresponding logical design concepts are relation (visualizable as an R-table), tuple (visualizable as a row) and attribute (visualizable as a column). Using the proper terms reduces the likelihood of the confusion of levels of representation that is rampant in the industry and has deleterious consequences[1].
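To make the quoted encoding concrete, here is a minimal Python sketch of what it implies. The flag names and values are taken from the quote; the `FLAGS` mapping and the `decode` helper are hypothetical, introduced only for illustration.

```python
# Flag values exactly as assigned in the quoted advice.
FLAGS = {
    "IsTrainee": 1,
    "IsManager": 2,
    "IsTrainer": 4,
    "IsHR": 8,
    "IsSupplier": 16,
    "IsSupport": 32,
}


def decode(user_type: int) -> list:
    """Hypothetical helper: names of the flags set in a combined UserType value."""
    return [name for name, bit in FLAGS.items() if user_type & bit]


print(decode(3))   # 1 + 2:  ['IsTrainee', 'IsManager']
print(decode(36))  # 4 + 32: ['IsTrainer', 'IsSupport']
```

The point of the sketch is that a value such as 36 says nothing by itself: its meaning lives in decoding logic like the above, which every application reading the field would have to carry.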

Sunday, January 29, 2017

This Week



1. What's Wrong with This Database Picture?

"I have a database for a school ... [with] are numerous tables obviously but consider these:
CONTACT - all contacts (students, faculty) has fields such as LAST, FIRST, MI, ADDR, CITY, STATE, ZIP, EMAIL;
FACULTY - hire info, login/password for electronic timesheet login, foreign key to CONTACT;
STUDENT - medical comments, current grade, foreign key to CONTACT.
Do you think it is a good idea to have a single table hold such info? Or, would you have had the tables FACULTY and STUDENT store LAST, FIRST, ADDR and other fields? At what point do you denormalize for the sake of being more practical? What would you do when you want to close out one year and start a new year? If you had stand-alone student and faculty tables then you could archive them easily, have a school semester and year attached to them. However, as you go from one year to the next information about a student or faculty may change. Like their address and phone for example. The database model now is not very good because it doesn’t maintain a history. If Student A was in school last year as well but lived somewhere else would you have 2 contact rows? 2 student rows? Or do you have just one of each and have a change log. Which is best?" --comp.databases.theory

Sunday, January 22, 2017

Are You a Thinking Data Professional?



Note: The following, by Todd Everett, was intended as a comment on my post Don't Design Databases Without Foundation Knowledge and Conceptual Models. He is a reader I deem a "thinking data professional" -- always the qualitative rather than quantitative target of my writings and teachings. It merits a post in its own right, for the benefit of others.

Monday, January 16, 2017

Don't Design Databases Without Foundation Knowledge and Conceptual Models




"I have two tables, one is product which is a parent table with one primary key and I have another child table of product, which is a product_details table. But the child table is linking with parent table(product) with logical data instead of foreign key,as we are doing this relationship with the help of java code in the coding side, instead of depending on the data base, which make it as tight couple. To avoid tight coupling between the tables we are storing the primary key value in the child table.
CREATE TABLE `tbl_product` (
  `product_id` varchar(200) NOT NULL,
  `product_details_id` varchar(200) DEFAULT NULL,
  `currency` varchar(20) DEFAULT NULL,
  `lead_time` varchar(20) DEFAULT NULL,
  `brand_id` varchar(20) DEFAULT NULL,
  `manufacturer_id` varchar(150) DEFAULT NULL,
  `category_id` varchar(200) DEFAULT NULL,
  `units` varchar(20) DEFAULT NULL,
  `transit_time` varchar(20) DEFAULT NULL,
  `delivery_terms` varchar(20) DEFAULT NULL,
  `payment_terms` varchar(20) DEFAULT NULL,
  PRIMARY KEY (`product_id`));

CREATE TABLE `tbl_product_details` (
  `product_details_id` varchar(200) NOT NULL,
  `product_id` varchar(200) DEFAULT NULL,
  `product_name` varchar(50) DEFAULT NULL,
  `landingPageImage` varchar(100) DEFAULT NULL,
  `product_description_brief` text CHARACTER SET latin1,
  `product_description_short` text CHARACTER SET latin1,
  `product_price_range` varchar(50) DEFAULT NULL,
  `product_discount_price` varchar(20) DEFAULT NULL,
  `production_Type` varchar(20) DEFAULT NULL,
  PRIMARY KEY (`product_details_id`),
  UNIQUE KEY `product_id` (`product_id`));
Please suggest the Pros and Cons of the design, we are following this kind of relationship in my company, as the manager is saying it will give [us flexibility]. I know that if we lose the data from the table, we can't know the relationship between the two tables."--StackExchange.com
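What is given up when the relationship lives only in Java code can be sketched as follows. This is a minimal illustration, not the poster's schema: it uses Python's sqlite3 module with an in-memory SQLite database in place of MySQL, reduces both tables to a few columns, and the `insert_detail` helper is hypothetical. With the foreign key declared, the DBMS itself rejects a details row that references a nonexistent product, whichever application attempts the insert.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this is enabled on the connection.
conn.execute("PRAGMA foreign_keys = ON")

# Reduced versions of the two quoted tables, with the relationship declared.
conn.executescript("""
CREATE TABLE tbl_product (
  product_id VARCHAR(200) NOT NULL PRIMARY KEY
);
CREATE TABLE tbl_product_details (
  product_details_id VARCHAR(200) NOT NULL PRIMARY KEY,
  product_id VARCHAR(200) NOT NULL UNIQUE
    REFERENCES tbl_product (product_id),
  product_name VARCHAR(50)
);
""")


def insert_detail(details_id, product_id, name):
    """Hypothetical helper: True if the DBMS accepted the row, False if it refused."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO tbl_product_details VALUES (?, ?, ?)",
                (details_id, product_id, name),
            )
        return True
    except sqlite3.IntegrityError:
        return False


with conn:
    conn.execute("INSERT INTO tbl_product VALUES ('P1')")

print(insert_detail("D1", "P1", "Widget"))  # references an existing product
print(insert_detail("D2", "P9", "Orphan"))  # no product 'P9' exists
```

Under these assumptions the first insert is accepted and the second is refused by the DBMS, with no checking code in the application.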

Monday, January 9, 2017

This Week



1. What's Wrong with This Picture?
"I have two tables, one is product which is a parent table with one primary key and i do have another child table of product, which is a product_details table. But the child table is linking with parent table(product) with logical data instead of foreign key,as we are doing this relationship with the help of java code in the coding side, instead of depending on the data base, which make it as tight couple. To avoid tight coupling between the tables we are storing the primary key value in the child table.

CREATE TABLE `tbl_product` (
 `product_id` varchar(200) NOT NULL,
 `product_details_id` varchar(200) DEFAULT NULL,
 `currency` varchar(20) DEFAULT NULL,
 `lead_time` varchar(20) DEFAULT NULL,
 `brand_id` varchar(20) DEFAULT NULL,
 `manufacturer_id` varchar(150) DEFAULT NULL,
 `category_id` varchar(200) DEFAULT NULL,
 `units` varchar(20) DEFAULT NULL,
 `transit_time` varchar(20) DEFAULT NULL,
 `delivery_terms` varchar(20) DEFAULT NULL,
 `payment_terms` varchar(20) DEFAULT NULL,
 PRIMARY KEY (`product_id`));

CREATE TABLE `tbl_product_details` (
 `product_details_id` varchar(200) NOT NULL,
 `product_id` varchar(200) DEFAULT NULL,
 `product_name` varchar(50) DEFAULT NULL,
 `landingPageImage` varchar(100) DEFAULT NULL,
 `product_description_brief` text CHARACTER SET latin1,
 `product_description_short` text CHARACTER SET latin1,
 `product_price_range` varchar(50) DEFAULT NULL,
 `product_discount_price` varchar(20) DEFAULT NULL,
 `production_Type` varchar(20) DEFAULT NULL,
 PRIMARY KEY (`product_details_id`),
 UNIQUE KEY `product_id` (`product_id`));
Please suggest the Pros and Cons of the design, we are following this kind of relationship in my company, as the manager is saying it will give us flexible to us. I know that if we lose the data from the table, we can't know the relationship between the two tables."--StackExchange.com

Tuesday, January 3, 2017

Understanding the Relational Data Model: A New Series of Papers



"Nowadays, anyone who wishes to combat lies and ignorance and to write the truth must overcome at least five difficulties. He must have:
  1. The keenness to recognize it, although it is everywhere concealed;
  2. The courage to write the truth when truth is everywhere opposed;
  3. The skill to manipulate it as a weapon;
  4. The judgement to select in whose hands it will be effective, and
  5. The cunning to spread the truth among such persons."
--Bertolt Brecht
This is a rather accurate explanation of why it has been so difficult to dispel the misuse and abuse of the Relational Data Model since its inception, to the point that most of its core practical benefits have failed to materialize, with the IT industry regressing all the way back to its pre-relational and even pre-database state:
  • Graph DBMSs;
  • XML;
  • JSON;
  • NoSQL;
  • Application-specific databases and DBMSs;
  • "Unstructured data";
  • No integrity enforcement;
  • A cacophony of imperative programming languages rather than declarative data sublanguages (suffixed with QL, just like old non-relational DBMSs were with /R). 

Saturday, December 24, 2016

This Week with Season's Greetings





Data Sublanguages, Programming and Data Integrity

My December post @All Analytics

Both data science employers and candidates stress the eclectic nature of the required skills, programming in particular. Indeed, coding has acquired such an elevated role that it now entirely replaces education. Aside from the socially destructive consequences of this trend, in the context of data management it is a regressive self-fulfilling prophecy that obscures and disregards the core practical objective of database management: to minimize programming. You can frequently encounter it in comments like:
"Anything you can model in a DBMS you can model in Java. The next paradigm shift is business rules centralized in Java business objects, rather than hard-coded in SQL for better manageability, scalability, etc. The only ones that should reside in a database are referential integrity (and sometimes even that isn't really necessary). Don't let pushy DBAs tell you otherwise -- integrity constraints slow down development as well as performance."
Upside down and backwards.
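To illustrate the contrast, here is a minimal sketch. The `account` table and its business rule (a balance may not be negative) are hypothetical and do not appear in the post; Python's sqlite3 module with an in-memory database is used only for convenience. The rule is declared once to the DBMS as a CHECK constraint, and any program that connects, in whatever language, is then subject to it without restating the rule in its own code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The business rule is stated declaratively, once, as part of the table definition.
conn.execute("""
CREATE TABLE account (
  account_id INTEGER PRIMARY KEY,
  balance INTEGER NOT NULL CHECK (balance >= 0)
)
""")
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.commit()


def try_update(new_balance):
    """Hypothetical stand-in for any application issuing an update."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "UPDATE account SET balance = ? WHERE account_id = 1",
                (new_balance,),
            )
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected by the DBMS: {exc}"


print(try_update(50))   # satisfies the declared rule
print(try_update(-10))  # violates it; no application-side check was written
```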

Read it all (and comment there, not here, please).


THE DBDEBUNK GUIDE TO MISCONCEPTIONS OF DATA FUNDAMENTALS available to order here.

("What's Wrong with this Picture" will return in 2017)


1. Quote of the Week

"The value of the model may be diminishing in certain enterprises, since busy with deliverables." --Harshendu Desai, LinkedIn.com

3. To Laugh or Cry?

5 Reasons Relational Databases Hold Back Your Business

4. Added to the LINKS page

  • What a Database Really Is: Predicates and Propositions
  • The Logical Fallacies

5. Of Interest


And now for something completely different


New at The PostWest (check it out)

My take of the week

Choosing not to veto, Obama lets anti-settlement resolution pass at UN Security Council
The press refused to publish Obama's Chicago speech to the Palestinian lobby to hide his anti-semitism. He was never as troubled by Assad, or Putin, or Erdogan as he was by Netanyahu.  That's because Jews have always been a soft target (Barak Obama's Israeli Settlements Canard).  If that is not anti-semitism, I don't know what is.
When the US is in the same camp with Russia, China, Iran and Turkey and her acts are cheered by Hamas, Islamic Jihad and Hezbollah, she has sold out and moved to the dark side.

America (like most other countries) is occupied Indian land via atrocities (and not by people who returned to their own country, like the Jews did). So when America returns its settlements to the Indians, Israel will return its "settlements" (which Israelis got when they defended themselves from "being thrown into the sea"). Until then moralizing and selling out Israel to genocidal terrorists is hypocritical anti-semitism, just like everyone else's (see below).


Global Hypocritical Anti-semitism

UN

 
US

EU

Article of the week

Israel and the Occupation Myth

Video of the week
The Red Disaster. The "life" in Romania during the 60s. The Jews did the worst due to deep anti-semitism. America paid Ceausescu to get us out, but neither she nor Europe wanted us. Had there been no Israel, we would have probably starved to death, not necessarily in a rotten jail. Nobody talks about us, or the hundreds of thousands of the Jewish refugees kicked out from the Arab countries, none of whom were murderous, but everybody is obsessed with the suffering of the Palestinians, who are genocidal.

Pinch-me of the week

Ahmad Tibi urges Israelis not to ‘live by the sword’. As if she is allowed to live without it.

Book of the week (Purchase via this link to support the site)
Bard, M., MYTHS AND FACTS: A GUIDE TO THE ARAB-ISRAELI CONFLICT

Note: I will not publish or respond to anonymous comments. If you want to say something, stand behind it. Otherwise don't bother, it'll be ignored.