Note: "Then & Now" (t&n) is a new version of what used to be the "Oldies but Goodies" (OBG) series. To demonstrate the superiority of a sound theoretical foundation relative to the industry's fad-driven "cookbook" practices, as well as the disregarded evolution/progress of RDM, I am re-visiting my old debunkings, bringing them up to the current state of knowledge. This will enable you to judge how well arguments have held up and realize the increasing gap between industry stagnation -- and scientific progress.
DENORMALIZATION, PERFORMANCE & INTEGRITY
(Email exchange with reader originally published August 2002)
Then ...
“I'd like to comment on your other recent articles: on denormalization. Of course you prove that denormalization does not improve performance, because you pay for it by maintaining integrity. But, when people say that de-normalization improves performance, they usually mean just on one side. For example, I can merge DEPT and EMP tables into a third table DE and achieve a better query performance by replacing a join by a simple select from the new table. If this is the most frequent and most important operation in my application (vs. updates, inserts, deletes), then my overall performance will be improved (and that's what usually happens in DW). But if the opposite is true, then performance will suffer. I didn't see these considerations in your articles ...
Many people, yes, but not nobody. I always considered the cost of denormalization. I know many people in this field that do the same; however, I do agree with you that many people, especially those "younger" ones learning from more "modern" books on database design, especially those in the OO field, are not aware, and what's worse, don't even want to be aware.
That's exactly how I always thought and when I had discussions with people, that's what I always said to them (not that it made a big difference in their thinking). However, when I read your articles on this topic, I had another thought. As you always say (and again, I fully agree with you on this), we must always separate logical and physical. I always considered denormalization as one of the things done at the physical level. So, denormalization shouldn't even be your concern, because it has nothing to do with the relational model. The rule I always follow is that whatever I do at the physical level, it should not destroy my logical model, which must stay normalized. If I denormalize to achieve some performance gains for a selected set of functions, then I do pay for it by writing additional logic to preserve the integrity and by creating views that represent the entities on my logical model, which I had to "destroy". So as long as I separate these two levels, I don't think I'm in any conflict with the relational model. Of course if DBMS gave me more options in physical design while protecting the integrity of my logical model, I wouldn't have to do this myself.
Theoretically, I think the way you do, and that's why I enjoy reading your columns. But I also have to deliver practical results to my users. Unfortunately, I can't go to my users and tell them that their response time is slow because of Oracle's technology. And I don't believe screaming at Oracle will do me any good either (and yes I know what you will say to this). So until that mysterious technology you mentioned many times is implemented, I have to do what I can.”