Note: This is the first part of a two-part re-write of several earlier posts, to bring them into line with McGoveran's formalization and re-interpretation of Codd's true RDM, which includes his own corrections, refinements and extensions[1]. For a more in-depth treatment see the series of papers available here.
“Do I Have to Use Foreign Keys? If I am already manipulating data properly, are foreign keys required? Do they have another purpose that I’m just not aware of? I appreciate the guidance!”
“... [we] wish to make a point. There is something which is bad design/good design/mandatory/optional. Please stop insisting that Primary and Foreign keys are mandatory. They are good design habits but by no means mandatory. However, life is much more complex than a Normalized DB structure. This includes tables serving as event logs; tables, serving as User maintained materialized query tables, tables, serving as supporting structures, reflecting state of complex transactional databases; persistent tables serving as Result Set or Session keepers. And I personally believe that if they were truly mandatory, Sybase, Oracle, SQL Server, Ingres, DB2, etc. would require them. Oh, sorry, forgot the SQL standard itself. This is not the relational model we're talking about. These are commercially available RDBMSs which, not surprisingly, DO tend to listen to their customers. If they didn't, they wouldn't be in business!! Since Sybrand is unlikely to get FKs required by the SQL standard or the major RDBMS vendors, it seems that mandatory means that his answer to the question "Do I have to use foreign keys?" is "You would if you worked in my shop!". I'm inclined to agree with that.”
“Databases can work with or without primary keys and foreign keys. The choice is yours... However ... enforcing referential integrity can be done by many methods ... TMTOWTDT = There is more than one way to do this ... It all depends on your approach... In the last ten years... every one is enforcing referential integrity with help of primary and foreign keys but before this ... a lot of applications were working without primary and foreign keys to enforce referential integrity and to avoid orphaned rows/avoid duplicate records.”
“We don't have every possible logical relationship enforced by the database. Sometimes you have to compromise for performance reasons, as too many foreign key validations can slow down high volume inserts. Other times you have to create breakpoints just to keep the web of relationships from becoming too tangled and connecting hundreds or thousands of tables.”
“I think it is preferable to have FK constraints as an additional security layer and they can be disable[d] during loading if required; however, you need to be 100% certain your ETL is enforcing the constraints. It is best to do both - have the ETL reject records which fail FK checks and report on these whilst also enforcing FK integrity on insert/update, if appropriate. The only additional thing I can add is - when you delete from a FK enabled DB, make sure you do it in the correct order.”
“I think, you have to learn about data structures and logical data design (not only database, which is nowadays interpreted mainly as only RDBMS), to be clear about usage primary, alternate, and foreign keys, normal forms, data integrity-and database integrity, because your database will work suboptimally without these knowledge if it will work at all.”
Data practitioners have high levels of tool knowledge, but lack a good grasp of fundamentals, for which reason they cannot be considered data professionals. Now, do not get me wrong: I do not mean that good knowledge of tools is unimportant -- if you work with them you gotta know them real well -- but the ability to fully assess them, use them optimally, and compensate for any shortcomings is limited in the absence of foundation knowledge. So let's have some before tackling the exchange.