More often than not, I find the database layer is typically a realm in applications that isn’t emphasized very much. There are certainly developers out there comfortable working with databases, but with the rise of database abstraction layers, query builders, and ORMs, it’s not uncommon to find that database fundamentals are abstracted away, deferred as a lower priority, or pushed aside permanently. One such fundamental that I find commonly ignored all together is referential integrity.
Referential integrity can be considered a practice, implemented through foreign key constraints, that enforces valid data relations in a relational database. Most databases include support for it in some way or another (MySQL’s MyISAM storage engine doesn’t support RI), so the practice and benefits are shared among most relational databases.
Before I show a simple example of how to enforce RI, I’ll provide a few reasons why it’s in your best interest to enforce RI in your applications:
- Faster and more cost efficient development. RI enables you to catch more database related bugs sooner, which not only translates into faster debugging, but also provides the benefit of not having any post bug-fix data refactoring due to accumulated fallout from bugs that may otherwise go unnoticed.
- Higher quality apps – more bugs are caught and fixed sooner, which translates into fewer bugs that reach your users.
- Smarter databases – I’ve always rejected the notion that databases are just a storage mechanic for your application. Databases should be crafted smartly, in a way such that it can look after itself; RI enables that through it’s validation-like behavior. You provide validation for input into your application, why not provide validation for data going into your database?
- Better performance (in some cases). Some databases, SQL Server specifically, have optimizers that use foreign keys to improve read performance when executing queries that touch foreign keyed tables.
So, with those being established, here’s a brief example showing how to enforce RI; a simple user and user type model will be used. These code snippets were executed against a MySQL (5.x) database, if you’re interested in following along.
First, create a user_types tables:
(
user_type_id tinyInt UNSIGNED NOT NULL AUTO_INCREMENT,
name varchar(20) NOT NULL,
constraint PK_user_types PRIMARY KEY (user_type_id)
) engine = InnoDB;
… then populate it with data:
SELECT ‘Admin’ UNION ALL
SELECT ‘Agent’ UNION ALL
SELECT ‘Client’;
Next, create a users table:
(
user_id int UNSIGNED NOT NULL AUTO_INCREMENT,
user_type_id tinyInt UNSIGNED NOT NULL,
firstname varchar(25) NOT NULL,
lastname varchar(25) NOT NULL,
constraint PK_users PRIMARY KEY (user_id),
constraint FK_users_user_type_id FOREIGN KEY (user_type_id) REFERENCES user_types(user_type_id)
) engine = InnoDB;
Before I continue, here’s a quick note on a couple of things from above:
- engine = InnoDB – this is key when implementing referential integrity in MySQL, as the (default) MyISAM storage engine does not support RI.
- constraint FK_users_user_type_id foreign key (user_type_id) references user_types(user_type_id) – this is how we programmatically create our foreign key relationship between users.user_type_id and user_types.user_type_id. The naming convention “FK_users_user_type_id” is optional, I use it to establish that it’s a foreign key constraint on the users.user_type_id column, and that it should reference the user_types.user_type_id column.
With those tables and dummy data established, we can now run a couple of test cases. Insert this row first:
Now try to insert this row:
You should get an error that states something along the lines of “Cannot add or update a child row: a foreign key constraint fails…”. What’s happened is that the database didn’t execute the insert because of invalid data. The problem is that the 2nd insert tried to insert a user_type_id value of 0 in our previous insert statement, but since 0 isn’t a value found in user_types.user_type_id, the database threw an error. In this case the error was intentional, but such errors can easily happen in real world applications and not become obvious until weeks or months worth of data damage have already happened in your application.

