Ensure Data Integrity with Cascading DRI
Automate deletes and updates in related tables with minimal effort
May 31, 2002
In SQL Server 2000, Microsoft delivered the long-awaited cascading declarative referential integrity (DRI) support for T-SQL. Cascading DRI lets you cascade deletions or modifications in a table to related rows in another table. For example, when you delete a customer from your Customers table, you might want SQL Server to automatically delete all related orders in your Orders table. To accomplish this goal without cascading DRI, you have to write significant amounts of code. Let's review how the foreign key constraint enforces referential integrity and learn how to reap the benefits of cascading DRI.
Referential Integrity and the Foreign Key
Referential integrity is the enforcement of rules that govern relationships between tables to keep data in your database as consistent as possible. Several mechanisms let you implement referential integrity; these mechanisms are either procedural or declarative. Procedural techniques use procedural code to enforce referential integrity; for example, you can enforce referential integrity by using stored procedures or triggers. Declarative techniques let you implement referential integrity as part of the schema of the tables involved in the relationship (i.e., as part of the table definition).
One of the main DRI enforcement mechanisms that the ANSI SQL-92 standard defines and that SQL Server partially implements is the foreign key constraint. You establish a foreign key constraint between two entities: a primary table (the referenced table) and a secondary table (the referencing table). For example, in the relationship between the Orders and OrderDetails tables that Figure 1 shows, each order in the Orders table has one or more order parts in the OrderDetails table. The Orders table is the primary, or referenced, table, and the OrderDetails table is the secondary, or referencing, table. (Some people might think of the referenced-referencing relationship as a master-detail or parent-child relationship, but I avoid such terminology because it implies a hierarchy.) You place the foreign key constraint on a column or combination of columns in the secondary table—in this case, the orderid column in OrderDetails—and have it point to (or reference) the primary key in the primary table—in this case the orderid column in Orders. You can run the script that Listing 1 shows to create the Orders and OrderDetails tables (without the foreign key) and populate them with some sample data. Later in this article, I use these tables in some examples. Make sure you're not running the script in the Northwind database, which also has an Orders table.
The two columns or combination of columns that you're building the foreign key relationship on can be in the same table. For example, in the Employees table that Figure 2 shows, each employee has a manager who is also an employee. The foreign key constraint is on the mgrid column and points to the empid column. Run the script that Listing 2 shows to create and populate the Employees table (again, without the foreign key).
Essentially, a foreign key constraint verifies that every foreign key value has a related primary key value or is NULL (assuming the column allows NULLs). In the Orders and OrderDetails relationship that Figure 1 shows, every order part in the OrderDetails table must have a related order in the Orders table. Because the orderid column in OrderDetails doesn't allow NULLs, the fact that a foreign key constraint generally allows NULLs in the referencing column is irrelevant. So in the Employees table, each value in the mgrid column must be a valid employee ID or NULL if the employee has no manager (e.g., the employee is the big boss). Foreign keys define relationships that determine what referential actions SQL Server takes. (For more details about foreign keys, see the sidebar "Tips for Using Foreign Keys.")
Referential Actions
Referential actions are actions that SQL Server should perform as a result of a deletion or modification of a row in the referenced table. ANSI defines four referential actions that you can implement by using a foreign key: NO ACTION, CASCADE, SET DEFAULT, and SET NULL. NO ACTION means that SQL Server will reject any type of modification that you issue against a table if the modification breaks the rules that the foreign key enforces. In the Orders and OrderDetails relationship, for example, SQL Server rejects the following modifications:
deleting a row in Orders that has related rows in OrderDetails
updating the orderid of a row in Orders that has related rows in OrderDetails
adding a row to OrderDetails that has an orderid that doesn't exist in Orders
updating the orderid of a row in OrderDetails to an orderid that doesn't exist in Orders
SQL Server 2000 and 7.0 support the NO ACTION option of the foreign key. In fact, NO ACTION is the only type of referential action that a foreign key in SQL Server 7.0 supports. The following code implements a foreign key with NO ACTION in the Orders and OrderDetails relationship in SQL Server 2000 and 7.0:
ALTER TABLE OrderDetails ADD CONSTRAINT FK_OrderDetails_Orders FOREIGN KEY(orderid) REFERENCES Orders(orderid)
NO ACTION is the default referential action in SQL Server 2000, but you can use the code that Listing 3 shows to explicitly specify that you want NO ACTION support for both deletes and updates issued against the primary table. Specifying NO ACTION makes your code clearer and, thus, easier to maintain. After creating the foreign key with or without specifying NO ACTION, test it by issuing the following four illegal modifications and verifying that none succeeds (you'll get a foreign key violation error):
Try to delete an order that has order parts:
DELETE FROM OrdersWHERE orderid = 10001
Try to update an order that has order parts:
UPDATE Orders SET orderid = 10004WHERE orderid = 10002
Try to insert an order part that has no related order:
INSERT INTO OrderDetails VALUES(10005, 11, 12)
Try to update an order part with an orderid that doesn't exist:
UPDATE Orders SET orderid = 10006WHERE orderid = 10003 AND partid = 41
Instead of rejecting primary-table modifications that break the foreign key rules, the CASCADE referential action compensates for them by also modifying the child rows in the secondary table. CASCADE is appropriate when you need to clean up or synchronize child rows after the parent row is removed or modified. For example, when a customer is deleted, you also need to delete customer orders. NO ACTION is appropriate when you want to reject such an attempt. For example, you wouldn't want to delete records of existing book loans when a library's member wants to revoke membership.
You can define ON DELETE CASCADE to cascade a deletion of rows in the primary table to the related rows in the secondary table, and you can define ON UPDATE CASCADE to cascade an update of primary key values to the related foreign key values. In the Orders and OrderDetails relationship, implementing ON DELETE CASCADE and ON UPDATE CASCADE means that when you delete a row in Orders, the related rows in OrderDetails are also deleted. When you update the orderid column in Orders, the orderid values of the related rows in OrderDetails are also updated. However, you still won't be able to insert a row to the secondary table if a related row doesn't exist in the primary table. In other words, CASCADE doesn't compensate for modifications in the secondary table that would render orphaned rows. When you use CASCADE, SQL Server rejects such modifications the same way it rejects them when you use NO ACTION. This means that you can't add a row to OrderDetails with an orderid that doesn't exist in Orders, and you can't update the orderid of a row in OrderDetails to an orderid that doesn't exist in Orders.
The code in Listing 4 implements a foreign key with ON DELETE CASCADE and ON UPDATE CASCADE in the Orders and OrderDetails relationship. Note that if you've already added a foreign key that specifies NO ACTION for the OrderDetails table, you need to use the following statement to drop it before you run the code that Listing 4 shows:
ALTER TABLE OrderDetails DROP CONSTRAINT FK_OrderDetails_Orders
After creating the foreign key with CASCADE support, test it by issuing the same four modifications you used to test the foreign key. After deleting order 10001 from the Orders table, make sure that no rows in OrderDetails have orderid 10001. After changing orderid 10002 to 10004 in Orders, make sure that all rows in OrderDetails that had orderid 10002 now have orderid 10004. Then, make sure both modifications to OrderDetails are rejected and that you get a foreign key violation error for each.
I won't discuss the SET DEFAULT and SET NULL referential actions in detail because SQL Server 2000 doesn't support them through a foreign key constraint and their implementations through other SQL Server mechanisms are outside the scope of this article. But be aware that SET DEFAULT and SET NULL compensate for a modification against the primary table that would otherwise result in orphaned rows by setting the foreign key column values. SET DEFAULT sets the foreign key columns to their default values, and SET NULL sets the values to NULL.
What's Next?
SQL Server doesn't support cascading DRI in releases earlier than SQL Server 2000. To implement cascade actions, programmers and DBAs have to implement other enforcement mechanisms such as triggers or stored procedures, which are more complex to implement and harder to maintain. The advent of cascading DRI in SQL Server 2000 makes the life of SQL Server DBAs and programmers much easier, but the feature has some limitations. In a follow-up article, I'll discuss scenarios in which you need to use stored procedures or triggers to enforce referential integrity.
About the Author
You May Also Like