Avoid Referential Integrity Errors When Deleting Records from Databases
Let this script or stored procedure do the work for you
May 17, 2009
UPDATE: Find the updated version of this article with new information on the PR_HIERARCHIAL_DATA procedure: "Revisiting How to Avoid Referential Integrity Errors"
At the company in which I work, we sometimes need to release a development database to production that contains only the schema and no data. One of the most common practices that most of the DBAs follow is to script the schema and release it to production. However, sometimes we receive requests that the build should be as a SQL Server backup because they want to retain the diagrams in production and they want to reduce any problems that might come while executing the scripts.The fastest and easiest way I found to do this is to back up the development database, create a new database from the backup, then use a script to delete all the records from the newly created database, leaving only the schema. We then take the backup and release to production.
When you delete a database's records, you need to be careful about referential integrity errors. These errors will occur if you delete the records in tables with foreign keys before deleting the records in the referenced tables. To avoid this problem, I created a script, DELETE_RECORDS.sql, that disables all the foreign keys in the database, deletes all the records, then reenables all the foreign keys. You can use this generic script to delete all the records from any database. It uses the ALTER TABLE statement to disable and reenable the foreign keys, so ALTER TABLE permission for the database is needed to use the script. Without this permission, you'll get a foreign key violation error.
At our company, some database owners (DBOs) occasionally need to delete all the data from SQL Server Integration Services (SSIS) packages and refresh them with new data. A fellow DBA recently asked me whether there's a way to delete all the records in a database without disabling the foreign keys because giving ALTER TABLE permissions to DBO user accounts isn't advisable. So, I created a stored procedure, PR_HIERARCHIAL_DATA, that deletes all the records in a database without disabling the foreign keys.
To avoid referential integrity errors, PR_HIERARCHIAL_DATA determines the hierarchical order of all the tables in a database, then deletes the data from those tables in reverse hierarchical order. Identifying the order is bit difficult because many types of relationships can exist in a database, including one-to-one, one-to-many, and many-to-many relationships.
To identify the hierarchical order, the stored procedure uses information in the sysseferences systems table and sys.objects system view. In sysseferences, it uses the fkeyid column, which contains the IDs of the referencing tables (i.e., the tables that contain foreign keys), and the rkeyid column, which contains the IDs of the referenced tables. In sys.objects, it uses the object_id column, which contains object IDs.
PR_HIERARCHIAL_DATA uses a recursive common table expression (CTE) query to join the sysreferences table multiple times based on a join between the rkeyid and fkeyid columns. I prefer using CTEs because they offer a more readable form of the derived table that can be declared once and referenced multiple times in a query. This CTE query results in a hierarchical list that contains the tables with foreign keys and the tables those foreign keys are referencing. However, the list doesn't include tables that don’t have any relationships (i.e., independent entities). Another CTE query provides that information. The results from both CTE queries are put into a temporary table. Duplicate table names (which result when a table references to two or more tables) are removed with the MAX function.
The stored procedure then uses the information in the temporary table to delete the data in the tables in an order that avoids referential integrity errors. If PR_HIERARCHIAL_DATA finds conflicted tables (i.e., tables that reference each other), it will print the names of those tables. You will then need to manually delete the data from those tables.
You can use the PR_HIERARCHIAL_DATA stored procedure three ways. To simply display a hierarchical list of the tables with foreign keys and the tables those foreign keys are referencing, you execute the stored procedure without passing in any parameter values, as in
EXEC PR_HIERARCHIAL_DATA
If you want to output a list of the tables in hierarchical sequence so you can design a SSIS package or write an insert script that will load data into the tables, you use the INSERT parameter when you call the stored procedure, as in
EXEC PR_HIERARCHIAL_DATA 'INSERT'
If you want to delete the data from the database tables, you use the DELETE parameter, as in
EXEC PR_HIERARCHIAL_DATA 'DELETE'
You can download the PR_HIERARCHIAL_DATA stored procedure and DELETE_RECORDS.sql script by clicking the Download the Code icon at the top of the page. The stored procedure and script were written for SQL Server 2005. You don't need to customize any code before running them.
About the Author
You May Also Like