Tracking Changes in SQL Server 2012

Two features let you easily track table changes

hand with pen tracking red and green lines on graph

Over the years, I created several custom solutions to help identify Data Manipulation Language (DML) changes and track historical changes. While effective, these solutions were sometimes cumbersome to use and maintain. SQL Server 2012 eliminates the need for such custom solutions, making life easier for everyone. In SQL Server 2012, there are two features that can help you track changes the painless way: change data capture and change tracking.

For the most part, the change data capture and change tracking features are interchangeable, with one exception. Both features utilize a synchronous tracking mechanism, which means SQL Server is minimally impacted. Table 1 shows what can be tracked with the two features. As you can see, the main difference between them is historical data tracking. Only the change data capture feature lets you track the actual data that was changed.

Item Tracked

Table 1: What the Features Can Track

DML changes

Historical data

Column changes

DML type

The Logistics

Before you can use the change data capture and change tracking features, you need to do the following.

Execute sys.sp_cdc_enable_db. Although the tracking will be taking place at the table level, you need to make the database aware of this capability. You can do so by executing the sys.sp_cdc_enable_db system stored procedure, as shown below:

EXEC sys.sp_cdc_enable_db;

Execute sys.sp_cdc_enable_table. You need to make the table you want to track aware of the tracking by executing the sys.sp_cdc_enable_table system stored procedure. Here's its syntax:

EXEC sys.sp_cdc_enable_table  [ @source_schema = ] 'source_schema',  [ @source_name = ] 'source_name',  [ @role_name = ] 'role_name'    -- Specify the gating role if gating is desired    -- or set to NULL if you don't want gating.  [,[ @capture_instance = ] 'capture_instance'    -- Name instance-specific changes.  [,[ @supports_net_changes = ] supports_net_changes ]  [,[ @index_name = ] 'index_name' | NULL ]  [,[ @captured_column_list = ] 'captured_column_list' |    NULL ]    -- Use NULL if you want all columns captured.  [,[ @filegroup_name = ] 'filegroup_name' ]  [,[ @allow_partition_switch = ] 'allow_partition_switch' ]

As you can see, you need to provide information such as the source table's schema and name, who has permission to access the change data, and the gating role member if gating is desired. For more information about the syntax, see "sys.sp_cdc_enable_table (Transact-SQL)" in SQL Server 2012 Books Online (BOL).

After tracking is enabled on the table and database, information about any changes made to that table is written to system tables in that database. The tables will begin with the schema name cdc. In addition, the dbo.systranschemas table is automatically created to track replication schema changes.

Let's Get Started

To introduce you to the types of information that you can capture, I wrote a script that:

  • Creates a database named Learn_CDC.

  • Creates and populates a table named MyCDCPlay.

  • Enables the change data capture feature on the Learn_CDC database.

  • Enables the change data capture feature on the MyCDCPlay table.

  • Adds data to the MyCDCPlay table, which will be tracked by the change data capture feature.

Listing 1 contains this script, which you can download by clicking the Download the Code button near the top of the page.

CREATE DATABASE Learn_CDCGOUSE Learn_CDCGOCREATE TABLE MyCDCPlay (ID INT IDENTITY(1,1) PRIMARY KEY,  cdccol1 VARCHAR(10) NOT NULL,  cdccol2 CHAR(3) SPARSE NULL,  cdccol3  VARCHAR(20) SPARSE NULL);GOINSERT INTO MyCDCPlay (cdccol1,cdccol2, cdccol3)VALUES ('CDC1Play', 'YES', 'Some cool stuff')GOINSERT INTO MyCDCPlay (cdccol1)VALUES ('CDC2Play')GO-- Make sure you have data.SELECT * FROM MyCDCPlay;-- Enable CDC on database.USE Learn_CDCGOEXECUTE sys.sp_cdc_enable_db;GO-- Enable CDC on table. Make sure that-- SQL Server Agent is running.USE Learn_CDC;GOEXECUTE sys.sp_cdc_enable_table  @source_schema = N'dbo'  , @source_name = N'MyCDCPlay'  , @role_name = NULLGO-- Insert more data, which will be tracked.INSERT INTO MyCDCPlay (cdccol1,cdccol2, cdccol3)VALUES ('CDC4Play', 'No', 'Some cool stuff')GOINSERT INTO MyCDCPlay (cdccol1)VALUES ('CDC5Play')GO-- Make sure you have data.SELECT * FROM MyCDCPlay;

After you run the code in Listing 1, you'll see a folder structure similar to that shown in Figure 1.

Figure 1: Reviewing the System Tables Created by the Change Data Capture Feature

How to Access the Information About the Tracked Changes

As I mentioned previously, the script added data to the MyCDCPlay table after the change data capture feature was enabled. To see the information that was captured by the change data capture feature, you can query the system tables created by the feature. Note that SQL Server 2012 BOL recommends that you don't query system tables directly. Table 2 provides the stored procedures and functions that SQL Server 2012 BOL recommends you use to query these system tables.

System Table

Table 2: Recommended Ways to Query the cdc System Tables

cdc.captured_columns

cdc.change_tables

cdc._CT

cdc.ddl_history

cdc.index_columns

cdc.lsn_time_mapping

To start, you can query the cdc.captured_columns table to see the information about the columns that were tracked. Figure 2 shows the results of this query.

Figure 2: Learning About the Columns That Were Tracked

Querying the cdc.change_tables table gives you information about the table. Figure 3 shows the results of this query. Note that the capture_instance parameter wasn't specified when the table was enabled with sys.sp_cdc_enable_table, so the table uses the default schema and the default table name.

Figure 3: Learning About the Table That Was Tracked

Querying the cdc.dbo_MyCDCPlay_CT table gives you information about the data changes. As Figure 4 shows, this is the information that was specified during the two INSERT operations.

Figure 4: Learning About the Data That Was Changed

At this point, let's update the data in the MyCDCPlay table by running the command:

UPDATE MyCDCPlay SET cdccol2 = 'be'  WHERE cdccol1 = 'CDC4Play'

Figure 5 shows the results. Notice that the results include both the historical data and new data. You can collate them to know exactly what changed.

Figure 5: Updating the Data in the Table

Querying the cdc.ddl_history table gives you information about any schema changes. However, in this case, the script didn't make any schema changes, so you wouldn't get any results. Let's make some schema changes to see what happens:

ALTER TABLE MyCDCPlay ADD cdccol5 varchar(30) Sparse NULLALTER TABLE MyCDCPlay ALTER COLUMN cdccol1 VARCHAR(50)  NOT NULL

If you now query the cdc.ddl_history table, you'll get results like that in Figure 6. As you can see, the schema changes are captured, including when the changes occurred.

Figure 6: Learning About the Schema Changes

The last three tables you can query are:

  • cdc.index_columns. Querying the cdc.index_columns table provides information about the indexed columns associated with the table that was changed.

  • cdc.lsn_time_mapping. Querying the cdc.lsn_time_mapping table provides information about the time mapping for the changes.

  • dbo.systranschemas. Querying the dbo.systranschemas table shows you the schema changes and the type of schema that was changed.

The Downside

There are a few shortcomings with the change data capture and change tracking features. Here are some shortcomings that I've come across:

  • The features provide a table-by-table solution. If you have an environment with hundreds tables, it would be cumbersome to implement. This usually pushes people to use third-party tools.

  • Information about the user who made the change isn't captured. Microsoft recommends the use of SQL Server's audit features for this capability.

  • Synchronous operations contribute to slow data commits.

  • Changes to computed columns aren't tracked. In addition, there are limitations when tracking changes in columns containing XML, Sparse, Timestamp, and BLOB data types. For more information about these limitations, see "Track Data Changes (SQL Server)" in SQL Server 2012 BOL.

Document Changes and More

The change data capture and change tracking features let you easily track changes to your tables. You can use these features to not only document changes but also create back-end systems for applications. For example, you could use the features as part of a parcel tracking system that keeps track of where parcels are located at any given point during the shipping process. You could even include a mechanism to evaluate the system and determine where efficiency can be improved.

*************************************************************************************

Leonard MwangiLeonard Mwangi is a design architect with Alexander Open Systems, concentrating mostly on information worker solutions and applications integration. Leonard has more than 12 years of experience in the IT industry, primarily focusing on SQL Server and SharePoint.
Twitter: @lmundia
LinkedIn: http://www.linkedin.com/in/lmundia
Blog: http://www.geekswithblogs.net/leonardm

Sign up for the ITPro Today newsletter
Stay on top of the IT universe with commentary, news analysis, how-to's, and tips delivered to your inbox daily.

You May Also Like