SQL Server Change Data Capture (CDC)

In data warehousing, capturing changes in source data system is  one of the fundamental requirements, and something we always come across. In most scenarios, this is accomplished using a flag column or a date-time stamp column, which indicates the last changed time.However, due to some designing problems , some data sources might not contain straight forward ways ( a flag or an indicator) to identify which records and which columns have changed during certain period. Since you can not change existing source system to any new columns , you might end up in scenarios where you have to perform full loading rather than performing incremental loading, which would take more time and more resources. However, if your source system is based on MS SQL Server, and version is not later than 2008, you can use the features called CDC (Change Data Capture) or CT ( Change Tracking) to identify changed data or changed records.

In this post, I will only discuss about CDC, and hopefully, I will discuss about CT in another post in near future. According to Microsoft, SQL Server read transaction log and keep track of Insert, Update and Delete operations performed against a table if CDC is enabled for that table by a user. This is how CDC works as per the MSDN.

Picture1

How to enable CDC?  To enable CDC for a database , you have to execute “sys.sp_cdc_enable_db” stored procedure against that database. For example , if I want to enable CDC for “CDCDemo” database , I would execute following script.

USE CDCDemo
GO

–Enable CDC for database

EXEC sys.sp_cdc_enable_db;

Next step is to enable CDC for a particular table. In my case, I am going to enable CDC for dbo.SalesOrderDetail table and I am going to name my CDC instance as “SalesOrderDetailCaptureInstance” . Here is the script I  ran.

EXEC sys.sp_cdc_enable_table
@source_schema = N’dbo’,
@source_name   = N’SalesOrderDetail’,
@capture_instance =N’SalesOrderDetailCaptureInstance’,
@supports_net_changes = 1,
@role_name     = NULL
GO

Once this statement is executed, SQL server will automatically create tables to store all the changes taking place to data inside dbo.SalesOrderDetail table.

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s