Handling a DDL Change for SQL Server CDC Enabled Tables
How can we ensure that any new columns added to a table where Change Data Capture (CDC) is enabled will be synchronized to its target?
Any SQL Server database where CDC is turned on and your SQL Server source connector uses CDC as the incremental update configuration.
Unfortunately, SQL Server does not automatically track certain DDL changes in the CDC log after CDC was enabled on the table. This includes adding new columns to a table after CDC was enabled for the table.
DLH.io aligns with the integrity of each database system, and will not pull a new column into the synchronization process if it does not belong to the tracked table. This is very important so that your synchronization targets are in line with your database policies.
In order to update the CDC configuration to include new DDL changes (ex: new columns typically done through a DDL command such as, ALTER TABLE <table_name> ADD COLUMN <column_name><data_type>), in the CDC log, you must create a new CDC instance to replace the existing CDC instance on the table.
Any one table can have a max of two CDC instances assigned to the table. It is best practice to only have one instance. So, basically you will enable a new instance and disable the old one. Like most database systems SQL Server itself will almost always loose any previous pointer the CDC logic is using to track delta records used in the synchronization process. Therefore a re-sync of the table will be required for consistency.
Here are the conceptual steps you and your team will take when new columns and other DDL changes in a existing CDC tracked table are made:
- Conduct the DDL operation on the table in question (ex: add column)
- Now confirm that the column DDL operation was successful by viewing you table, you can use the sp_help command for this or your IDE:
- exec sp_help 'schema.table_name'
- Now assert using the CDC reference query here that the CDC capture column log is ignoring your newly added column or DDL change, since you will not see it in the list because you ran a DDL operation on an already CDC enabled table:
- SELECT * FROM cdc.captured_columns order by column_name
- Verify some information regarding the capture instance of the table, by running the following command, but specifically look for the following fields in the returned result set. capture_instance and start_lsn:
- EXEC sys.sp_cdc_help_change_data_capture @source_schema = '<schema_name>', @source_name = '<table_name>';
- Replace the placeholders with the name of your actual schema and table (source_name attribute), for example, EXEC sys.sp_cdc_help_change_data_capture @source_schema = 'Person', @source_name = 'Address';
- Again the above just gives you details that will help you understand the CDC table instance
- You can also view what current events are in the CDC tracking table for the CDC enabled table by running a query such as:
- SELECT * FROM cdc<schema>_<table>_CT if using the basic capture instance naming, for example,
- SELECT * FROM cdc.Person_Address_CT
- Disable the existing SQL Server CDC instance for the table being modified by executing the following command:
- EXECUTE sys.sp_cdc_disable_table @source_schema = N'<schema_name>', @source_name = N'<table_name>', @capture_instance = N'all'; GO
- Replace the placeholders with the name of your actual schema and table (source_name attribute), for example, EXECUTE sys.sp_cdc_disable_table @source_schema = N'person', @source_name = N'address', @capture_instance = N'all'; GO Once the command completes you can run the command from above, sys.sp_cdc_help_change_data_capture and notice that the result set is not available as this tracking logic has been removed.
- Create a new SQL Server CDC instance for the table that was modified by executing the following command:
- EXEC sys.sp_cdc_enable_table @source_schema = N'<schema_name>', @source_name = N'<table_name>' GO
- Replace the placeholders with the name of your actual schema and table (source_name attribute), for example, EXEC sys.sp_cdc_enable_table @source_schema = N'Person', @source_name = N'Address' GO Be sure here that the command completes. Also, please refer to your original CDC logic as there may have been other parameters such as @supports_net_changes that you wanted to incorporate. Without those parameters you will receive the defaults which work for 99% of all customers we believe.
- Run the command from the second step above to now verify that there is a result set for the new tracking capture_instance.
- In DLH.io run a Historical Re-Sync on this table to complete the operation.
Your SQL Server database administrator (DBA) is best suited for the job.
Use your support credits with us by contacting your engagement manager or using the support portal.
Incorporate the instruction in the answer above anytime you seem to be missing a column in the target where your DBA has confirmed that the table in question is using the CDC (not Change Tracking) method. DLH.io will pass alerts if the final synchronization to the target destination table fails because of this or the retrieval of data cannot be accomplished because of this Microsoft SQL Server feature of their CDC configuration.