SQL server 2016 开启CDC功能 捕获变更数据

版权声明:本文为博主原创文章,转载请注明出处 https://blog.csdn.net/vkingnew/article/details/89508885

CDC适用的环境:

1.SQL server 2008版本以上的企业版、开发版和评估版中可用;

2.需要开启代理服务(作业)。

3.CDC需要业务库之外的额外的磁盘空间。

4.CDC的表需要主键或者唯一主键。

SQL server的CDC的流程:

CDC的功能介绍:

开启cdc的源表在插入INSERT、更新UPDATE和删除DELETE活动时会插入数据到日志表中。cdc通过捕获进程将变更数据捕获到变更表中,通过cdc提供的查询函数,我们可以捕获这部分数据。 

CDC的表不能truncate操作,truncate是物理删除数据不能捕获变更的数据。

Cannot truncate table 'test' because it is published for replication or enabled for Change Data Capture.

开启CDC的步骤:

0.确保开启SQL server agent服务:

若不开启的话会报错:
SQLServerAgent is not currently running so it cannot be notified of this action.
--解决办法:
sp_configure 'show advanced options', 1;   
GO   
RECONFIGURE;   
GO   
sp_configure 'Agent XPs', 1;   
GO   
RECONFIGURE   
GO 

 也可以通过图形界面操作:

开始--> [win + R] --> services.msc -- 启动SQL server 代理服务。

1.开启数据库级别的CDC功能:

if exists(select 1 from sys.databases where name='ERP' and is_cdc_enabled=0)
begin
    exec sys.sp_cdc_enable_db
end

或者
USE ERP
GO  
-- 开启:
EXEC sys.sp_cdc_enable_db  
-- 关闭:
EXEC sys.sp_cdc_disable_db
GO  

查询验证数据是否开启CDC功能:

select is_cdc_enabled from sys.databases where name='ERP';

若返回的值是0 表示CDC是禁用的,1表示CDC是开启的。

2.添加CDC专用的文件组和文件:

数据库ERP--右键  “属性” >> “文件组”>> ”添加文件组”

上述步骤也可以通过SQL命令操作:
-- 查询某个库的物理文件:
SELECT name, physical_name FROM sys.master_files WHERE database_id = DB_ID('ERP');

name	physical_name
ERP	    C:\Program Files\Microsoft SQL Server\MSSQL13.DEV\MSSQL\DATA\ERP.mdf
ERP_log	C:\Program Files\Microsoft SQL Server\MSSQL13.DEV\MSSQL\DATA\ERP_log.ldf
ERP_CDC	C:\Program Files\Microsoft SQL Server\MSSQL13.DEV\MSSQL\DATA\ERP_CDC.ndf


1.添加文件组:
ALTER DATABASE ERP ADD FILEGROUP CDC1;

或者
(FILESTREAM 功能被禁用,需要开启)
ALTER DATABASE ERP ADD FILEGROUP CDC CONTAINS FILESTREAM;

开启FILESTREAM 功能:
EXEC sp_configure filestream_access_level, 2  
RECONFIGURE  


2.将新增文件,并映射到文件组:
ALTER DATABASE ERP
ADD FILE
(
  NAME= 'ERP_CDC1',
  FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL13.DEV\MSSQL\DATA\ERP_CDC1.ndf'
)
TO FILEGROUP CDC1;

3.查询验证:
SELECT name, physical_name FROM sys.master_files WHERE database_id = DB_ID('ERP');

name	physical_name
ERP	    C:\Program Files\Microsoft SQL Server\MSSQL13.DEV\MSSQL\DATA\ERP.mdf
ERP_log	C:\Program Files\Microsoft SQL Server\MSSQL13.DEV\MSSQL\DATA\ERP_log.ldf
ERP_CDC	C:\Program Files\Microsoft SQL Server\MSSQL13.DEV\MSSQL\DATA\ERP_CDC.ndf
ERP_CDC1	C:\Program Files\Microsoft SQL Server\MSSQL13.DEV\MSSQL\DATA\ERP_CDC1.ndf

3.开启表级别的CDC:

--模板:

EXEC sys.sp_cdc_enable_table
        @source_schema = 'dbo', -- source_schema
        @source_name = 'table_name', -- table_name
        @capture_instance = NULL, -- capture_instance
        @supports_net_changes = 1, -- supports_net_changes
        @role_name = NULL, -- role_name
        @index_name = NULL, -- index_name
        @captured_column_list = NULL, -- captured_column_list
        @filegroup_name = 'CDC' -- filegroup_name;


或者:

IF EXISTS(SELECT 1 FROM sys.tables WHERE name='table_name' AND is_tracked_by_cdc = 0)
BEGIN
    EXEC sys.sp_cdc_enable_table
        @source_schema = 'dbo', -- source_schema
        @source_name = 'table_name', -- table_name
        @capture_instance = NULL, -- capture_instance
        @supports_net_changes = 1, -- supports_net_changes
        @role_name = NULL, -- role_name
        @index_name = NULL, -- index_name
        @captured_column_list = NULL, -- captured_column_list
        @filegroup_name = 'CDC' -- filegroup_name
END

在实际生产中有多个表进行设置可以考虑使用游标批量设置。

示例:

EXEC sys.sp_cdc_enable_table  @source_schema = 'dbo',@source_name = 'test',@capture_instance = NULL, @supports_net_changes = 1,@role_name = NULL, @index_name = NULL,@captured_column_list = NULL, @filegroup_name = 'CDC';

查询开启了哪些表被开启了CDC功能:

select name,type,create_date,modify_date,is_tracked_by_cdc from sys.tables where is_tracked_by_cdc = 1;

4.创建测试表:

 create table test(id varchar(36) not null primary key,city_name varchar(20),userid bigint,useramount decimal(18,6),ismaster bit,createtime datetime default getdate());

 EXEC sys.sp_cdc_enable_table  @source_schema = 'dbo',@source_name = 'test',@capture_instance = NULL, @supports_net_changes = 1,@role_name = NULL, @index_name = NULL,@captured_column_list = NULL, @filegroup_name = 'CDC';

此时会自动生成定时任务:

--显示原有配置:

EXEC sp_cdc_help_jobs
GO

--更改数据保留时间为分钟

EXECUTE sys.sp_cdc_change_job
    @job_type = N'cleanup',
    @retention=100
GO

--停用作业

EXEC sys.sp_cdc_stop_jobN'cleanup'
GO

--启用作业

EXEC sys.sp_cdc_start_jobN'cleanup'
GO

--再次查看
EXEC sp_cdc_help_jobs
GO

--停用作业

EXEC sys.sp_cdc_stop_jobN'cleanup'

GO

--启用作业

EXEC sys.sp_cdc_start_jobN'cleanup'

GO

EXEC sys.sp_cdc_drop_job@job_type = N'cleanup' -- nvarchar(20)

GO

--查看作业

EXEC sys.sp_cdc_help_jobs

GO 

对要捕获的表进行DML操作和DDL操作测试:

 -- INSERT:
insert into test(id,city_name,userid,useramount,ismaster)values('1','wuhan',     10,1000.25,1);
insert into test(id,city_name,userid,useramount,ismaster)values('1A','xiangyang',11,11000.35,0);
insert into test(id,city_name,userid,useramount,ismaster)values('1B','yichang',  12,12000.45,0);


--查询:
select *  from erp.dbo.test;
id	city_name	userid	useramount	ismaster	createtime
1	wuhan	10	1000.250000	1	2019-04-25 09:59:22.250
1A	xiangyang	11	11000.350000	0	2019-04-25 09:59:22.250
1B	yichang	12	12000.450000	0	2019-04-25 09:59:22.250


--查询捕获的数据:


 

 说明:
cdc.<capture_instance>_CT   可以看到,这样命名的表,是用于记录源表更改的表。
对于insert/delete操作,会有对应的一行记录,而对于update,会有两行记录。

对于__$operation列:1 = 删除、2= 插入、3= 更新(旧值)、4= 更新(新值)

对于__$start_lsn列:由于更改是来源与数据库的事务日志,所以这里会保存其事务日志的开始序列号(LSN)

 做update和delete操作:

 DDL操作:DDL操作需要重新收集表的信息.

新增字段:

alter  table test add   product_count decimal(18,2);
insert into test(id,city_name,userid,useramount,ismaster,product_count)values('2','wuhan',     20,2000.25,1,2.5);
select * from erp.cdc.dbo_test_ct c where c.id='2';

可以看到新增的字段数据没有捕获到:
__$start_lsn	__$end_lsn	__$seqval	__$operation	__$update_mask	id	city_name	userid	useramount	ismaster	createtime
0x00000023000004890003	NULL	0x00000023000004890002	2	0x3F	2	wuhan	20	2000.250000	1	2019-04-25 10:12:24.890


需要重新收集表的定义信息:

注意:
源表 'dbo.test' 已存在两个捕获实例。一个表最多只能有两个捕获实例。如果当前跟踪选项不合适,请使用 sys.sp_cdc_disable_table 禁用过时实例的更改跟踪,然后重试操作。

--加字段之后重新CDC:
EXEC sys.sp_cdc_enable_table  @source_schema = 'dbo',@source_name = 'test',@capture_instance ='dbo_test_v2', 
@supports_net_changes = 1,@role_name = NULL, @index_name = NULL,@captured_column_list = NULL, @filegroup_name = 'CDC';

insert into test(id,city_name,userid,useramount,ismaster,product_count)values('2A','xiangyang',21,121000.35,0,12.5);
select * from erp.cdc.dbo_test_v2_ct c where c.id='2A';

可以看到查询到新增的字段:
__$start_lsn	__$end_lsn	__$seqval	__$operation	__$update_mask	id	city_name	userid	useramount	ismaster	createtime	product_count
0x0000002300000C530003	NULL	0x0000002300000C530002	2	0x7F	2A	xiangyang	21	121000.350000	0	2019-04-25 10:16:47.080	12.50

删除字段:

-- 删除字段:
alter table test  drop column city_name;


insert into test(id,userid,useramount,ismaster,product_count)values('3',31,300.25,1,33.5);
select * from erp.cdc.dbo_test_ct c where c.id='3';
select * from erp.cdc.dbo_test_v2_ct c where c.id='3';

查询结果为删除的字段赋值为NULL.


由于一个表只能有2个CT表,删除字段后需要重新收集表的信息:

EXEC sys.sp_cdc_disable_table @source_schema = 'dbo',@source_name = 'test', @capture_instance = 'dbo_test';
EXEC sys.sp_cdc_enable_table  @source_schema = 'dbo',@source_name = 'test',@capture_instance ='dbo_test', 
@supports_net_changes = 1,@role_name = NULL, @index_name = NULL,@captured_column_list = NULL, @filegroup_name = 'CDC';

insert into test(id,userid,useramount,ismaster,product_count)values('4',41,400.25,1,43.5);

select * from erp.cdc.dbo_test_CT


__$start_lsn	__$end_lsn	__$seqval	__$operation	__$update_mask	id	userid	useramount	ismaster	createtime	product_count
0x0000002400000DA40003	NULL	0x0000002400000DA40002	2	0x3F	4	41	400.250000	1	2019-04-25 10:23:45.123	43.50

可以看到新收集的表已经没有字段CITY_NAME.

而表以前则则记录的city_name 字段为NULL:

select * from erp.cdc.dbo_test_v2_CT  c where c.id='4';

__$start_lsn	__$end_lsn	__$seqval	__$operation	__$update_mask	id	city_name	userid	useramount	ismaster	createtime	product_count
0x0000002400000DA40003	NULL	0x0000002400000DA40002	2	0x7F	4	NULL	41	400.250000	1	2019-04-25 10:23:45.123	43.50

总结:删除字段可以用不用重新收集表的定义信息。

整个CDC会创建一堆的系统表、视图、存储过程和作业来实现CDC功能:

参考:

https://docs.microsoft.com/en-us/sql/t-sql/statements/alter-database-transact-sql-file-and-filegroup-options?view=sql-server-2017

https://docs.microsoft.com/en-us/sql/relational-databases/track-changes/about-change-data-capture-sql-server?view=sql-server-2017

猜你喜欢

转载自blog.csdn.net/vkingnew/article/details/89508885