1. Environment description
1.1 Source: SQL Server
| Version | IP | Port |
|---|---|---|
| Microsoft SQL Server 2017 | 192.168.140.160 | 1433 |
1.2 Target GreatSQL
| Version | IP | Port |
|---|---|---|
| GreatSQL-8.0.32 | 192.168.139.86 | 3308 |
2. Installation environment
2.1 Install the SQL Server environment
Environment note: the database is started from a Docker image.
2.1.1 Install Docker
1. Install basic software packages
$ yum install -y wget net-tools nfs-utils lrzsz gcc gcc-c++ make cmake libxml2-devel openssl-devel curl curl-devel unzip sudo ntp libaio-devel wget vim ncurses-devel autoconf automake zlib-devel python-devel epel-release openssh-server socat ipvsadm conntrack yum-utils
2. Configure a domestic docker-ce yum repository (Alibaba Cloud)
$ yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
3. Install docker dependency packages
$ yum install -y device-mapper-persistent-data lvm2
4. Install docker-ce
$ yum install docker-ce -y
5. Start and enable the Docker service
$ systemctl start docker && systemctl enable docker
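Optionally, confirm the daemon is up before continuing:
$ docker version
$ systemctl is-active docker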
2.1.2 Pull the image
$ docker pull mcr.microsoft.com/mssql/server:2017-latest
2.1.3 Run the container
$ docker run -e "ACCEPT_EULA=Y" -e "SA_PASSWORD=********" \
-p 1433:1433 --name sqlserver2017 \
-d mcr.microsoft.com/mssql/server:2017-latest
Remember to set a complex password here.
Parameter explanation:
- -e "ACCEPT_EULA=Y": accept the license agreement
- -e "SA_PASSWORD=********": set the connection password. The password must not be too short or simple, otherwise it will not meet the SQL Server password policy and the container will stop running.
- -p 1433:1433: map the host port to the container port (the former is the host)
- --name sqlserver2017: container alias
- -d: run in the background
- mcr.microsoft.com/mssql/server:2017-latest: image name:tag
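If the SA password does not satisfy the complexity policy, the container exits shortly after starting, so it is worth confirming that it is still running:
$ docker ps --filter name=sqlserver2017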
2.1.4 Use the database
1. Enter the container
$ docker exec -it sqlserver2017 bash
2. Connect to the database
$ /opt/mssql-tools/bin/sqlcmd -S localhost -U SA -P "********"
3. Query the database
1> select name from sys.Databases;
2> go
4. Create a database
1> create database testdb;
2> go
5. Create a table and insert data
use testdb
create table t1(id int)
go
insert into t1 values(1),(2)
go
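You can also run a one-off query without an interactive session; sqlcmd's -d option selects the database and -Q executes a single query and exits:
$ /opt/mssql-tools/bin/sqlcmd -S localhost -U SA -P "********" -d testdb -Q "select * from t1"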
2.2 Install the GreatSQL environment
Installation also uses a Docker image: just pull the GreatSQL image directly
$ docker pull greatsql/greatsql
and create the GreatSQL container:
$ docker run -d --name greatsql --hostname=greatsql -e MYSQL_ALLOW_EMPTY_PASSWORD=1 greatsql/greatsql
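Because the container was started with MYSQL_ALLOW_EMPTY_PASSWORD=1, root initially has an empty password, so a client session can be opened inside the container (a quick sketch; if you need external access on a port such as 3308, add a -p host:container mapping to docker run):
$ docker exec -it greatsql mysql -uroot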
2.3 Install DataX
DataX requires the following dependencies:
- JDK (1.8 or above; 1.8 recommended)
- Python (2.6.x or above recommended)
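A quick check that both dependencies are in place:
$ java -version
$ python -V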
Installation is simply decompressing the tarball and running it. However, if you run a job right after decompression without any further cleanup, an error is reported:
$ cd /soft
$ ll
total 3764708
-rw-r--r-- 1 root root 853734462 Dec 9 04:06 datax.tar.gz
$ tar xf datax.tar.gz
$ python /soft/datax/bin/datax.py /soft/datax/job/job.json
DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.
2023-07-19 11:19:17.483 [main] WARN ConfigParser - Plugin [streamreader,streamwriter] failed to load, retrying in 1s... Exception:Code:[Common-00], Describe:[There is an error in the configuration file you provided, please check your job configuration.] - Configuration error: the configuration file [/soft/datax/plugin/reader/._mysqlreader/plugin.json] does not exist. Please check your configuration file.
2023-07-19 11:19:18.488 [main] ERROR Engine -
According to DataX's analysis, the most likely cause of the error in this task is:
com.alibaba.datax.common.exception.DataXException: Code:[Common-00], Describe:[There is an error in the configuration file you provided, please check your job configuration.] - Configuration error: the configuration file [/soft/datax/plugin/reader/._mysqlreader/plugin.json] does not exist. Please check your configuration file.
at com.alibaba.datax.common.exception.DataXException.asDataXException(DataXException.java:26)
at com.alibaba.datax.common.util.Configuration.from(Configuration.java:95)
at com.alibaba.datax.core.util.ConfigParser.parseOnePluginConfig(ConfigParser.java:153)
at com.alibaba.datax.core.util.ConfigParser.parsePluginConfig(ConfigParser.java:125)
at com.alibaba.datax.core.util.ConfigParser.parse(ConfigParser.java:63)
at com.alibaba.datax.core.Engine.entry(Engine.java:137)
at com.alibaba.datax.core.Engine.main(Engine.java:204)
To fix the error, delete all files whose names start with ._ under the plugin directory and its reader and writer subdirectories.
Hidden files need to be removed from three directories:
- plugin/
- plugin/reader/
- plugin/writer/
$ rm -rf /soft/datax/plugin/._*
$ rm -rf /soft/datax/plugin/reader/._*
$ rm -rf /soft/datax/plugin/writer/._*
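Equivalently, a single find command sweeps all three directories:
$ find /soft/datax/plugin -maxdepth 2 -name "._*" | xargs rm -rf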
Run the bundled test job to check whether DataX was installed successfully:
$ python /soft/datax/bin/datax.py /soft/datax/job/job.json
DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.
2023-07-19 11:22:12.298 [main] INFO VMInfo - VMInfo# operatingSystem class => sun.management.OperatingSystemImpl
2023-07-19 11:22:12.305 [main] INFO Engine - the machine info =>
osInfo: Oracle Corporation 1.8 25.251-b08
jvmInfo: Linux amd64 4.19.25-200.1.el7.bclinux.x86_64
cpu num: 48
totalPhysicalMemory: -0.00G
freePhysicalMemory: -0.00G
maxFileDescriptorCount: -1
currentOpenFileDescriptorCount: -1
GC Names [PS MarkSweep, PS Scavenge]
MEMORY_NAME | allocation_size | init_size
PS Eden Space | 256.00MB | 256.00MB
Code Cache | 240.00MB | 2.44MB
Compressed Class Space | 1,024.00MB | 0.00MB
PS Survivor Space | 42.50MB | 42.50MB
PS Old Gen | 683.00MB | 683.00MB
Metaspace | -0.00MB | 0.00MB
2023-07-19 11:22:12.320 [main] INFO Engine -
{"content":[{"reader":{"name":"streamreader",
"parameter":{"column":[
{"type":"string","value":"DataX"},
{"type":"long","value":19890604},
{"type":"date","value":"1989-06-04 00:00:00"},
{"type":"bool","value":true},
{"type":"bytes","value":"test"}
],"sliceRecordCount":100000}
},"writer":{"name":"streamwriter","parameter":{"encoding":"UTF-8","print":false}}}],
"setting":{"errorLimit":{"percentage":0.02,"record":0},
"speed":{"byte":10485760}}}
2023-07-19 11:22:12.336 [main] WARN Engine - prioriy set to 0, because NumberFormatException, the value is: null
2023-07-19 11:22:12.337 [main] INFO PerfTrace - PerfTrace traceId=job_-1, isEnable=false, priority=0
2023-07-19 11:22:12.338 [main] INFO JobContainer - DataX jobContainer starts job.
2023-07-19 11:22:12.339 [main] INFO JobContainer - Set jobId = 0
2023-07-19 11:22:12.352 [job-0] INFO JobContainer - jobContainer starts to do prepare ...
2023-07-19 11:22:12.352 [job-0] INFO JobContainer - DataX Reader.Job [streamreader] do prepare work .
2023-07-19 11:22:12.352 [job-0] INFO JobContainer - DataX Writer.Job [streamwriter] do prepare work .
2023-07-19 11:22:12.352 [job-0] INFO JobContainer - jobContainer starts to do split ...
2023-07-19 11:22:12.353 [job-0] INFO JobContainer - Job set Max-Byte-Speed to 10485760 bytes.
2023-07-19 11:22:12.354 [job-0] INFO JobContainer - DataX Reader.Job [streamreader] splits to [1] tasks.
2023-07-19 11:22:12.354 [job-0] INFO JobContainer - DataX Writer.Job [streamwriter] splits to [1] tasks.
2023-07-19 11:22:12.371 [job-0] INFO JobContainer - jobContainer starts to do schedule ...
2023-07-19 11:22:12.375 [job-0] INFO JobContainer - Scheduler starts [1] taskGroups.
2023-07-19 11:22:12.376 [job-0] INFO JobContainer - Running by standalone Mode.
2023-07-19 11:22:12.384 [taskGroup-0] INFO TaskGroupContainer - taskGroupId=[0] start [1] channels for [1] tasks.
2023-07-19 11:22:12.388 [taskGroup-0] INFO Channel - Channel set byte_speed_limit to -1, No bps activated.
2023-07-19 11:22:12.388 [taskGroup-0] INFO Channel - Channel set record_speed_limit to -1, No tps activated.
2023-07-19 11:22:12.396 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] attemptCount[1] is started
2023-07-19 11:22:12.697 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] is successed, used[302]ms
2023-07-19 11:22:12.698 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] completed it's tasks.
2023-07-19 11:22:22.402 [job-0] INFO StandAloneJobContainerCommunicator - Total 100000 records, 2600000 bytes | Speed 253.91KB/s, 10000 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.020s | All Task WaitReaderTime 0.033s | Percentage 100.00%
2023-07-19 11:22:22.402 [job-0] INFO AbstractScheduler - Scheduler accomplished all tasks.
2023-07-19 11:22:22.402 [job-0] INFO JobContainer - DataX Writer.Job [streamwriter] do post work.
2023-07-19 11:22:22.403 [job-0] INFO JobContainer - DataX Reader.Job [streamreader] do post work.
2023-07-19 11:22:22.403 [job-0] INFO JobContainer - DataX jobId [0] completed successfully.
2023-07-19 11:22:22.403 [job-0] INFO HookInvoker - No hook invoked, because base dir not exists or is a file: /soft/datax/hook
2023-07-19 11:22:22.404 [job-0] INFO JobContainer -
[total cpu info] =>
averageCpu | maxDeltaCpu | minDeltaCpu
-1.00% | -1.00% | -1.00%
[total gc info] =>
NAME | totalGCCount | maxDeltaGCCount | minDeltaGCCount | totalGCTime | maxDeltaGCTime | minDeltaGCTime
PS MarkSweep | 0 | 0 | 0 | 0.000s | 0.000s | 0.000s
PS Scavenge | 0 | 0 | 0 | 0.000s | 0.000s | 0.000s
2023-07-19 11:22:22.404 [job-0] INFO JobContainer - PerfTrace not enable!
2023-07-19 11:22:22.404 [job-0] INFO StandAloneJobContainerCommunicator - Total 100000 records, 2600000 bytes | Speed 253.91KB/s, 10000 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.020s | All Task WaitReaderTime 0.033s | Percentage 100.00%
2023-07-19 11:22:22.406 [job-0] INFO JobContainer -
Job start time             : 2023-07-19 11:22:12
Job end time               : 2023-07-19 11:22:22
Total elapsed time         : 10s
Average throughput         : 253.91KB/s
Record write speed         : 10000rec/s
Total records read         : 100000
Total read/write failures  : 0
3. SQL Server to GreatSQL full migration
3.1 Create test data on the source (SQL Server)
$ docker exec -it 47bd0ed79c26 /bin/bash
$ /opt/mssql-tools/bin/sqlcmd -S localhost -U SA -P "********"
1> create database testdb
1> use testdb
1> insert into t1 values(1),(2),(3);
2> go
1> select * from t1;
2> go
id
-----------
1
2
3
3.2 Create the table structure on the target (GreatSQL)
greatsql> create database testdb;
greatsql> use testdb;
greatsql> create table t1 (id int primary key);
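Note that DataX migrates rows, not table structures, which is why the table must be created on the target first. A quick check that the structure is in place:
greatsql> show create table t1\G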
3.3 Write the DataX job file
$ cat /soft/datax/job/sqlserver_to_greatsql.json
{
"job": {
"content": [
{
"reader": {
"name": "sqlserverreader",
"parameter": {
"connection": [
{
"jdbcUrl": ["jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb"],
"table": ["t1"]
}
],
"password": "********",
"username": "SA",
"column": ["*"]
}
},
"writer": {
"name": "mysqlwriter",
"parameter": {
"column": ["*"],
"connection": [
{
"jdbcUrl": "jdbc:mysql://10.17.139.86:3308/testdb",
"table": ["t1"]
}
],
"password": "******",
"session": [],
"username": "admin",
"writeMode": "insert"
}
}
}
],
"setting": {
"speed": {
"channel": "5"
}
}
}
}
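Before running the job, it helps to confirm the file is valid JSON; Python's built-in json.tool is enough for that:
$ python -m json.tool /soft/datax/job/sqlserver_to_greatsql.json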
3.4 Run the DataX migration task
$ python /soft/datax/bin/datax.py /soft/datax/job/sqlserver_to_greatsql.json
DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.
2023-11-28 09:58:44.087 [main] INFO VMInfo - VMInfo# operatingSystem class => sun.management.OperatingSystemImpl
2023-11-28 09:58:44.104 [main] INFO Engine - the machine info =>
osInfo: Oracle Corporation 1.8 25.181-b13
jvmInfo: Linux amd64 3.10.0-957.el7.x86_64
cpu num: 8
totalPhysicalMemory: -0.00G
freePhysicalMemory: -0.00G
maxFileDescriptorCount: -1
currentOpenFileDescriptorCount: -1
GC Names [PS MarkSweep, PS Scavenge]
MEMORY_NAME | allocation_size | init_size
PS Eden Space | 256.00MB | 256.00MB
Code Cache | 240.00MB | 2.44MB
Compressed Class Space | 1,024.00MB | 0.00MB
PS Survivor Space | 42.50MB | 42.50MB
PS Old Gen | 683.00MB | 683.00MB
Metaspace | -0.00MB | 0.00MB
2023-11-28 09:58:44.137 [main] INFO Engine -
{
"content":[
{"reader":{
"name":"sqlserverreader",
"parameter":{
"column":["*"],
"connection":[
{"jdbcUrl":["jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb"],
"table":["t1"]}],
"password":"*************",
"username":"SA"}},
"writer":{"name":"mysqlwriter","parameter":{"column":["*"],
"connection":[{"jdbcUrl":"jdbc:mysql://10.17.139.86:3308/testdb",
"table":["t1"]}],
"password":"********",
"session":[],
"username":"admin",
"writeMode":"insert"}}}],
"setting":{"speed":{"channel":"5"}}}
2023-11-28 09:58:44.176 [main] WARN Engine - prioriy set to 0, because NumberFormatException, the value is: null
2023-11-28 09:58:44.179 [main] INFO PerfTrace - PerfTrace traceId=job_-1, isEnable=false, priority=0
2023-11-28 09:58:44.180 [main] INFO JobContainer - DataX jobContainer starts job.
2023-11-28 09:58:44.183 [main] INFO JobContainer - Set jobId = 0
2023-11-28 09:58:44.542 [job-0] INFO OriginalConfPretreatmentUtil - Available jdbcUrl:jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb.
2023-11-28 09:58:44.544 [job-0] WARN OriginalConfPretreatmentUtil - Your column configuration carries some risk: no columns are configured for reading from the database table, so if the table's column count or types change, the task may run incorrectly or fail. Please check and adjust your configuration.
Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
2023-11-28 09:58:45.099 [job-0] INFO OriginalConfPretreatmentUtil - table:[t1] all columns:[id].
2023-11-28 09:58:45.099 [job-0] WARN OriginalConfPretreatmentUtil - Your column configuration carries risk: the columns to write to the database table are configured as *, so if the table's column count or types change, the task may run incorrectly or fail. Please check and adjust your configuration.
2023-11-28 09:58:45.102 [job-0] INFO OriginalConfPretreatmentUtil - Write data [
insert INTO %s (id) VALUES(?)
], which jdbcUrl like:[jdbc:mysql://10.17.139.86:16310/testdb?yearIsDateType=false&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true]
2023-11-28 09:58:45.103 [job-0] INFO JobContainer - jobContainer starts to do prepare ...
2023-11-28 09:58:45.103 [job-0] INFO JobContainer - DataX Reader.Job [sqlserverreader] do prepare work
2023-11-28 09:58:45.104 [job-0] INFO JobContainer - DataX Writer.Job [mysqlwriter] do prepare work .
2023-11-28 09:58:45.104 [job-0] INFO JobContainer - jobContainer starts to do split ...
2023-11-28 09:58:45.105 [job-0] INFO JobContainer - Job set Channel-Number to 5 channels.
2023-11-28 09:58:45.112 [job-0] INFO JobContainer - DataX Reader.Job [sqlserverreader] splits to [1] tasks.
2023-11-28 09:58:45.114 [job-0] INFO JobContainer - DataX Writer.Job [mysqlwriter] splits to [1] tasks.
2023-11-28 09:58:45.135 [job-0] INFO JobContainer - jobContainer starts to do schedule ...
2023-11-28 09:58:45.139 [job-0] INFO JobContainer - Scheduler starts [1] taskGroups.
2023-11-28 09:58:45.142 [job-0] INFO JobContainer - Running by standalone Mode.
2023-11-28 09:58:45.151 [taskGroup-0] INFO TaskGroupContainer - taskGroupId=[0] start [1] channels for [1] tasks.
2023-11-28 09:58:45.157 [taskGroup-0] INFO Channel - Channel set byte_speed_limit to -1, No bps activated.
2023-11-28 09:58:45.158 [taskGroup-0] INFO Channel - Channel set record_speed_limit to -1, No tps activated.
2023-11-28 09:58:45.173 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] attemptCount[1] is started
2023-11-28 09:58:45.181 [0-0-0-reader] INFO CommonRdbmsReader$Task - Begin to read record by Sql: [select * from t1
] jdbcUrl:[jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb].
2023-11-28 09:58:45.398 [0-0-0-reader] INFO CommonRdbmsReader$Task - Finished read record by Sql: [select * from t1
] jdbcUrl:[jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb].
2023-11-28 09:58:45.454 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] is successed, used[284]ms
2023-11-28 09:58:45.455 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] completed it's tasks.
2023-11-28 09:58:55.175 [job-0] INFO StandAloneJobContainerCommunicator - Total 3 records, 3 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.000s | Percentage 100.00%
2023-11-28 09:58:55.175 [job-0] INFO AbstractScheduler - Scheduler accomplished all tasks.
2023-11-28 09:58:55.175 [job-0] INFO JobContainer - DataX Writer.Job [mysqlwriter] do post work.
2023-11-28 09:58:55.176 [job-0] INFO JobContainer - DataX Reader.Job [sqlserverreader] do post work.
2023-11-28 09:58:55.176 [job-0] INFO JobContainer - DataX jobId [0] completed successfully.
2023-11-28 09:58:55.176 [job-0] INFO HookInvoker - No hook invoked, because base dir not exists or is a file: /soft/datax/hook
2023-11-28 09:58:55.177 [job-0] INFO JobContainer -
[total cpu info] =>
averageCpu | maxDeltaCpu | minDeltaCpu
-1.00% | -1.00% | -1.00%
[total gc info] =>
NAME | totalGCCount | maxDeltaGCCount | minDeltaGCCount | totalGCTime | maxDeltaGCTime | minDeltaGCTime
PS MarkSweep | 1 | 1 | 1 | 0.061s | 0.061s | 0.061s
PS Scavenge | 1 | 1 | 1 | 0.039s | 0.039s | 0.039s
2023-11-28 09:58:55.177 [job-0] INFO JobContainer - PerfTrace not enable!
2023-11-28 09:58:55.177 [job-0] INFO StandAloneJobContainerCommunicator - Total 3 records, 3 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.000s | Percentage 100.00%
2023-11-28 09:58:55.179 [job-0] INFO JobContainer -
Job start time             : 2023-11-28 09:58:44
Job end time               : 2023-11-28 09:58:55
Total elapsed time         : 10s
Average throughput         : 0B/s
Record write speed         : 0rec/s
Total records read         : 3
Total read/write failures  : 0
3.5 Verify the data on the target (GreatSQL)
greatsql> select * from t1;
+----+
| id |
+----+
| 1 |
| 2 |
| 3 |
+----+
3 rows in set (0.01 sec)
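For larger tables, a simple cross-check is to compare row counts on both sides (a sketch; adjust hosts, ports, and credentials to your environment):
$ /opt/mssql-tools/bin/sqlcmd -S localhost -U SA -P "********" -d testdb -Q "select count(*) from t1"
$ mysql -h192.168.139.86 -P3308 -uadmin -p -e "select count(*) from testdb.t1"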
4. SQL Server to GreatSQL incremental migration
4.1 Create test data on the source (SQL Server)
2> create table t2 (id int,createtime datetime);
3> go
1> insert into t2 values(1,GETDATE());
2> go
(1 rows affected)
1> insert into t2 values(2,GETDATE());
2> go
(1 rows affected)
1> insert into t2 values(3,GETDATE());
2> go
(1 rows affected)
1> insert into t2 values(4,GETDATE());
2> go
(1 rows affected)
1> insert into t2 values(5,GETDATE());
2> go
(1 rows affected)
1> insert into t2 values(6,GETDATE());
2> go
(1 rows affected)
1> select * from t2;
2> go
id createtime
---------- -----------------------
1 2023-11-28 02:18:20.790
2 2023-11-28 02:18:27.040
3 2023-11-28 02:18:32.103
4 2023-11-28 02:18:37.690
5 2023-11-28 02:18:41.450
6 2023-11-28 02:18:46.330
4.2 Write the DataX full migration job file
$ cat sqlserver_to_greatsql_inc.json
{
"job": {
"content": [
{
"reader": {
"name": "sqlserverreader",
"parameter": {
"connection": [
{
"jdbcUrl": ["jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb"],
"table": ["t2"]
}
],
"password": "********",
"username": "SA",
"column": ["*"]
}
},
"writer": {
"name": "mysqlwriter",
"parameter": {
"column": ["*"],
"connection": [
{
"jdbcUrl": "jdbc:mysql://10.17.139.86:3308/testdb",
"table": ["t2"]
}
],
"password": "!QAZ2wsx",
"session": [],
"username": "admin",
"writeMode": "insert"
}
}
}
],
"setting": {
"speed": {
"channel": "5"
}
}
}
}
4.3 Run the DataX full migration task
$ python /soft/datax/bin/datax.py /soft/datax/job/sqlserver_to_greatsql_inc.json
DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.
2023-11-28 10:19:59.279 [main] INFO VMInfo - VMInfo# operatingSystem class => sun.management.OperatingSystemImpl
2023-11-28 10:19:59.286 [main] INFO Engine - the machine info =>
osInfo: Oracle Corporation 1.8 25.181-b13
jvmInfo: Linux amd64 3.10.0-957.el7.x86_64
cpu num: 8
totalPhysicalMemory: -0.00G
freePhysicalMemory: -0.00G
maxFileDescriptorCount: -1
currentOpenFileDescriptorCount: -1
GC Names [PS MarkSweep, PS Scavenge]
MEMORY_NAME | allocation_size | init_size
PS Eden Space | 256.00MB | 256.00MB
Code Cache | 240.00MB | 2.44MB
Compressed Class Space | 1,024.00MB | 0.00MB
PS Survivor Space | 42.50MB | 42.50MB
PS Old Gen | 683.00MB | 683.00MB
Metaspace | -0.00MB | 0.00MB
2023-11-28 10:19:59.302 [main] INFO Engine -
{"content":[{"reader":{"name":"sqlserverreader","parameter":{"column":[
"*"],"connection":[{"jdbcUrl":["jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb"],
"table":["t2"]}],"password":"*************","username":"SA"}},
"writer":{"name":"mysqlwriter","parameter":{"column":["*"],
"connection":[{"jdbcUrl":"jdbc:mysql://10..17.139.86:16310/testdb","table":["t2"]}],
"password":"********",
"session":[],
"username":"admin",
"writeMode":"insert"}}}],
"setting":{"speed":{"channel":"5"}}}
2023-11-28 10:19:59.319 [main] WARN Engine - prioriy set to 0, because NumberFormatException, the value is: null
2023-11-28 10:19:59.321 [main] INFO PerfTrace - PerfTrace traceId=job_-1, isEnable=false, priority=0
2023-11-28 10:19:59.321 [main] INFO JobContainer - DataX jobContainer starts job.
2023-11-28 10:19:59.324 [main] INFO JobContainer - Set jobId = 0
2023-11-28 10:19:59.629 [job-0] INFO OriginalConfPretreatmentUtil - Available jdbcUrl:jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb.
2023-11-28 10:19:59.630 [job-0] WARN OriginalConfPretreatmentUtil - Your column configuration carries some risk: no columns are configured for reading from the database table, so if the table's column count or types change, the task may run incorrectly or fail. Please check and adjust your configuration.
Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
2023-11-28 10:20:00.027 [job-0] INFO OriginalConfPretreatmentUtil - table:[t2] all columns:[
id,createtime].
2023-11-28 10:20:00.027 [job-0] WARN OriginalConfPretreatmentUtil - Your column configuration carries risk: the columns to write to the database table are configured as *, so if the table's column count or types change, the task may run incorrectly or fail. Please check and adjust your configuration.
2023-11-28 10:20:00.029 [job-0] INFO OriginalConfPretreatmentUtil - Write data [
insert INTO %s (id,createtime) VALUES(?,?)
], which jdbcUrl like:[jdbc:mysql://10.17.139.86:16310/testdb?yearIsDateType=false&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true]
2023-11-28 10:20:00.030 [job-0] INFO JobContainer - jobContainer starts to do prepare ...
2023-11-28 10:20:00.031 [job-0] INFO JobContainer - DataX Reader.Job [sqlserverreader] do prepare work .
2023-11-28 10:20:00.031 [job-0] INFO JobContainer - DataX Writer.Job [mysqlwriter] do prepare work .
2023-11-28 10:20:00.032 [job-0] INFO JobContainer - jobContainer starts to do split ...
2023-11-28 10:20:00.032 [job-0] INFO JobContainer - Job set Channel-Number to 5 channels.
2023-11-28 10:20:00.037 [job-0] INFO JobContainer - DataX Reader.Job [sqlserverreader] splits to [1] tasks.
2023-11-28 10:20:00.038 [job-0] INFO JobContainer - DataX Writer.Job [mysqlwriter] splits to [1] tasks.
2023-11-28 10:20:00.060 [job-0] INFO JobContainer - jobContainer starts to do schedule ...
2023-11-28 10:20:00.063 [job-0] INFO JobContainer - Scheduler starts [1] taskGroups.
2023-11-28 10:20:00.066 [job-0] INFO JobContainer - Running by standalone Mode.
2023-11-28 10:20:00.073 [taskGroup-0] INFO TaskGroupContainer - taskGroupId=[0] start [1] channels for [1] tasks.
2023-11-28 10:20:00.080 [taskGroup-0] INFO Channel - Channel set byte_speed_limit to -1, No bps activated.
2023-11-28 10:20:00.080 [taskGroup-0] INFO Channel - Channel set record_speed_limit to -1, No tps activated.
2023-11-28 10:20:00.093 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] attemptCount[1] is started
2023-11-28 10:20:00.101 [0-0-0-reader] INFO CommonRdbmsReader$Task - Begin to read record by Sql: [select * from t2
] jdbcUrl:[jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb].
2023-11-28 10:20:00.262 [0-0-0-reader] INFO CommonRdbmsReader$Task - Finished read record by Sql: [select * from t2
] jdbcUrl:[jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb].
2023-11-28 10:20:00.334 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] is successed, used[243]ms
2023-11-28 10:20:00.335 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] completed it's tasks.
2023-11-28 10:20:10.087 [job-0] INFO StandAloneJobContainerCommunicator - Total 6 records, 54 bytes | Speed 5B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.000s | Percentage 100.00%
2023-11-28 10:20:10.088 [job-0] INFO AbstractScheduler - Scheduler accomplished all tasks.
2023-11-28 10:20:10.088 [job-0] INFO JobContainer - DataX Writer.Job [mysqlwriter] do post work.
2023-11-28 10:20:10.089 [job-0] INFO JobContainer - DataX Reader.Job [sqlserverreader] do post work.
2023-11-28 10:20:10.090 [job-0] INFO JobContainer - DataX jobId [0] completed successfully.
2023-11-28 10:20:10.091 [job-0] INFO HookInvoker - No hook invoked, because base dir not exists or is a file: /soft/datax/hook
2023-11-28 10:20:10.094 [job-0] INFO JobContainer -
[total cpu info] =>
averageCpu | maxDeltaCpu | minDeltaCpu
-1.00% | -1.00% | -1.00%
[total gc info] =>
NAME | totalGCCount | maxDeltaGCCount | minDeltaGCCount | totalGCTime | maxDeltaGCTime | minDeltaGCTime
PS MarkSweep | 1 | 1 | 1 | 0.034s | 0.034s | 0.034s
PS Scavenge | 1 | 1 | 1 | 0.031s | 0.031s | 0.031s
2023-11-28 10:20:10.094 [job-0] INFO JobContainer - PerfTrace not enable!
2023-11-28 10:20:10.095 [job-0] INFO StandAloneJobContainerCommunicator - Total 6 records, 54 bytes | Speed 5B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.000s | Percentage 100.00%
2023-11-28 10:20:10.097 [job-0] INFO JobContainer -
Job start time             : 2023-11-28 10:19:59
Job end time               : 2023-11-28 10:20:10
Total elapsed time         : 10s
Average throughput         : 5B/s
Record write speed         : 0rec/s
Total records read         : 6
Total read/write failures  : 0
4.4 Verify the fully migrated data (GreatSQL)
greatsql> select * from t2;
+----+---------------------+
| id | createtime |
+----+---------------------+
| 1 | 2023-11-28 02:18:21 |
| 2 | 2023-11-28 02:18:27 |
| 3 | 2023-11-28 02:18:32 |
| 4 | 2023-11-28 02:18:38 |
| 5 | 2023-11-28 02:18:41 |
| 6 | 2023-11-28 02:18:46 |
+----+---------------------+
You can also verify with checksum table <table_name>. For large tables, do not verify by selecting the entire table with select *.
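For example, on the GreatSQL side:
greatsql> checksum table t2;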
4.5 Insert incremental data on the source (SQL Server)
2> insert into t2 values(7,'2023-11-28 03:18:46.330');
3> go
Changed database context to 'testdb'.
(1 rows affected)
1> insert into t2 values(8,'2023-11-28 03:20:46.330');
2> go
(1 rows affected)
1> insert into t2 values(9,'2023-11-28 03:25:46.330');
2> go
(1 rows affected)
1> insert into t2 values(10,'2023-11-28 03:30:46.330');
2> go
(1 rows affected)
1> select * from t2;
2> go
id createtime
----------- -----------------------
1 2023-11-28 02:18:20.790
2 2023-11-28 02:18:27.040
3 2023-11-28 02:18:32.103
4 2023-11-28 02:18:37.690
5 2023-11-28 02:18:41.450
6 2023-11-28 02:18:46.330
7 2023-11-28 03:18:46.330
8 2023-11-28 03:20:46.330
9 2023-11-28 03:25:46.330
10 2023-11-28 03:30:46.330
4.6 Write the DataX incremental migration job file
$ cat sqlserver_to_greatsql_inc.json
{
"job": {
"content": [
{
"reader": {
"name": "sqlserverreader",
"parameter": {
"connection": [
{
"jdbcUrl": ["jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb"],
"table": ["t2"]
}
],
"password": "********",
"username": "SA",
"column": ["*"],
"where":"createtime > '${start_time}' and createtime < '${end_time}'"
}
},
"writer": {
"name": "mysqlwriter",
"parameter": {
"column": ["*"],
"connection": [
{
"jdbcUrl": "jdbc:mysql://10..17.139.86:16310/testdb",
"table": ["t2"]
}
],
"password": "!QAZ2wsx",
"session": [],
"username": "admin",
"writeMode": "insert"
}
}
}
],
"setting": {
"speed": {
"channel": "5"
}
}
}
}
4.7 Run the DataX incremental migration task
$ python /soft/datax/bin/datax.py /soft/datax/job/sqlserver_to_greatsql_inc.json -p "-Dstart_time='2023-11-28 03:17:46.330' -Dend_time='2023-11-28 03:31:46.330'"
DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.
2023-11-28 10:29:24.492 [main] INFO VMInfo - VMInfo# operatingSystem class => sun.management.OperatingSystemImpl
2023-11-28 10:29:24.504 [main] INFO Engine - the machine info =>
osInfo: Oracle Corporation 1.8 25.181-b13
jvmInfo: Linux amd64 3.10.0-957.el7.x86_64
cpu num: 8
totalPhysicalMemory: -0.00G
freePhysicalMemory: -0.00G
maxFileDescriptorCount: -1
currentOpenFileDescriptorCount: -1
GC Names [PS MarkSweep, PS Scavenge]
MEMORY_NAME | allocation_size | init_size
PS Eden Space | 256.00MB | 256.00MB
Code Cache | 240.00MB | 2.44MB
Compressed Class Space | 1,024.00MB | 0.00MB
PS Survivor Space | 42.50MB | 42.50MB
PS Old Gen | 683.00MB | 683.00MB
Metaspace | -0.00MB | 0.00MB
2023-11-28 10:29:24.524 [main] INFO Engine -
{"content":[{"reader":{"name":"sqlserverreader","parameter":{"column":["*"],
"connection":[{"jdbcUrl":["jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb"],
"table":["t2"]}],"password":"*************","username":"SA",
"where":"createtime > '2023-11-28 03:17:46.330' and createtime < '2023-11-28 03:31:46.330'"}},
"writer":{"name":"mysqlwriter","parameter":{"column":["*"],"connection":[{"jdbcUrl":"jdbc:mysql://10..17.139.86:16310/testdb","table":["t2"]}],
"password":"********",
"session":[],
"username":"admin",
"writeMode":"insert"}}}],
"setting":{"speed":{"channel":"5"}}}
2023-11-28 10:29:24.542 [main] WARN Engine - prioriy set to 0, because NumberFormatException, the value is: null
2023-11-28 10:29:24.544 [main] INFO PerfTrace - PerfTrace traceId=job_-1, isEnable=false, priority=0
2023-11-28 10:29:24.544 [main] INFO JobContainer - DataX jobContainer starts job.
2023-11-28 10:29:24.546 [main] INFO JobContainer - Set jobId = 0
2023-11-28 10:29:24.830 [job-0] INFO OriginalConfPretreatmentUtil - Available jdbcUrl:jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb.
2023-11-28 10:29:24.831 [job-0] WARN OriginalConfPretreatmentUtil - Your column configuration carries some risk: no columns are configured for reading from the database table, so if the table's column count or types change, the task may run incorrectly or fail. Please check and adjust your configuration.
Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
2023-11-28 10:29:25.113 [job-0] INFO OriginalConfPretreatmentUtil - table:[t2] all columns:[id,createtime].
2023-11-28 10:29:25.113 [job-0] WARN OriginalConfPretreatmentUtil - Your column configuration carries risk: the columns to write to the database table are configured as *, so if the table's column count or types change, the task may run incorrectly or fail. Please check and adjust your configuration.
2023-11-28 10:29:25.115 [job-0] INFO OriginalConfPretreatmentUtil - Write data [
insert INTO %s (id,createtime) VALUES(?,?)
], which jdbcUrl like:[jdbc:mysql://10.17.139.86:16310/testdb?yearIsDateType=false&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true]
2023-11-28 10:29:25.116 [job-0] INFO JobContainer - jobContainer starts to do prepare ...
2023-11-28 10:29:25.117 [job-0] INFO JobContainer - DataX Reader.Job [sqlserverreader] do prepare work .
2023-11-28 10:29:25.117 [job-0] INFO JobContainer - DataX Writer.Job [mysqlwriter] do prepare work .
2023-11-28 10:29:25.118 [job-0] INFO JobContainer - jobContainer starts to do split ...
2023-11-28 10:29:25.118 [job-0] INFO JobContainer - Job set Channel-Number to 5 channels.
2023-11-28 10:29:25.123 [job-0] INFO JobContainer - DataX Reader.Job [sqlserverreader] splits to [1] tasks.
2023-11-28 10:29:25.124 [job-0] INFO JobContainer - DataX Writer.Job [mysqlwriter] splits to [1] tasks.
2023-11-28 10:29:25.146 [job-0] INFO JobContainer - jobContainer starts to do schedule ...
2023-11-28 10:29:25.150 [job-0] INFO JobContainer - Scheduler starts [1] taskGroups.
2023-11-28 10:29:25.153 [job-0] INFO JobContainer - Running by standalone Mode.
2023-11-28 10:29:25.159 [taskGroup-0] INFO TaskGroupContainer - taskGroupId=[0] start [1] channels for [1] tasks.
2023-11-28 10:29:25.165 [taskGroup-0] INFO Channel - Channel set byte_speed_limit to -1, No bps activated.
2023-11-28 10:29:25.165 [taskGroup-0] INFO Channel - Channel set record_speed_limit to -1, No tps activated.
2023-11-28 10:29:25.176 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] attemptCount[1] is started
2023-11-28 10:29:25.183 [0-0-0-reader] INFO CommonRdbmsReader$Task - Begin to read record by Sql: [select * from t2 where (createtime > '2023-11-28 03:17:46.330' and createtime < '2023-11-28 03:31:46.330')
] jdbcUrl:[jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb].
2023-11-28 10:29:25.344 [0-0-0-reader] INFO CommonRdbmsReader$Task - Finished read record by Sql: [select * from t2 where (createtime > '2023-11-28 03:17:46.330' and createtime < '2023-11-28 03:31:46.330')
] jdbcUrl:[jdbc:sqlserver://127.0.0.1:1433;DatabaseName=testdb].
2023-11-28 10:29:25.606 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] is successed, used[431]ms
2023-11-28 10:29:25.607 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] completed it's tasks.
2023-11-28 10:29:35.173 [job-0] INFO StandAloneJobContainerCommunicator - Total 4 records, 37 bytes | Speed 3B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.000s | Percentage 100.00%
2023-11-28 10:29:35.173 [job-0] INFO AbstractScheduler - Scheduler accomplished all tasks.
2023-11-28 10:29:35.174 [job-0] INFO JobContainer - DataX Writer.Job [mysqlwriter] do post work.
2023-11-28 10:29:35.175 [job-0] INFO JobContainer - DataX Reader.Job [sqlserverreader] do post work.
2023-11-28 10:29:35.175 [job-0] INFO JobContainer - DataX jobId [0] completed successfully.
2023-11-28 10:29:35.177 [job-0] INFO HookInvoker - No hook invoked, because base dir not exists or is a file: /soft/datax/hook
2023-11-28 10:29:35.179 [job-0] INFO JobContainer -
[total cpu info] =>
averageCpu | maxDeltaCpu | minDeltaCpu
-1.00% | -1.00% | -1.00%
[total gc info] =>
NAME | totalGCCount | maxDeltaGCCount | minDeltaGCCount | totalGCTime | maxDeltaGCTime | minDeltaGCTime
PS MarkSweep | 1 | 1 | 1 | 0.052s | 0.052s | 0.052s
PS Scavenge | 1 | 1 | 1 | 0.024s | 0.024s | 0.024s
2023-11-28 10:29:35.180 [job-0] INFO JobContainer - PerfTrace not enable!
2023-11-28 10:29:35.181 [job-0] INFO StandAloneJobContainerCommunicator - Total 4 records, 37 bytes | Speed 3B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.000s | Percentage 100.00%
2023-11-28 10:29:35.183 [job-0] INFO JobContainer -
Job start time             : 2023-11-28 10:29:24
Job end time               : 2023-11-28 10:29:35
Total elapsed time         : 10s
Average throughput         : 3B/s
Record write speed         : 0rec/s
Total records read         : 4
Total read/write failures  : 0
4.8 Verify the incremental data on the target (GreatSQL)
greatsql> select * from t2;
+----+---------------------+
| id | createtime |
+----+---------------------+
| 1 | 2023-11-28 02:18:21 |
| 2 | 2023-11-28 02:18:27 |
| 3 | 2023-11-28 02:18:32 |
| 4 | 2023-11-28 02:18:38 |
| 5 | 2023-11-28 02:18:41 |
| 6 | 2023-11-28 02:18:46 |
| 7 | 2023-11-28 03:18:46 |
| 8 | 2023-11-28 03:20:46 |
| 9 | 2023-11-28 03:25:46 |
| 10 | 2023-11-28 03:30:46 |
+----+---------------------+
10 rows in set (0.00 sec)
Incremental migration summary: the incremental effect is achieved by adding a filter condition to the reader. The WHERE clause filters out the rows already moved by the full migration, so in effect only the newly added rows are migrated.
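In practice, the time window is usually computed by a small wrapper script and run on a schedule (e.g. via cron). A minimal sketch, assuming the job file above; the 15-minute window is hypothetical:
#!/bin/bash
# Hypothetical wrapper: migrate rows created in the last 15 minutes (GNU date).
START_TIME=$(date -d "-15 min" "+%Y-%m-%d %H:%M:%S.%3N")
END_TIME=$(date "+%Y-%m-%d %H:%M:%S.%3N")
python /soft/datax/bin/datax.py /soft/datax/job/sqlserver_to_greatsql_inc.json \
    -p "-Dstart_time='${START_TIME}' -Dend_time='${END_TIME}'"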
Enjoy GreatSQL :)
About GreatSQL
GreatSQL is a domestically developed, independent open source database suitable for financial-grade applications. It offers core features such as high performance, high reliability, high usability, and high security. It can be used as an optional replacement for MySQL or Percona Server in online production environments, and it is completely free and compatible with MySQL and Percona Server.
Related links: GreatSQL Community Gitee GitHub Bilibili
GreatSQL Community:
Community reward suggestions and feedback: https://greatsql.cn/thread-54-1-1.html
Community blog prize-winning submission details: https://greatsql.cn/thread-100-1-1.html
(If you have any questions about the article or have unique insights, you can go to the official community website to ask or share them~)
Technical exchange groups (WeChat & QQ):
QQ group: 533341697
WeChat group: add the GreatSQL Community Assistant (WeChat ID: wanlidbc) as a friend, and the assistant will add you to the group.