DataX and DB2 import and export case


0. Preface

  • Linux version: CentOS-7.5-x86_64-DVD-1804
  • DB2 version: LINUXX8664 11.5.4.0 (node02 machine)
  • DataX version:
  • Python version: Python 2.7.5
  • DataX deployment: standalone (node01 machine)
  • Firewall turned off
  • SELinux disabled
  • Local yum source configured

1. Introduction to DB2

DB2 is a relational database management system (RDBMS) developed by IBM in 1983. It is mainly used in large-scale application systems and scales well. DB2 was the second relational database launched by IBM, hence the name. It provides a high level of data utilization, integrity, security, parallelism, and recoverability, supports applications from small to large scale, and offers platform-independent core functions and an SQL command execution environment. It runs on several operating systems, including Linux, UNIX, and Windows.

2. DB2 database object relationship

  • instance, multiple DB2 instances can be installed on the same machine.

  • database, multiple databases can be created under the same instance.

  • schema, multiple schemas can be configured under the same database.

  • table, multiple tables can be created under the same schema.

3. Preparation before installation

3.1 Installation dependencies

sudo yum install -y bc binutils compat-libcap1 compat-libstdc++-33 elfutils-libelf elfutils-libelf-devel fontconfig-devel glibc glibc-devel ksh libaio libaio-devel libX11 libXau libXi libXtst libXrender libXrender-devel libgcc libstdc++ libstdc++-devel libxcb make smartmontools sysstat kmod* gcc-c++ libstdc++.so.6 kernel-devel pam-devel.i686 pam.i686 pam32*

3.2 Modify the configuration file sysctl.conf

[root@node02 module]# vim /etc/sysctl.conf

Delete the existing content and add the following:

net.ipv4.ip_local_port_range = 9000 65500
fs.file-max = 6815744
kernel.shmall = 10523004
kernel.shmmax = 6465333657
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128 
net.core.rmem_default=262144 
net.core.wmem_default=262144 
net.core.rmem_max=4194304 
net.core.wmem_max=1048576 
fs.aio-max-nr = 1048576
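After editing the file it can be handy to confirm that the kernel is really using these values. The following is a minimal sketch, assuming Python 3 is available on node02 and that the live values are exposed under /proc/sys (the usual Linux layout). The function names parse_sysctl, read_live_value, and diff_settings are my own and are not part of DB2 or any library.

```python
import os


def parse_sysctl(text):
    """Parse sysctl.conf-style text into a dict of key -> normalized value."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key = value line
        # Collapse runs of whitespace so "250  32000" equals "250 32000".
        settings[key.strip()] = " ".join(value.split())
    return settings


def read_live_value(key, proc_root="/proc/sys"):
    """Read the kernel's current value for a key, or None if it is unavailable."""
    path = os.path.join(proc_root, *key.split("."))
    try:
        with open(path) as handle:
            return " ".join(handle.read().split())
    except OSError:
        return None


def diff_settings(wanted, live_reader=read_live_value):
    """Return {key: (wanted, live)} for every key whose live value differs."""
    differences = {}
    for key, value in wanted.items():
        live = live_reader(key)
        if live != value:
            differences[key] = (value, live)
    return differences
```

Calling diff_settings(parse_sysctl(open("/etc/sysctl.conf").read())) returns an empty dict when every configured value is active. Any entries it returns are settings that have not been loaded yet.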

3.3 Modify the configuration file limits.conf

[root@node02 module]# vim /etc/security/limits.conf

Add at the end of the file:

*	soft nproc 65536
*	hard nproc 65536
*	soft nofile 65536
*	hard nofile 65536

Note: Restart node02 for the changes to take effect.
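Once node02 is back up, the new limits can be checked from a fresh login session. This is a small sketch, assuming Python 3 on Linux. It uses the standard resource module, and the helper names current_limits and below_target, as well as the default target of 65536 (taken from the limits.conf lines above), are illustrative.

```python
import resource


def current_limits():
    """Return {name: (soft, hard)} for the open-file and process limits."""
    return {
        "nofile": resource.getrlimit(resource.RLIMIT_NOFILE),
        "nproc": resource.getrlimit(resource.RLIMIT_NPROC),
    }


def below_target(limits, target=65536):
    """List the limits (for example "soft nofile") that are lower than target."""
    too_low = []
    for name, (soft, hard) in sorted(limits.items()):
        for kind, value in (("soft", soft), ("hard", hard)):
            # RLIM_INFINITY means "unlimited", which always satisfies the target.
            if value != resource.RLIM_INFINITY and value < target:
                too_low.append("%s %s" % (kind, name))
    return too_low
```

An empty list from below_target(current_limits()) suggests the limits.conf settings are in effect for the current user.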

4. Installation

4.1 Pre-check

  • Execute the following command to start the pre-check
[root@node02 server_dec]# ./db2prereqcheck -l -s
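Requirement not matched for DB2 database "Server". Version: "11.5.4.0".
Summary of prerequisites that are not met on the current system:
   DBT3514W  The db2prereqcheck utility failed to find the following 32-bit library file: "/lib/libpam.so*".

Requirement not matched for DB2 database "Server" with pureScale feature. Version: "11.5.4.0".
Summary of prerequisites that are not met on the current system:
   DBT3613E  The db2prereqcheck utility failed to verify the prerequisites for TSA. Ensure your machine meets all the TSA installation prerequisites.

   DBT3507E  The db2prereqcheck utility failed to find the following package or file: "kernel-source".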

An error appears: the 32-bit library file "/lib/libpam.so*" is missing.

  • To resolve this error, check whether the pam-related dependencies were installed successfully
[root@node02 server_dec]# rpm -qa | grep pam
pam-1.1.8-22.el7.x86_64
[root@node02 server_dec]# rpm -qa | grep pam-devel

pam-devel is not installed, so install it:

[root@node02 server_dec]# yum install -y pam-devel.i686
  • Run the pre-check again
[root@node02 server_dec]# ./db2prereqcheck -l -s
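DBT3533I  The db2prereqcheck utility has confirmed that all installation prerequisites were met.

Requirement not matched for DB2 database "Server" with pureScale feature. Version: "11.5.4.0".
Summary of prerequisites that are not met on the current system:
   DBT3613E  The db2prereqcheck utility failed to verify the prerequisites for TSA. Ensure your machine meets all the TSA installation prerequisites.

   DBT3507E  The db2prereqcheck utility failed to find the following package or file: "kernel-source".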

The two remaining messages in this output, DBT3613E and DBT3507E, relate to the pureScale feature and do not affect the use of DB2. If the output reports any other dependency that is not installed, install it before continuing.
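The rule above (ignore the two pureScale-related messages, fix everything else) can be written down as a small filter over the pre-check output. This is only a sketch, assuming Python 3. The set of tolerated codes comes from this article rather than from IBM documentation, and the names MESSAGE_CODE, TOLERATED, and blocking_codes are made up for the example.

```python
import re

# db2prereqcheck message codes look like DBT3514W: the last letter is the
# severity (I = information, W = warning, E = error).
MESSAGE_CODE = re.compile(r"DBT\d{4}[IWE]")

# Assumption taken from this article: these two messages can be ignored.
TOLERATED = {"DBT3613E", "DBT3507E"}


def blocking_codes(output, tolerated=TOLERATED):
    """Return the warning/error codes that still need attention, in order of appearance."""
    pending = []
    for code in MESSAGE_CODE.findall(output):
        if code.endswith("I"):
            continue  # informational message, nothing to fix
        if code in tolerated or code in pending:
            continue  # tolerated, or already reported once
        pending.append(code)
    return pending
```

Fed the first pre-check output, blocking_codes returns ["DBT3514W"]. Fed the second one, it returns an empty list, which matches the conclusion that the installation can proceed.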

4.2 Add groups and users

Add the groups db2iadm1 and db2fadm1, add the users db2inst1 and db2fenc1, assign each user to its group, and finally set a password for each of the two new users

[root@node02 server_dec]# groupadd -g 2000 db2iadm1
[root@node02 server_dec]# groupadd -g 2001 db2fadm1
[root@node02 server_dec]# useradd -m -g db2iadm1 -d /home/db2inst1 db2inst1
[root@node02 server_dec]# useradd -m -g db2fadm1 -d /home/db2fenc1 db2fenc1
[root@node02 server_dec]# passwd db2inst1
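Changing password for user db2inst1.
New password:
BAD PASSWORD: The password is shorter than 8 characters
Retype new password:
passwd: all authentication tokens updated successfully.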
[root@node02 server_dec]# passwd db2fenc1
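Changing password for user db2fenc1.
New password:
BAD PASSWORD: The password is shorter than 8 characters
Retype new password:
passwd: all authentication tokens updated successfully.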
  • db2inst1: instance owner

  • db2fenc1: fenced user

4.3 Create an instance

  • The default service port of DB2 is 50000
  • Enter the instance directory under the DB2 installation directory
  • Run the db2icrt command to create an instance
  • The message "The execution completed successfully." indicates that the instance was created successfully
[root@node02 ~]# cd /opt/ibm/db2/V11.5/instance
[root@node02 instance]# ./db2icrt -p 50000 -u db2fenc1 db2inst1
DBI1446I  The db2icrt command is running.


DB2 installation is being initialized.

 Total number of tasks to be performed: 4
Total estimated time for all tasks to be performed: 309 second(s)

Task #1 start
Description: Setting default global profile registry variables
Estimated time 1 second(s)
Task #1 end

Task #2 start
Description: Initializing instance list
Estimated time 5 second(s)
Task #2 end

Task #3 start
Description: Configuring DB2 instances
Estimated time 300 second(s)
Task #3 end

Task #4 start
Description: Updating global profile registry
Estimated time 3 second(s)
Task #4 end

The execution completed successfully.

For more information see the DB2 installation log at "/tmp/db2icrt.log.55121".
DBI1070I  Program db2icrt completed successfully.


4.4 Create the sample database and start the service

Create the sample database

  • Switch to the db2inst1 user
  • Enter the instance directory under the DB2 installation directory
  • Run the db2sampl command to create the sample database
[root@node02 instance]# su - db2inst1
Last login: Sat Jan 14 17:07:30 CST 2023 on pts/0
[db2inst1@node02 ~]$ cd /opt/ibm/db2/V11.5/instance/
[db2inst1@node02 instance]$ db2sampl

Note: The db2sampl command automatically creates a database named sample. When the command finishes without errors, the sample database has been created successfully.

  • Start the service

The db2start command starts the DB2 service.

[db2inst1@node02 instance]$ db2start
01/14/2023 17:16:08     0   0   SQL1063N  DB2START processing was successful.
SQL1063N  DB2START processing was successful.

4.5 Connection

  • Enter the interactive environment
[db2inst1@node02 instance]$ db2
(c) Copyright IBM Corporation 1993,2007
Command Line Processor for DB2 Client 11.5.4.0

You can issue database manager commands and SQL statements from the command
prompt. For example:
    db2 => connect to sample
    db2 => bind sample.bnd

For general help, type: ?.
For command help, type: ? command, where command can be
the first few keywords of a database manager command. For example:
 ? CATALOG DATABASE for help on the CATALOG DATABASE command
 ? CATALOG          for help on all of the CATALOG commands.

To exit db2 interactive mode, type QUIT at the command prompt. Outside
interactive mode, all commands must be prefixed with 'db2'.
To list the current command option settings, type LIST COMMAND OPTIONS.

For more detailed help, refer to the Online Reference Manual.

db2 =>
  • Connect to the sample database
db2 => connect to sample

   Database Connection Information

 Database server        = DB2/LINUXX8664 11.5.4.0
 SQL authorization ID   = DB2INST1
 Local database alias   = SAMPLE
 
  • List all tables in the sample database

Note: Do not add a semicolon after the list tables command

db2 => list tables

Table/View                      Schema          Type  Creation time
------------------------------- --------------- ----- --------------------------
ACT                             DB2INST1        T     2023-01-14-17.14.29.830759
ADEFUSR                         DB2INST1        S     2023-01-14-17.14.31.218932
CATALOG                         DB2INST1        T     2023-01-14-17.14.33.002045
CL_SCHED                        DB2INST1        T     2023-01-14-17.14.29.299734
CUSTOMER                        DB2INST1        T     2023-01-14-17.14.32.839163
DEPARTMENT                      DB2INST1        T     2023-01-14-17.14.29.340559
DEPT                            DB2INST1        A     2023-01-14-17.14.29.422197
EMP                             DB2INST1        A     2023-01-14-17.14.29.487346
EMPACT                          DB2INST1        A     2023-01-14-17.14.29.829392
EMPLOYEE                        DB2INST1        T     2023-01-14-17.14.29.423121
EMPMDC                          DB2INST1        T     2023-01-14-17.14.31.348910
EMPPROJACT                      DB2INST1        T     2023-01-14-17.14.29.801185
EMP_ACT                         DB2INST1        A     2023-01-14-17.14.29.830162
EMP_PHOTO                       DB2INST1        T     2023-01-14-17.14.29.488199
EMP_RESUME                      DB2INST1        T     2023-01-14-17.14.29.577152
INVENTORY                       DB2INST1        T     2023-01-14-17.14.32.792380
IN_TRAY                         DB2INST1        T     2023-01-14-17.14.29.888552
ORG                             DB2INST1        T     2023-01-14-17.14.29.914447
PRODUCT                         DB2INST1        T     2023-01-14-17.14.32.707200
PRODUCTSUPPLIER                 DB2INST1        T     2023-01-14-17.14.33.135046
PROJ                            DB2INST1        A     2023-01-14-17.14.29.744731
PROJACT                         DB2INST1        T     2023-01-14-17.14.29.746236
PROJECT                         DB2INST1        T     2023-01-14-17.14.29.670584
PURCHASEORDER                   DB2INST1        T     2023-01-14-17.14.32.919101
SALES                           DB2INST1        T     2023-01-14-17.14.29.959681
STAFF                           DB2INST1        T     2023-01-14-17.14.29.936877
STAFFG                          DB2INST1        T     2023-01-14-17.14.31.033939
STUDENT                         DB2INST1        T     2023-01-14-17.19.57.468544
SUPPLIERS                       DB2INST1        T     2023-01-14-17.14.33.069115
VACT                            DB2INST1        V     2023-01-14-17.14.29.999212
VASTRDE1                        DB2INST1        V     2023-01-14-17.14.30.013130
VASTRDE2                        DB2INST1        V     2023-01-14-17.14.30.016328
VDEPMG1                         DB2INST1        V     2023-01-14-17.14.30.006266
VDEPT                           DB2INST1        V     2023-01-14-17.14.29.983567
VEMP                            DB2INST1        V     2023-01-14-17.14.29.992888
VEMPDPT1                        DB2INST1        V     2023-01-14-17.14.30.009309
VEMPLP                          DB2INST1        V     2023-01-14-17.14.30.046463
VEMPPROJACT                     DB2INST1        V     2023-01-14-17.14.30.004078
VFORPLA                         DB2INST1        V     2023-01-14-17.14.30.032327
VHDEPT                          DB2INST1        V     2023-01-14-17.14.29.990353
VPHONE                          DB2INST1        V     2023-01-14-17.14.30.041997
VPROJ                           DB2INST1        V     2023-01-14-17.14.29.996689
VPROJACT                        DB2INST1        V     2023-01-14-17.14.30.001257
VPROJRE1                        DB2INST1        V     2023-01-14-17.14.30.018638
VPSTRDE1                        DB2INST1        V     2023-01-14-17.14.30.023670
VPSTRDE2                        DB2INST1        V     2023-01-14-17.14.30.028481
VSTAFAC1                        DB2INST1        V     2023-01-14-17.14.30.035293
VSTAFAC2                        DB2INST1        V     2023-01-14-17.14.30.038206

  48 record(s) selected.
  • Adding a semicolon after list tables causes an error (in upper or lower case alike, do not add the semicolon)
db2 => list tables;
SQL0104N  An unexpected token "tables;" was found following "LIST".  Expected
tokens may include:  "ACTIVE".  SQLSTATE=42601
  • Query the staff table in the sample database
db2 => select * from staff limit 2;

ID     NAME      DEPT   JOB   YEARS  SALARY    COMM
------ --------- ------ ----- ------ --------- ---------
    10 Sanders       20 Mgr        7  98357.50         -
    20 Pernal        20 Sales      8  78171.25    612.45

  2 record(s) selected.
  • Create a table and insert data
db2 => CREATE TABLE STUDENT(ID int ,NAME varchar(20));
DB20000I  The SQL command completed successfully.
db2 => INSERT INTO STUDENT VALUES(11, 'lisi');
DB20000I  The SQL command completed successfully.
db2 => commit;
DB20000I  The SQL command completed successfully.

[Figure: data in the STUDENT table]

5. DataX and DB2 import and export examples

The DataX documentation has no DB2-specific reader or writer guide, but it does provide guides for the generic RDBMS plugins (which support all relational databases), and DB2 is covered by them.

The official documentation for the generic RDBMS reader and writer is at:

https://github.com/alibaba/DataX/blob/master/rdbmsreader/doc/rdbmsreader.md
https://github.com/alibaba/DataX/blob/master/rdbmswriter/doc/rdbmswriter.md

[Figure: data in the STUDENT table of the DB2 SAMPLE database]

5.1 Register db2 driver

DataX does not currently have a dedicated DB2 plugin, so you need to use the generic rdbmsreader or rdbmswriter.

How to add support for a new database to rdbmsreader (rdbmswriter works the same way):

  • Go to the rdbmsreader directory, ${DATAX_HOME}/plugin/reader/rdbmsreader, where ${DATAX_HOME} is the DataX home directory
  • The rdbmsreader plugin directory contains a plugin.json configuration file. Register your database driver in this file by adding it to the drivers array. When a job runs, the rdbmsreader plugin dynamically selects the appropriate driver to connect to the database.
  • Register the db2 driver of the reader
[whybigdata@node01 datax]$ vim /opt/module/datax/plugin/reader/rdbmsreader/plugin.json 
# Add the DB2 driver class com.ibm.db2.jcc.DB2Driver to drivers
"drivers":["dm.jdbc.driver.DmDriver", "com.sybase.jdbc3.jdbc.SybDriver", "com.edb.Driver","com.ibm.db2.jcc.DB2Driver"]
  • Register the db2 driver for the writer
[whybigdata@node01 datax]$ vim /opt/module/datax/plugin/writer/rdbmswriter/plugin.json 
# Add the DB2 driver class com.ibm.db2.jcc.DB2Driver to drivers
"drivers":["dm.jdbc.driver.DmDriver", "com.sybase.jdbc3.jdbc.SybDriver", "com.edb.Driver","com.ibm.db2.jcc.DB2Driver"]
  • Place the DB2 JDBC driver jar in the libs directory of both plugins (the db2jcc4.jar version I use is from January 14, 2017)
[whybigdata@node01 libs]$ pwd
/opt/module/datax/plugin/reader/rdbmsreader/libs
[whybigdata@node01 libs]$ ll | grep db2
-rwxr-xr-x 1 whybigdata whybigdata 3528544 Jan 14 19:28 db2jcc4.jar

[whybigdata@node01 libs]$ pwd
/opt/module/datax/plugin/writer/rdbmswriter/libs
[whybigdata@node01 libs]$ ll | grep db2
-rwxr-xr-x 1 whybigdata whybigdata 3528544 Jan 14 19:37 db2jcc4.jar

Note: If the examples below fail to read from DB2 even though the DB2 connection works and the JSON file is correct, replace db2jcc4.jar with a newer version.
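The plugin.json edit described above can also be scripted, which avoids typos when both the reader and the writer need the same change. The following is a minimal sketch, assuming Python 3 and that plugin.json is plain JSON (the "#" comment line shown earlier is an explanation and should not be written into the file). The function name register_driver is hypothetical.

```python
import json

DB2_DRIVER = "com.ibm.db2.jcc.DB2Driver"


def register_driver(plugin_json_path, driver=DB2_DRIVER):
    """Add driver to the "drivers" array of a plugin.json file.

    Returns True if the file was changed, False if the driver was already listed.
    """
    with open(plugin_json_path, encoding="utf-8") as handle:
        config = json.load(handle)

    drivers = config.setdefault("drivers", [])
    if driver in drivers:
        return False  # already registered, leave the file untouched

    drivers.append(driver)
    with open(plugin_json_path, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=4, ensure_ascii=False)
        handle.write("\n")
    return True
```

With the paths used in this article, it would be called once for /opt/module/datax/plugin/reader/rdbmsreader/plugin.json and once for /opt/module/datax/plugin/writer/rdbmswriter/plugin.json. Running it twice is harmless because the second call changes nothing.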

5.2 Import DB2 data into HDFS

Write the configuration file from the DataX home directory:

[whybigdata@node01 datax]$ vim job/db2-2-hdfs.json
  • The content of the file is as follows
{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "rdbmsreader",
                    "parameter": {
                        "column": [
                            "ID",
                            "NAME"
                        ],
                        "connection": [
                            {
                                "jdbcUrl": [
                                    "jdbc:db2://node02:50000/SAMPLE"
                                ],
                                "table": [
                                    "STUDENT"
                                ]
                            }
                        ],
                        "username": "db2inst1",
                        "password": "123456"
                    }
                },
                "writer": {
                    "name": "hdfswriter",
                    "parameter": {
                        "column": [
                            {
                                "name": "id",
                                "type": "int"
                            },
                            {
                                "name": "name",
                                "type": "string"
                            }
                        ],
                        "defaultFS": "hdfs://node01:8020",
                        "fieldDelimiter": "-",
                        "fileName": "db2.txt",
                        "fileType": "text",
                        "path": "/datax-out",
                        "writeMode": "append"
                    }
                }
            }
        ],
        "setting": {
            "speed": {
                "channel": "1"
            }
        }
    }
}

Run the job:

[whybigdata@node01 datax]$ bin/datax.py job/db2-2-hdfs.json

Final result:

[Figure: result of the DB2-to-HDFS job]
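A job file like the one above is easy to break with a stray comma or a column list that does not line up. Before starting DataX, a quick sanity check can catch the most common slips. This is a rough sketch, assuming Python 3. As far as I know DataX expects the reader and writer to declare the same number of columns, but the checks below are my own and are not part of DataX. The names load_job and find_problems are illustrative.

```python
import json


def load_job(path):
    """Load a DataX job file; json.load raises ValueError on malformed JSON."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def find_problems(job):
    """Return a list of human-readable problems found in a job dict."""
    problems = []
    for index, entry in enumerate(job.get("job", {}).get("content", [])):
        reader = entry.get("reader", {}).get("parameter", {})
        writer = entry.get("writer", {}).get("parameter", {})
        reader_cols = reader.get("column", [])
        writer_cols = writer.get("column", [])

        if not reader_cols:
            problems.append("content[%d]: reader has no columns" % index)
        # "*" stands for all columns, so counts cannot be compared in that case.
        elif ("*" not in reader_cols and "*" not in writer_cols
                and len(reader_cols) != len(writer_cols)):
            problems.append("content[%d]: reader has %d columns, writer has %d"
                            % (index, len(reader_cols), len(writer_cols)))

        # The rdbmsreader connection lists one or more JDBC URLs.
        for connection in reader.get("connection", []):
            for url in connection.get("jdbcUrl", []):
                if not url.startswith("jdbc:"):
                    problems.append("content[%d]: suspicious jdbcUrl %r" % (index, url))
    return problems
```

For job/db2-2-hdfs.json as written above, find_problems(load_job("job/db2-2-hdfs.json")) should return an empty list, since both sides declare two columns.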

5.3 Read DB2 data into MySQL

Write the configuration file from the DataX home directory:

[whybigdata@node01 datax]$ vim job/db2-2-mysql.json
  • The content of the file is as follows
{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "rdbmsreader",
                    "parameter": {
                        "column": [
                            "ID",
                            "NAME"
                        ],
                        "connection": [
                            {
                                "jdbcUrl": [
                                    "jdbc:db2://node02:50000/SAMPLE"
                                ],
                                "table": [
                                    "STUDENT"
                                ]
                            }
                        ],
                        "username": "db2inst1",
                        "password": "123456"
                    }
                },
                "writer": {
                    "name": "mysqlwriter",
                    "parameter": {
                        "column": ["*"],
                        "connection": [
                            {
                                "jdbcUrl": "jdbc:mysql://node01:3306/datax",
                                "table": ["student"]
                            }
                        ],
                        "password": "123456",
                        "username": "root",
                        "writeMode": "insert"
                    }
                }
            }
        ],
        "setting": {
            "speed": {
                "channel": "1"
            }
        }
    }
}

Run the job:

[whybigdata@node01 datax]$ bin/datax.py job/db2-2-mysql.json

Final result:

  • Before importing into MySQL:

[Figure: the MySQL student table before the import]

  • After importing into MySQL:

[Figure: the MySQL student table after the import]
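The writer in this job uses "column": ["*"], which works as long as the MySQL table has exactly the columns the reader sends, in the same order. If the target table changes later, naming the columns explicitly is more robust. The sketch below, assuming Python 3, rewrites a loaded job dict accordingly. It assumes the reader columns are plain strings and that the MySQL columns use the same names in lower case. Both are assumptions of this example, and with_explicit_writer_columns is a made-up helper name.

```python
import copy


def with_explicit_writer_columns(job, lowercase=True):
    """Return a copy of the job where a writer column list of ["*"] is
    replaced by the reader's column names."""
    result = copy.deepcopy(job)  # never modify the caller's dict
    for entry in result["job"]["content"]:
        writer_params = entry["writer"]["parameter"]
        if writer_params.get("column") == ["*"]:
            names = entry["reader"]["parameter"]["column"]
            writer_params["column"] = [
                name.lower() if lowercase else name for name in names
            ]
    return result
```

Applied to the job above, the writer column list becomes ["id", "name"], and the result can be written back to job/db2-2-mysql.json with json.dump.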

Finish!

Origin blog.csdn.net/m0_52735414/article/details/128846776