mysql集群&mycat

MySQL 主从复制的形式

  • 一主一从
  • 主主复制
  • 一主多从
  • 多主一从
  • 级联复制

一主一从

节点规划

主机 ip 角色
yfm01 192.168.199.101 MASTER
yfm02 192.168.199.102 SLAVE
MYSQL 5.7.21
CENTOS CentOS Linux release 7.9.2009 (Core)
查看mysql版本
select version();
查看centos版本
cat /etc/centos-release

192.168.199.101,/etc/my.cnf
[mysqld]
#开启二进制日志
log-bin=mysql-bin 
#设置server-id,建议使用ip最后3位
server-id=101

192.168.199.102,/etc/my.cnf
[mysqld]
#开启中继日志
relay-log=mysql-relay
#设置server-id,建议使用ip最后3位
server-id=102

2台服务器mysql服务重启

在master建立账户并且授权slave
mysql> GRANT REPLICATION SLAVE ON *.* TO 'root'@'192.168.199.102' IDENTIFIED BY '123456';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

查看master状态
mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000001 |      603 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)


告知从服务器二进制文件名与位置

mysql> CHANGE MASTER TO master_host = '192.168.199.101',master_user = 'root',master_password = '123456',master_log_file = 'mysql-bin.000001',master_log_pos = 603;
Query OK, 0 rows affected, 2 warnings (0.00 sec)

mysql> FLUSH  PRIVILEGES;
Query OK, 0 rows affected (0.00 sec)

开启复制
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)

查看主从复制是否配置成功
mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.199.101
                  Master_User: root
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 603
               Relay_Log_File: mysql-relay.000002
                Relay_Log_Pos: 320
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
  
当看到Slave_IO_State:Waiting for master ot send event 、Slave_IO_Running: YES、Slave_SQL_Running: YES才表明状态正常


验证主从复制是否成功
在Master中创建数据库并创建数据表并插入一条数据
mysql> create database test;
Query OK, 1 row affected (0.00 sec)
mysql> use test;
Database changed
mysql> create table tab1(id int auto_increment,name varchar(10),primary key(id));
Query OK, 0 rows affected (0.00 sec)
mysql> insert into tab1(id,name) values (1,'why');
Query OK, 1 row affected (0.03 sec)

在slave中查询这条数据
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
| test               |
+--------------------+
5 rows in set (0.00 sec)

mysql> use test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> select * from test;
ERROR 1146 (42S02): Table 'test.test' doesn't exist
mysql> select * from tab1;
+----+------+
| id | name |
+----+------+
|  1 | why  |
+----+------+
1 row in set (0.00 sec)


常用语句
show master status: 查看master的状态,尤其是当前的日志及位置
show slave status 查看slave的状态
reset slave 重置slave状态
start slave 启动slave状态
stop slave 暂停slave状态
mycat读写分离

实现MYSQL读写分离可以基于第三方插件,也可以通过开发修改代码实现,具体实现的读写分离的常见方式有如下四种:

  • Amoeba读写分离;
  • MySQL-Proxy读写分离;
  • Mycat读写分离;
  • 基于程序读写分离(效率很高,实施难度大,开发改代码);

节点规划

主机名 ip 角色
yfm01 192.168.199.101 写数据库
yfm02 192.168.199.102 从数据库
yfm03 192.168.199.103 myacat
下载mycat安装包
mkdir -p /data/mycat
cd /data/mycat && wget http://dl.mycat.org.cn/1.6.7.4/Mycat-server-1.6.7.4-test-20191113141017-linux.tar.gz
tar -zxvf Mycat-server-1.6.7.4-test-20191113141017-linux.tar.gz
[root@yfm03 mycat]# groupadd mycat
[root@yfm03 mycat]# useradd -r -g mycat mycat
[root@yfm03 mycat]# chown -R mycat.mycat /data/mycat/mycat

设置环境变量
echo 'export MYCAT_HOME=/data/mycat/mycat' >> /etc/profile
echo 'export PATH=$MYCAT_HOME/bin:$PATH' >> /etc/profile

source /etc/profile

修改日志级别
vi /data/mycat/mycat/conf/log4j2.xml
添加 <AsyncLogger name="io.mycat" level="debug"...> 这个节点,表示对 io.mycat 包的日志级别为 debug

<Loggers>
    <AsyncLogger name="io.mycat" level="debug" includeLocation="true" additivity="false">
        <!--<AppenderRef ref="Console"/>-->
        <AppenderRef ref="RollingFile"/>
    </AsyncLogger>
    <asyncRoot level="info" includeLocation="true">

        <!--<AppenderRef ref="Console" />-->
        <AppenderRef ref="RollingFile"/>

    </asyncRoot>
</Loggers>

修改server.xml、shcema.xml详细配置在文章末尾

启动mycat
mycat start

检查mycat是否成功启动
ps -ef|grep mycat

查看mycat日志
[root@yfm03 logs]# cat NFO   | jvm 1    | 2021/01/22 10:10:01 | Caused by: io.mycat.config.util.ConfigException: Illegal table conf : table [ TAB1 ] rule function [ rang-long ] partition size : 3 > table datanode size : 1, please make sure table datanode size = function partition size

可以看到mycat启动失败了
修改/data/mycat/mycat/conf/rule.xml
<function name="mod-long" class="io.mycat.route.function.PartitionByMod">
	<!-- how many data nodes -->
	<property name="count">1</property>
</function>

修改/data/mycat/mycat/conf/autopartition-long.txt
#默认是三个,我们需要删除最后两个,不然就会报错,说节点少了
#K=1000条记录,M=10000条记录,那么下面三个配置就是0~500万的记录会存在数据库节点1的表中,500万~1000万会存在节点2的表中
# range start-end ,data node index
# K=1000,M=10000.
0-500M=0
# 500M-1000M=1
# 1000M-1500M=2

ERROR  | wrapper  | 2021/01/22 10:38:40 | Startup failed: Timed out waiting for a signal from the JVM.
ERROR  | wrapper  | 2021/01/22 10:38:40 | JVM did not exit on request, terminated
修改/data/mycat/mycat/conf/wrapper.cof
添加wrapper.startup.timeout=300

测试
在yfm01上连接mycat
mysql -uroot -p123456 -P8066 -h192.168.199.103

测试读:初始时master、slave,test库中tab1表有一条why的数据,直接在slave库中将该只修改为why3
[root@yfm02 ~]# mysql -umycat -P8066 -h192.168.199.103 -p
Enter password: 
mysql> select * from tab1;
+----+------+
| id | name |
+----+------+
|  1 | why3 |
+----+------+
1 row in set (0.06 sec)
可以看到通过mycat查询出来的值为why3,主库此时值为why
查看mycat日志cat /data/mycat/mycat/logs/mycat.log 
2021-01-22 15:50:12.399 DEBUG [$_NIOREACTOR-0-RW] (io.mycat.server.NonBlockingSession.releaseConnection(NonBlockingSession.java:386)) - release connection MySQLConnection@567426715 [id=12, lastTime=1611301812387, user=root, schema=test, old shema=test, borrowed=true, fromSlaveDB=true, threadId=1390, charset=utf8, txIsolation=3, autocommit=true, attachment=dn1{SELECT *
FROM tab1
LIMIT 100}, respHandler=SingleNodeHandler [node=dn1{SELECT *
FROM tab1
LIMIT 100}, packetId=6], host=192.168.199.102, port=3306, statusSync=null, writeQueue=0, modifiedSQLExecuted=false]
读操作都被分派到readhost



测试写:插入一条数据看从库是否同步过去
查看mycat日志cat /data/mycat/mycat/logs/mycat.log 
2021-01-22 15:48:00.593 DEBUG [$_NIOREACTOR-0-RW] (io.mycat.backend.datasource.PhysicalDBNode.getConnection(PhysicalDBNode.java:133)) - rrs.getRunOnSlave()  default 
2021-01-22 15:48:00.601 DEBUG [$_NIOREACTOR-0-RW] (io.mycat.server.NonBlockingSession.releaseConnection(NonBlockingSession.java:386)) - release connection MySQLConnection@1449597107 [id=7, lastTime=1611301680585, user=root, schema=test, old shema=test, borrowed=true, fromSlaveDB=false, threadId=1580, charset=utf8, txIsolation=3, autocommit=true, attachment=dn1{INSERT INTO tab1 (id, name)
VALUES ('2', 'yfm02')}, respHandler=SingleNodeHandler [node=dn1{INSERT INTO tab1 (id, name)
VALUES ('2', 'yfm02')}, packetId=1], host=192.168.199.101, port=3306, statusSync=null, writeQueue=0, modifiedSQLExecuted=true]
可以看到写入操作执行在master,且slave也已经正常同步

server.xml

<?xml version="1.0" encoding="UTF-8"?>
<!-- - - Licensed under the Apache License, Version 2.0 (the "License"); 
	- you may not use this file except in compliance with the License. - You 
	may obtain a copy of the License at - - http://www.apache.org/licenses/LICENSE-2.0 
	- - Unless required by applicable law or agreed to in writing, software - 
	distributed under the License is distributed on an "AS IS" BASIS, - WITHOUT 
	WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - See the 
	License for the specific language governing permissions and - limitations 
	under the License. -->
<!DOCTYPE mycat:server SYSTEM "server.dtd">
<mycat:server xmlns:mycat="http://io.mycat/">
	<system>
	<property name="nonePasswordLogin">0</property> <!-- 0为需要密码登陆、1为不需要密码登陆 ,默认为0,设置为1则需要指定默认账户-->
	<property name="useHandshakeV10">1</property>
	<property name="useSqlStat">0</property>  <!-- 1为开启实时统计、0为关闭 -->
	<property name="useGlobleTableCheck">0</property>  <!-- 1为开启全加班一致性检测、0为关闭 -->
		<property name="sqlExecuteTimeout">300</property>  <!-- SQL 执行超时 单位:秒-->
		<property name="sequnceHandlerType">5</property>
		<!--<property name="sequnceHandlerPattern">(?:(\s*next\s+value\s+for\s*MYCATSEQ_(\w+))(,|\)|\s)*)+</property>-->
		<!--必须带有MYCATSEQ_或者 mycatseq_进入序列匹配流程 注意MYCATSEQ_有空格的情况-->
		<property name="sequnceHandlerPattern">(?:(\s*next\s+value\s+for\s*MYCATSEQ_(\w+))(,|\)|\s)*)+</property>
	<property name="subqueryRelationshipCheck">false</property> <!-- 子查询中存在关联查询的情况下,检查关联字段中是否有分片字段 .默认 false -->
	<property name="sequenceHanlderClass">io.mycat.route.sequence.handler.HttpIncrSequenceHandler</property>
      <!--  <property name="useCompression">1</property>--> <!--1为开启mysql压缩协议-->
        <!--  <property name="fakeMySQLVersion">5.6.20</property>--> <!--设置模拟的MySQL版本号-->
	<!-- <property name="processorBufferChunk">40960</property> -->
	<!-- 
	<property name="processors">1</property> 
	<property name="processorExecutor">32</property> 
	 -->
        <!--默认为type 0: DirectByteBufferPool | type 1 ByteBufferArena | type 2 NettyBufferPool -->
		<property name="processorBufferPoolType">0</property>
		<!--默认是65535 64K 用于sql解析时最大文本长度 -->
		<!--<property name="maxStringLiteralLength">65535</property>-->
		<!--<property name="sequnceHandlerType">0</property>-->
		<!--<property name="backSocketNoDelay">1</property>-->
		<!--<property name="frontSocketNoDelay">1</property>-->
		<!--<property name="processorExecutor">16</property>-->
		<!--
			<property name="serverPort">8066</property> <property name="managerPort">9066</property> 
			<property name="idleTimeout">300000</property> <property name="bindIp">0.0.0.0</property>
			<property name="dataNodeIdleCheckPeriod">300000</property> 5 * 60 * 1000L; //连接空闲检查
			<property name="frontWriteQueueSize">4096</property> <property name="processors">32</property> -->
		<!--分布式事务开关,0为不过滤分布式事务,1为过滤分布式事务(如果分布式事务内只涉及全局表,则不过滤),2为不过滤分布式事务,但是记录分布式事务日志-->
		<property name="handleDistributedTransactions">0</property>
		
			<!--
			off heap for merge/order/group/limit      1开启   0关闭
		-->
		<property name="useOffHeapForMerge">0</property>

		<!--
			单位为m
		-->
        <property name="memoryPageSize">64k</property>

		<!--
			单位为k
		-->
		<property name="spillsFileBufferSize">1k</property>

		<property name="useStreamOutput">0</property>

		<!--
			单位为m
		-->
		<property name="systemReserveMemorySize">384m</property>


		<!--是否采用zookeeper协调切换  -->
		<property name="useZKSwitch">false</property>

		<!-- XA Recovery Log日志路径 -->
		<!--<property name="XARecoveryLogBaseDir">./</property>-->

		<!-- XA Recovery Log日志名称 -->
		<!--<property name="XARecoveryLogBaseName">tmlog</property>-->
		<!--如果为 true的话 严格遵守隔离级别,不会在仅仅只有select语句的时候在事务中切换连接-->
		<property name="strictTxIsolation">false</property>
		
		<property name="useZKSwitch">true</property>
		
	</system>
	
	<!-- 全局SQL防火墙设置 -->
	<!--白名单可以使用通配符%或着*-->
	<!--例如<host host="127.0.0.*" user="root"/>-->
	<!--例如<host host="127.0.*" user="root"/>-->
	<!--例如<host host="127.*" user="root"/>-->
	<!--例如<host host="1*7.*" user="root"/>-->
	<!--这些配置情况下对于127.0.0.1都能以root账户登录-->
	<!--
	<firewall>
	   <whitehost>
	      <host host="1*7.0.0.*" user="root"/>
	   </whitehost>
       <blacklist check="false">
       </blacklist>
	</firewall>
	-->

	<user name="mycat" defaultAccount="true">
		<property name="password">mycat</property>
		<property name="schemas">testDB</property>
		<property name="defaultSchema">testDB</property>
		<!--No MyCAT Database selected 错误前会尝试使用该schema作为schema,不设置则为null,报错 -->
		
		<!-- 表级 DML 权限设置 -->
		<!-- 		
		<privileges check="false">
			<schema name="TESTDB" dml="0110" >
				<table name="tb01" dml="0000"></table>
				<table name="tb02" dml="1111"></table>
			</schema>
		</privileges>		
		 -->
	</user>
	
	<user name="user">
		<property name="password">user</property>
		<property name="schemas">testDB</property>
		<property name="readOnly">true</property>
		<property name="defaultSchema">testDB</property>
	</user>

</mycat:server>

schema.xml

<?xml version="1.0"?>
<!DOCTYPE mycat:schema SYSTEM "schema.dtd">
<mycat:schema xmlns:mycat="http://io.mycat/">

	<schema name="testDB" checkSQLschema="true" sqlMaxLimit="100">
		<!-- auto sharding by id (long) -->
		<!--splitTableNames 启用<table name 属性使用逗号分割配置多个表-->
		<table name="tab1" dataNode="dn1" rule="auto-sharding-long" splitTableNames ="true"/>
		<!-- <table name="oc_call" primaryKey="ID" dataNode="dn1$0-743" rule="latest-month-calldate"
			/> -->
	</schema>
	<!-- <dataNode name="dn1$0-743" dataHost="localhost1" database="db$0-743"
		/> -->
	<dataNode name="dn1" dataHost="localhost1" database="test" />
	<!--<dataNode name="dn4" dataHost="sequoiadb1" database="SAMPLE" />
	 <dataNode name="jdbc_dn1" dataHost="jdbchost" database="db1" />
	<dataNode	name="jdbc_dn2" dataHost="jdbchost" database="db2" />
	<dataNode name="jdbc_dn3" 	dataHost="jdbchost" database="db3" /> -->
	<dataHost name="localhost1" maxCon="1000" minCon="10" balance="1"
			  writeType="0" dbType="mysql" dbDriver="native" switchType="1"  slaveThreshold="100">
		<heartbeat>select user()</heartbeat>
		<!-- can have multi write hosts -->
		<writeHost host="hostM1" url="192.168.199.101:3306" user="root"
				   password="123456">
				   <readHost host="hostS1" url="192.168.199.102:3306" user="root" password="123456" />
		</writeHost>
		<!-- <writeHost host="hostM2" url="localhost:3316" user="root" password="123456"/> -->
	</dataHost>
	<!--
		<dataHost name="sequoiadb1" maxCon="1000" minCon="1" balance="0" dbType="sequoiadb" dbDriver="jdbc">
		<heartbeat> 		</heartbeat>
		 <writeHost host="hostM1" url="sequoiadb://1426587161.dbaas.sequoialab.net:11920/SAMPLE" user="jifeng" 	password="jifeng"></writeHost>
		 </dataHost>

	  <dataHost name="oracle1" maxCon="1000" minCon="1" balance="0" writeType="0" 	dbType="oracle" dbDriver="jdbc"> <heartbeat>select 1 from dual</heartbeat>
		<connectionInitSql>alter session set nls_date_format='yyyy-mm-dd hh24:mi:ss'</connectionInitSql>
		<writeHost host="hostM1" url="jdbc:oracle:thin:@127.0.0.1:1521:nange" user="base" 	password="123456" > </writeHost> </dataHost>

		<dataHost name="jdbchost" maxCon="1000" 	minCon="1" balance="0" writeType="0" dbType="mongodb" dbDriver="jdbc">
		<heartbeat>select 	user()</heartbeat>
		<writeHost host="hostM" url="mongodb://192.168.0.99/test" user="admin" password="123456" ></writeHost> </dataHost>

		<dataHost name="sparksql" maxCon="1000" minCon="1" balance="0" dbType="spark" dbDriver="jdbc">
		<heartbeat> </heartbeat>
		 <writeHost host="hostM1" url="jdbc:hive2://feng01:10000" user="jifeng" 	password="jifeng"></writeHost> </dataHost> -->

	<!-- <dataHost name="jdbchost" maxCon="1000" minCon="10" balance="0" dbType="mysql"
		dbDriver="jdbc"> <heartbeat>select user()</heartbeat> <writeHost host="hostM1"
		url="jdbc:mysql://localhost:3306" user="root" password="123456"> </writeHost>
		</dataHost> -->
</mycat:schema>

schema配置解析:

writeType:负载均衡类型,目前的取值有 3 种
writeType="0", 所有写操作发送到配置的第一个 writeHost,第一个挂了切到还生存的第二个writeHost,重新启动后已切换后的为准,切换记录在配置文件中:dnindex.properties .
writeType="1",所有写操作都随机的发送到配置的 writeHost。
writeType=“2",没实现。

balance:属性负载均衡类型,目前的取值有 4 种:
balance="0", 不开启读写分离机制,所有读操作都发送到当前可用的writeHost 上。
balance="1",全部的 readHost 与 stand by writeHost 参与 select 语句的负载均衡,简单的说,当双主双从模式(M1 ->S1 , M2->S2,并且 M1 与 M2 互为主备),正常情况下, M2,S1,S2 都参与 select 语句的负载均衡。
balance="2",所有读操作都随机的在 writeHost、 readhost 上分发。
balance="3", 所有读请求随机的分发到 wiriterHost 对应的 readhost 执行,writerHost 不负担读压力,如果从服务器都挂了,过一段延迟之后,自动读主服务器。注意 balance=3 只在 1.4 及其以后版本有, 1.3 没有。

switchType
-1 表示不自动切换
1 默认值,自动切换。
2 基于MySQL 主从同步的状态决定是否切换 心跳语句为 show slave status
3 基于MySQLgalarycluster的切换机制(适合集群)(1.4.1)心跳语句为 show status like ‘wsrep%’

总结:
常见坑位1:此配置在主机宕机后,msyql数据库就瘫痪了。
解决:可以把其中一台从机分配成writeHost,再第一个writeHost(主机)宕机后自动切换至第二个writeHost。

常见坑位2:此时出现了新的问题,读数据可以,写数据也可以。在把第一个wirteHost启动后,新添加的数据并没有保存到主机中而是报存到了第二个writeHost(从机中)。
解决:官网表示讨论了很久,建议重新手工配置主从复制关系。原从配成现主,原主作为现从加入到集群中。

参考:https://blog.csdn.net/zh15732621679/article/details/78816338
这里配置了互为主从
主主复制

节点规划

主机 ip 角色
yfm01 192.168.199.101 MASTER、SLAVE
yfm02 192.168.199.102 SLAVE、MASTER
MySQL 5.7.21
CentOS CentOS Linux release 7.9.2009 (Core)
192.168.199.101,/etc/my.cnf
[mysqld]
server-id=101
log-bin=mysql-bin
auto-increment-increment=2
auto-increment-offset=1
log-slave-updates

192.168.199.102,/etc/my.cnf
[mysqld]
server-id=102
log-bin=mysql-bin
auto-increment-increment=2
auto-increment-offset=2
log-slave-updates


2台服务器mysql服务重启

添加主从同步账户
在yfm01:
mysql> GRANT REPLICATION SLAVE ON *.* TO 'root'@'192.168.199.102' IDENTIFIED BY '123456';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

在yfm02:
mysql> GRANT REPLICATION SLAVE ON *.* TO 'root'@'192.168.199.101' IDENTIFIED BY '123456';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

查看master状态
在yfm01:
mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000003 |      154 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)

在yfm02:
mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000001 |      624 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)

配置同步信息:
在yfm01上:
mysql> change master to master_host='192.168.199.102',master_port=3306,master_user='root',master_password='123456',master_log_file='mysql-bin.000001',master_log_pos=624;
Query OK, 0 rows affected, 2 warnings (0.01 sec)

mysql> start slave;
Query OK, 0 rows affected (0.01 sec)

mysql> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.199.102
                  Master_User: root
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 624
               Relay_Log_File: yfm01-relay-bin.000002
                Relay_Log_Pos: 320
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
              
当看到Slave_IO_State:Waiting for master ot send event 、Slave_IO_Running: YES、Slave_SQL_Running: YES才表明状态正常

在yfm02上:
mysql> change master to master_host='192.168.199.101',master_port=3306,master_user='root',master_password='123456',master_log_file='mysql-bin.000003',master_log_pos=154;
出现报错start slave;ERROR 1872 (HY000): Slave failed to initialize relay log info structure from the repository

查看mysql日志
2021-01-21T22:15:40.930581Z 0 [ERROR] Failed to open the relay log './mysql-relay.000005' (relay_log_pos 367).
2021-01-21T22:15:40.930587Z 0 [ERROR] Could not find target log file mentioned in relay log info in the index file './yfm02-relay-bin.index' during relay log initialization.
2021-01-21T22:15:40.930715Z 0 [ERROR] Slave: Failed to initialize the master info structure for channel ''; its record may still be present in 'mysql.slave_master_info' table, consider deleting it.

通过查找资料需要执行reset slave;

mysql> reset slave;
Query OK, 0 rows affected (0.01 sec)

mysql> change master to master_host='192.168.199.101',master_port=3306,master_user='root',master_password='123456',master_log_file='mysql-bin.000003',master_log_pos=154;
Query OK, 0 rows affected, 2 warnings (0.01 sec)

mysql> start slave;
Query OK, 0 rows affected (0.01 sec)

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.199.101
                  Master_User: root
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000003
          Read_Master_Log_Pos: 154
               Relay_Log_File: yfm02-relay-bin.000002
                Relay_Log_Pos: 320
        Relay_Master_Log_File: mysql-bin.000003
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 

验证测试
分别在yfm01、yfm02上新增或者修改数据看能否同步即可,这里省略了

常用语句
show master status: 查看master的状态,尤其是当前的日志及位置
show slave status 查看slave的状态
reset slave 重置slave状态
start slave 启动slave状态
stop slave 暂停slave状态

MySQL的高可用方案一般有如下几种:

keepalived+双主,MHA,PXC,MMM,Heartbeat+DRBD等,比较常用的是keepalived+双主,MHA和PXC。

keepalived主从 + mysql双主

节点规划

主机 ip 角色
yfm01 192.168.199.101 MASTER
yfm02 192.168.199.102 BACKUP
keepalived keepalived-1.3.5-19.el7.x86_64
MySQL 5.7.21
CentOS CentOS Linux release 7.9.2009 (Core)

查看keepalived版本 rpm -qa|grep keepalived

keepalived和mysql的安装参考之前的文章

yfm01,/etc/keepalived/keepalived.conf(下面配置中没有使用lvs的负载均衡功能,所以不需要配置虚拟服务器virtual server)

! Configuration File for keepalived
       
global_defs {
router_id MASTER
}
       
vrrp_script chk_mysql_port {     #检测mysql服务是否在运行。有很多方式,比如进程,用脚本检测等等
    script "/etc/keepalived/chk_mysql.sh"   #这里通过脚本监测
    interval 2                   #脚本执行间隔,每2s检测一次
    weight -5                    #脚本结果导致的优先级变更,检测失败(脚本返回非0)则优先级 -5
    fall 2                    #检测连续2次失败才算确定是真失败。会用weight减少优先级(1-255之间)
    rise 1                    #检测1次成功就算成功。但不修改优先级
}
       
vrrp_instance VI_1 {
    state MASTER    
    interface ens33      #指定虚拟ip的网卡接口
    virtual_router_id 51    #路由器标识,MASTER和BACKUP必须是一致的
    priority 101            #定义优先级,数字越大,优先级越高,在同一个vrrp_instance下,MASTER的优先级必须大于BACKUP的优先级。这样MASTER故障恢复后,就可以将VIP资源再次抢回来 
    advert_int 1         
    authentication {   
        auth_type PASS 
        auth_pass 1111     
    }
    virtual_ipaddress {    
        192.168.199.99
    }
      
track_script {               
   chk_mysql_port             
}
}

yfm02,/etc/keepalived/keepalived.conf(nopreempt不设置)

! Configuration File for keepalived
       
global_defs {
router_id BACKUP
}
       
vrrp_script chk_mysql_port {     #检测mysql服务是否在运行。有很多方式,比如进程,用脚本检测等等
    script "/etc/keepalived/chk_mysql.sh"   #这里通过脚本监测
    interval 2                   #脚本执行间隔,每2s检测一次
    weight -5                    #脚本结果导致的优先级变更,检测失败(脚本返回非0)则优先级 -5
    fall 2                    #检测连续2次失败才算确定是真失败。会用weight减少优先级(1-255之间)
    rise 1                    #检测1次成功就算成功。但不修改优先级
}
       
vrrp_instance VI_1 {
    state BACKUP    
    interface ens33      #指定虚拟ip的网卡接口
    virtual_router_id 51    #路由器标识,MASTER和BACKUP必须是一致的
    priority 101            #定义优先级,数字越大,优先级越高,在同一个vrrp_instance下,MASTER的优先级必须大于BACKUP的优先级。这样MASTER故障恢复后,就可以将VIP资源再次抢回来 
    advert_int 1         
    authentication {   
        auth_type PASS 
        auth_pass 1111     
    }
    virtual_ipaddress {    
        192.168.199.99
    }
      
track_script {               
   chk_mysql_port             
}
}

/etc/keepalived/chk_mysql.sh

#!/bin/bash
counter=$(netstat -na|grep "LISTEN"|grep "3306"|wc -l)
if [ "${counter}" -eq 0 ]; then
    /etc/init.d/keepalived stop
fi

chmod +x /etc/keepalived/chk_mysql.sh

验证测试
通过Mysql客户端通过VIP连接,看是否连接成功
[root@yfm01 data]# mysql -h192.168.199.99 -uroot -p
Enter password: 
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
| test               |
+--------------------+
5 rows in set (0.01 sec)

mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed

mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| tab1           |
+----------------+
1 row in set (0.00 sec)

mysql> select * from tab1;
+----+-------+
| id | name  |
+----+-------+
|  1 | why   |
|  2 | yfm02 |
|  3 | yfm04 |
|  5 | yfm05 |
+----+-------+
4 rows in set (0.00 sec)

默认情况下,vip是在master1上的。使用"ip addr"命令查看vip切换情况 
[root@yfm01 data]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:7b:30:82 brd ff:ff:ff:ff:ff:ff
    inet 192.168.199.101/24 brd 192.168.199.255 scope global noprefixroute ens33
       valid_lft forever preferred_lft forever
    inet 192.168.199.99/32 scope global ens33
       valid_lft forever preferred_lft forever
    inet6 fe80::be4f:8aab:e347:6e66/64 scope link tentative noprefixroute dadfailed 
       valid_lft forever preferred_lft forever
    inet6 fe80::fca9:7d90:fd6:d63a/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
    inet6 fe80::bf02:a10:c932:6672/64 scope link tentative noprefixroute dadfailed 
       valid_lft forever preferred_lft forever

停止master1机器上的mysql服务,根据配置中的脚本,mysql服务停了,keepalived也会停,从而vip资源将会切换到master2机器上。(mysql服务没有起来的时候,keepalived服务也无法顺利启动!)
[root@yfm01 keepalived]# service mysql stop
Shutting down MySQL........... SUCCESS! 
[root@yfm01 keepalived]# ps -ef|grep keepalived
root       8967   6853  0 21:14 pts/2    00:00:00 grep --color=auto keepalived
[root@yfm01 keepalived]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:7b:30:82 brd ff:ff:ff:ff:ff:ff
    inet 192.168.199.101/24 brd 192.168.199.255 scope global noprefixroute ens33
       valid_lft forever preferred_lft forever
    inet6 fe80::be4f:8aab:e347:6e66/64 scope link tentative noprefixroute dadfailed 
       valid_lft forever preferred_lft forever
    inet6 fe80::fca9:7d90:fd6:d63a/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
    inet6 fe80::bf02:a10:c932:6672/64 scope link tentative noprefixroute dadfailed 
       valid_lft forever preferred_lft forever
如上结果,发现32位子网掩码的vip没有了,说明此时vip资源已不在master1机器上了

再次启动master1的mysql和keepalived服务,会发现vip资源再次从yfm022机器上切换回来了
[root@yfm01 keepalived]# service mysql start
Starting MySQL. SUCCESS! 
[root@yfm01 keepalived]# systemctl start keepalived
[root@yfm01 keepalived]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:7b:30:82 brd ff:ff:ff:ff:ff:ff
    inet 192.168.199.101/24 brd 192.168.199.255 scope global noprefixroute ens33
       valid_lft forever preferred_lft forever
    inet 192.168.199.99/32 scope global ens33
       valid_lft forever preferred_lft forever
    inet6 fe80::be4f:8aab:e347:6e66/64 scope link tentative noprefixroute dadfailed 
       valid_lft forever preferred_lft forever
    inet6 fe80::fca9:7d90:fd6:d63a/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
    inet6 fe80::bf02:a10:c932:6672/64 scope link tentative noprefixroute dadfailed 
       valid_lft forever preferred_lft forever

优化chk_mysql.sh

上面的mysql监测脚本有点过于简单且粗暴,即脚本一旦监测到Master的mysql服务关闭,就立刻把keepalived服务关闭,从而实现vip转移!
 
下面对该脚本进行优化,优化后,当监测到Master的mysql服务关闭后,就会将vip切换到Backup上(但此时Master的keepalived服务不会被暴力kill)
当Master的mysql服务恢复后,就会再次将VIP资源切回来!

#!/bin/bash
MYSQL=/usr/local/mysql/bin/mysql
MYSQL_HOST=localhost
MYSQL_USER=root
MYSQL_PASSWORD=123456
CHECK_TIME=3
  
#mysql  is working MYSQL_OK is 1 , mysql down MYSQL_OK is 0
  
MYSQL_OK=1
  
function check_mysql_helth (){
    $MYSQL -h $MYSQL_HOST -u $MYSQL_USER -p${MYSQL_PASSWORD} -e "show status;" >/dev/null 2>&1
    if [ $? = 0 ] ;then
    MYSQL_OK=1
    else
    MYSQL_OK=0
    fi
    return $MYSQL_OK
}
while [ $CHECK_TIME -ne 0 ]
do
    let "CHECK_TIME -= 1"
    check_mysql_helth
if [ $MYSQL_OK = 1 ] ; then
    CHECK_TIME=0
    exit 0
fi
if [ $MYSQL_OK -eq 0 ] &&  [ $CHECK_TIME -eq 0 ]
then
    pkill keepalived
    exit 1
fi
sleep 1
done

问题:

问题1:keepalived切换时长:如果关闭keepalived,vip切换在3s以内;如果通过判断优先级,大概在7,8秒(检测脚本时长+确认收到高优先级时长+发送arp通告时长)。
问题2:keepalived切换主从延迟及中断是否需要判断?
问题3:双主单活的读写权限是否要控制下,切换为standby只有读权限?
MHA

节点规划

主机 ip 安装软件 角色
yfm01 192.168.199.101 MySQL、MHA-NODE、MHA-MANAGER MASTER
yfm02 192.168.199.102 MySQL、MHA-NODE SLAVE1
yfm04 192.168.199.104 MySQL、MHA-NODE SLAVE2
VIP 192.168.199.99
MySQL 5.7.21
CentOS CentOS Linux release 7.9.2009 (Core)

需要现部署一套一主两从的数据库,可以参考之前的文章

因为之前搭建的yfm01、yfm02主从环境,后面又改成了yfm01、yfm02主主环境,当想把它们重新改为主从环境时,在修改完/etc/my.cnf后,执行主从同步的过程中,主从同步失败了;
这里演示下同步失败后的解决办法

备份yfm01数据
mysqldump -uroot -p --all-databases > mysql.bak.sql

将mysql.bak.sql拷贝到yfm02
scp [email protected]:/data/mysql/mysql.bak.sql .

停止yfm02从库同步状态
stop slave;

导入数据备份
source /data/mysql/mysql.bak.sql

设置从库同步
CHANGE MASTER TO master_host = '192.168.199.101',master_user = 'root',master_password = '123456',master_log_file = 'mysql-bin.000008',master_log_pos = 970;

重新开启同步
start slave;

半同步配置

yfm01
mysql> install plugin rpl_semi_sync_master soname 'semisync_master.so';
Query OK, 0 rows affected (0.00 sec)

/etc/my.cnf增加如下配置
[mysqld]
rpl_semi_sync_master_enabled = 1 
rpl_semi_sync_master_timeout = 1000 # 1 second


yfm02、yfm04
mysql> install plugin rpl_semi_sync_slave soname 'semisync_slave.so';
Query OK, 0 rows affected (0.00 sec)
mysql> install plugin rpl_semi_sync_master soname 'semisync_master.so';
Query OK, 0 rows affected (0.01 sec)

查看是否成功加载
mysql> show plugins;
+----------------------------+----------+--------------------+--------------------+---------+
| Name                       | Status   | Type               | Library            | License |
+----------------------------+----------+--------------------+--------------------
| rpl_semi_sync_slave        | ACTIVE   | REPLICATION        | semisync_slave.so  | GPL     |
| rpl_semi_sync_master       | ACTIVE   | REPLICATION        | semisync_master.so | GPL     |
+----------------------------+----------+--------------------+--------------------+---------+
46 rows in set (0.00 sec)

/etc/my.cnf增加如下配置
[mysqld]
rpl_semi_sync_slave_enabled = 1

然后重启数据库

重启yfm02、yfm03上的IO线程
mysql> STOP SLAVE IO_THREAD;
Query OK, 0 rows affected (0.00 sec)
mysql> START SLAVE IO_THREAD;
Query OK, 0 rows affected (0.00 sec)
如果没有重启,则默认还是异步复制,重启后,slave会在master上注册为半同步复制的slave角色。

这时候yfm01,/var/log/mysql.log会显示如下日志
2021-01-27T03:43:55.555691Z 7 [Note] While initializing dump thread for slave with UUID <c964d57e-5f25-11eb-b6b9-000c29462169>, found a zombie dump thread with the same UUID. Master is killing the zombie dump thread(3).
2021-01-27T03:43:55.555740Z 7 [Note] Start binlog_dump to master_thread_id(7) slave_server(104), pos(mysql-bin.000002, 362)
2021-01-27T03:43:55.555747Z 7 [Note] Start semi-sync binlog_dump to slave (server_id: 104), pos(mysql-bin.000002, 362)
2021-01-27T03:43:55.555802Z 3 [Note] Stop semi-sync binlog_dump to slave (server_id: 104)

查看半同步是否在运行
MASTER
mysql> show status like 'Rpl_semi_sync_master_status';
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| Rpl_semi_sync_master_status | ON    |
+-----------------------------+-------+
1 row in set (0.00 sec)

SLAVE
mysql> show status like 'Rpl_semi_sync_slave_status';
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| Rpl_semi_sync_slave_status | ON    |
+----------------------------+-------+
1 row in set (0.00 sec)

yfm01,/etc/my.cnf

# For advice on how to change settings please see
# http://dev.mysql.com/doc/refman/5.7/en/server-configuration-defaults.html

[mysqld]
#
# Remove leading # and set to the amount of RAM for the most important data
# cache in MySQL. Start at 70% of total RAM for dedicated server, else 10%.
# innodb_buffer_pool_size = 128M
#
# Remove leading # to turn on a very important data integrity option: logging
# changes to the binary log between backups.
# log_bin
#
# Remove leading # to set options mainly useful for reporting servers.
# The server defaults are faster for transactions and fast SELECTs.
# Adjust sizes as needed, experiment to find the optimal values.
# join_buffer_size = 128M
# sort_buffer_size = 2M
# read_rnd_buffer_size = 2M
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock

#开启二进制日志
log-bin=mysql-bin 
#设置server-id,建议使用ip最后3位
server-id=101
log_slave_updates = 1          #  从库通过binlog更新的数据写进从库二进制日志中,必加,否则切换后可能丢失数据
skip-name-resolve              #  建议加上 非必须项

plugin-load=rpl_semi_sync_master=semisync_master.so
rpl_semi_sync_master_enabled = 1 
rpl_semi_sync_master_timeout = 1000 # 1 second

# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0

log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

yfm02,/etc/my.cnf

# For advice on how to change settings please see
# http://dev.mysql.com/doc/refman/5.7/en/server-configuration-defaults.html

[mysqld]
#
# Remove leading # and set to the amount of RAM for the most important data
# cache in MySQL. Start at 70% of total RAM for dedicated server, else 10%.
# innodb_buffer_pool_size = 128M
#
# Remove leading # to turn on a very important data integrity option: logging
# changes to the binary log between backups.
# log_bin
#
# Remove leading # to set options mainly useful for reporting servers.
# The server defaults are faster for transactions and fast SELECTs.
# Adjust sizes as needed, experiment to find the optimal values.
# join_buffer_size = 128M
# sort_buffer_size = 2M
# read_rnd_buffer_size = 2M
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock

#开启中继日志
relay-log=mysql-relay
#设置server-id,建议使用ip最后3位
server-id=102
relay_log_purge = 0            #  关闭自动清理中继日志
log_slave_updates = 1          #  从库通过binlog更新的数据写进从库二进制日志中,必加,否则切换后可能丢失数据
skip-name-resolve              #  建议加上 非必须项

plugin-load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
rpl-semi-sync-master-enabled = 1
rpl-semi-sync-slave-enabled = 1

# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0

log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

配置SSH免登录

ssh-keygen #生成秘钥文件
ssh-copy-id [email protected] #将本机公钥复制到远程机器公钥

关闭“ARP查询”功能,不然ping本机ip很慢
echo "1" >/proc/sys/net/ipv4/conf/lo/arp_ignore
echo "2" >/proc/sys/net/ipv4/conf/lo/arp_announce
echo "1" >/proc/sys/net/ipv4/conf/all/arp_ignore
echo "2" >/proc/sys/net/ipv4/conf/all/arp_announce
/sbin/sysctl -p >/dev/null 2>&1

不然ssh [email protected]很慢,这也可以解决上面的问题
vi /etc/hosts
192.168.199.101 127.0.0.1

安装MHA依赖

yum install  perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker perl-CPAN perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes perl-DBI mysql-libs perl-Email-Date-Format perl-File-Remove perl-Mail-Sender perl-Mail-Sendmail perl-MIME-Lite perl-MIME-Types perl-Module-Install perl-Module-ScanDeps perl-YAML -y

下载压缩包

mkdir -p /data/mha && cd /data/mha
wget https://github.com/yoshinorim/mha4mysql-node/releases/download/v0.58/mha4mysql-node-0.58.tar.gz
wget https://github.com/yoshinorim/mha4mysql-manager/releases/download/v0.58/mha4mysql-manager-0.58.tar.gz

全部节点部署mha4mysql-node

tar -zxvf mha4mysql-node-0.58.tar.gz
cd mha4mysql-node-0.58/
perl Makefile.PL
make && make install
#执行完成后会得到四个工具
[root@yfm01 mha4mysql-node-0.58]# ls -al /usr/local/bin/
total 13888
-r-xr-xr-x   1 root root   17639 Jan 26 04:00 apply_diff_relay_logs
-r-xr-xr-x   1 root root    4807 Jan 26 04:00 filter_mysqlbinlog
-r-xr-xr-x   1 root root    8337 Jan 26 04:00 purge_relay_logs
-r-xr-xr-x   1 root root    7525 Jan 26 04:00 save_binary_logs

yfm01节点部署mha4mysql-manager

tar -zxvf mha4mysql-manager-0.58.tar.gz
cd mha4mysql-manager-0.58/
perl Makefile.PL
make && make install
#执行完成后会得到manager工具
[root@yfm01 mha4mysql-manager-0.58]# ls -al /usr/local/bin/ | grep masterha
-r-xr-xr-x   1 root root    1995 Jan 26 04:04 masterha_check_repl
-r-xr-xr-x   1 root root    1779 Jan 26 04:04 masterha_check_ssh
-r-xr-xr-x   1 root root    1865 Jan 26 04:04 masterha_check_status
-r-xr-xr-x   1 root root    3201 Jan 26 04:04 masterha_conf_host
-r-xr-xr-x   1 root root    2517 Jan 26 04:04 masterha_manager
-r-xr-xr-x   1 root root    2165 Jan 26 04:04 masterha_master_monitor
-r-xr-xr-x   1 root root    2373 Jan 26 04:04 masterha_master_switch
-r-xr-xr-x   1 root root    5172 Jan 26 04:04 masterha_secondary_check
-r-xr-xr-x   1 root root    1739 Jan 26 04:04 masterha_stop
yfm01节点配置mha

编写配置文件

mkdir /usr/local/mha && cd /usr/local/mha
vi /usr/local/mha/mha.cnf

[server default]
#设置日志位置
manager_log=/usr/local/mha/manager.log
#设置工作目录
manager_workdir=/usr/local/mha
#设置ip自动飘移脚本路径
master_ip_failover_script="/usr/local/mha/scripts/master_ip_failover"
#设置ip手动飘移脚本路径
master_ip_online_change_script="/usr/local/mha/scripts/master_ip_online_change"
#设置监控mysql的用户
user=root
#设置监控用户对应密码
password=123456
#设置ping包发送时间间隔
ping_interval=1
#设置远程mysql切换时binlog保存路径
remote_workdir=/usr/local/mha
#设置主从复制用户
repl_user=root
#主从复制用户密码
repl_password=123456
#发生漂移时报警脚本路径
report_script=""
#检查MySQL主服务器的可用性
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.199.101 -s 192.168.199.102 -s 192.168.199.104
#发生故障后关闭故障主机的脚本路径
shutdown_script=""
#ssh的用户
ssh_user=root

[server1]
hostname=192.168.199.101
port=3306

[server2]
hostname=192.168.199.102
port=3306

[server3]
hostname=192.168.199.104
port=3306
#设置候选master
candidate_master=1
#设置当slave的relay_logs落后master达到100M时不会选择此slave作为新的master
check_repl_delay=0

编写飘移脚本

mkdir /usr/local/mha/scripts && cd /usr/local/mha/scripts
vi /usr/local/mha/scripts/master_ip_failover

#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';

use Getopt::Long;

my (
    $command,          $ssh_user,        $orig_master_host, $orig_master_ip,
    $orig_master_port, $new_master_host, $new_master_ip,    $new_master_port
);

my $vip = '192.168.199.99/24';  # Virtual IP
my $key = "1";
my $int = "ens33";
my $ssh_start_vip = "/sbin/ifconfig $int:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig $int:$key down";
my $arp_effect = "/sbin/arping -Uq -s192.168.199.99 -I $int 192.168.199.2 -c 3";    # Virtual IP and gatway

GetOptions(
    'command=s'          => \$command,
    'ssh_user=s'         => \$ssh_user,
    'orig_master_host=s' => \$orig_master_host,
    'orig_master_ip=s'   => \$orig_master_ip,
    'orig_master_port=i' => \$orig_master_port,
    'new_master_host=s'  => \$new_master_host,
    'new_master_ip=s'    => \$new_master_ip,
    'new_master_port=i'  => \$new_master_port,
);

exit &main();

sub main {

    print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

    if ( $command eq "stop" || $command eq "stopssh" ) {

        # $orig_master_host, $orig_master_ip, $orig_master_port are passed.
        # If you manage master ip address at global catalog database,
        # invalidate orig_master_ip here.
        my $exit_code = 1;
        eval {
            print "Disabling the VIP on old master: $orig_master_host \n";
            &stop_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn "Got Error: $@\n";
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "start" ) {

        # all arguments are passed.
        # If you manage master ip address at global catalog database,
        # activate new_master_ip here.
        # You can also grant write access (create user, set read_only=0, etc) here.
        my $exit_code = 10;
        eval {
            print "Enabling the VIP - $vip on the new master - $new_master_host \n";
            &start_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "status" ) {
        print "Checking the Status of the script.. OK \n";
        #`ssh $ssh_user\@cluster1 \" $ssh_start_vip \"`;
        exit 0;
    }
    else {
        &usage();
        exit 1;
    }
}

sub start_vip() {
    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
sub stop_vip() {
     return 0  unless  ($ssh_user);
    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
    print
    "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}

编写手动ip飘移脚本master_ip_online_change

vi /usr/local/mha/scripts/master_ip_online_change

#!/bin/bash
source /root/.bash_profile

vip=`echo '192.168.199.99/24'`  #设置VIP
key=`echo '1'`

command=`echo "$1" | awk -F = '{print $2}'`
orig_master_host=`echo "$2" | awk -F = '{print $2}'`
new_master_host=`echo "$7" | awk -F = '{print $2}'`
orig_master_ssh_user=`echo "${12}" | awk -F = '{print $2}'`
new_master_ssh_user=`echo "${13}" | awk -F = '{print $2}'`

#要求服务的网卡识别名一样(本机为ens33)
stop_vip=`echo "ssh root@$orig_master_host /usr/sbin/ifconfig ens33:$key down"`
start_vip=`echo "ssh root@$new_master_host /usr/sbin/ifconfig ens33:$key $vip"`

if [ $command = 'stop' ]
  then
    echo -e "\n\n\n****************************\n"
    echo -e "Disabled thi VIP - $vip on old master: $orig_master_host \n"
    $stop_vip
    if [ $? -eq 0 ]
      then
    echo "Disabled the VIP successfully"
      else
    echo "Disabled the VIP failed"
    fi
    echo -e "***************************\n\n\n"
  fi

if [ $command = 'start' -o $command = 'status' ]
  then
    echo -e "\n\n\n*************************\n"
    echo -e "Enabling the VIP - $vip on new master: $new_master_host \n"
    $start_vip
    if [ $? -eq 0 ]
      then
    echo "Enabled the VIP successfully"
      else
    echo "Enabled the VIP failed"
    fi
    echo -e "***************************\n\n\n"
fi

为脚本添加可执行权限

chmod +x master_ip_failover 
chmod +x master_ip_online_change

启动MHA集群

验证SSH状态
masterha_check_ssh --conf=/usr/local/mha/mha.cnf

验证数据库集群状态
masterha_check_repl --conf=/usr/local/mha/mha.cnf

[root@yfm01 scripts]# masterha_check_repl --conf=/usr/local/mha/mha.cnf
Wed Jan 27 19:34:25 2021 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Wed Jan 27 19:34:25 2021 - [info] Reading application default configuration from /usr/local/mha/mha.cnf..
Wed Jan 27 19:34:25 2021 - [info] Reading server configuration from /usr/local/mha/mha.cnf..
Wed Jan 27 19:34:25 2021 - [info] MHA::MasterMonitor version 0.58.
Wed Jan 27 19:34:25 2021 - [error][/usr/local/share/perl5/MHA/ServerManager.pm, ln301] install_driver(mysql) failed: Attempt to reload DBD/mysql.pm aborted.
Compilation failed in require at (eval 37) line 3.

 at /usr/local/share/perl5/MHA/DBHelper.pm line 208.
 at /usr/local/share/perl5/MHA/Server.pm line 166.
Wed Jan 27 19:34:25 2021 - [error][/usr/local/share/perl5/MHA/ServerManager.pm, ln301] install_driver(mysql) failed: Attempt to reload DBD/mysql.pm aborted.
Compilation failed in require at (eval 37) line 3.

 at /usr/local/share/perl5/MHA/DBHelper.pm line 208.
 at /usr/local/share/perl5/MHA/Server.pm line 166.
Wed Jan 27 19:34:25 2021 - [error][/usr/local/share/perl5/MHA/ServerManager.pm, ln301] install_driver(mysql) failed: Attempt to reload DBD/mysql.pm aborted.
Compilation failed in require at (eval 37) line 3.

 at /usr/local/share/perl5/MHA/DBHelper.pm line 208.
 at /usr/local/share/perl5/MHA/Server.pm line 166.
Wed Jan 27 19:34:26 2021 - [error][/usr/local/share/perl5/MHA/ServerManager.pm, ln309] Got fatal error, stopping operations
Wed Jan 27 19:34:26 2021 - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations.  at /usr/local/share/perl5/MHA/MasterMonitor.pm line 329.
Wed Jan 27 19:34:26 2021 - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers.
Wed Jan 27 19:34:26 2021 - [info] Got exit code 1 (Not master dead).

MySQL Replication Health is NOT OK!

解决办法(本人下载时翻墙了)
yum install -y cpan
cpan -D DBI 
cpan DBD::mysql

[root@yfm01 scripts]# cpan DBD::mysql
Reading '/root/.cpan/Metadata'
  Database was generated on Wed, 27 Jan 2021 13:55:40 GMT
Running install for module 'DBD::mysql'
Running make for D/DV/DVEEDEN/DBD-mysql-4.050.tar.gz
Checksum for /root/.cpan/sources/authors/id/D/DV/DVEEDEN/DBD-mysql-4.050.tar.gz ok

  CPAN.pm: Building D/DV/DVEEDEN/DBD-mysql-4.050.tar.gz

Can't exec "mysql_config": No such file or directory at Makefile.PL line 89.

Cannot find the file 'mysql_config'! Your execution PATH doesn't seem
not contain the path to mysql_config. Resorting to guessed values!


PLEASE NOTE:

For 'make test' to run properly, you must ensure that the
database user 'root' can connect to your MySQL server
and has the proper privileges that these tests require such
as 'drop table', 'create table', 'drop procedure', 'create procedure'
as well as others.

mysql> grant all privileges on test.* to 'root'@'localhost' identified by 's3kr1t';

You can also optionally set the user to run 'make test' with:

perl Makefile.PL --testuser=username

Can't exec "mysql_config": No such file or directory at Makefile.PL line 603.
Can't find mysql_config. Use --mysql_config option to specify where mysql_config is located
Failed to determine directory of mysql.h. Use

  perl Makefile.PL --cflags=-I<dir>

to set this directory. For details see DBD::mysql::INSTALL,
section "C Compiler flags" or type

  perl Makefile.PL --help
Warning: No success on command[/usr/bin/perl Makefile.PL]
  DVEEDEN/DBD-mysql-4.050.tar.gz
  /usr/bin/perl Makefile.PL -- NOT OK
Running make test
  Make had some problems, won't test
Running make install
  Make had some problems, won't install

检查是否包含mysql_config
which mysql_config
whereis mysql_config

yum install -y mysql-devel
然后再次执行cpan DBD::mysql

Wed Jan 27 20:25:42 2021 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='root' --slave_host=192.168.199.102 --slave_ip=192.168.199.102 --slave_port=3306 --workdir=/usr/local/mha --target_version=5.7.33-log --manager_version=0.58 --relay_log_info=/var/lib/mysql/relay-log.info  --relay_dir=/var/lib/mysql/  --slave_pass=xxx
Wed Jan 27 20:25:42 2021 - [info]   Connecting to [email protected](192.168.199.102:22).. 
  Checking slave recovery environment settings..
    Opening /var/lib/mysql/relay-log.info ... ok.
    Relay log found at /var/lib/mysql, up to mysql-relay.000009
    Temporary relay log file is /var/lib/mysql/mysql-relay.000009
    Checking if super_read_only is defined and turned on.. not present or turned off, ignoring.
    Testing mysql connection and privileges..
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1227 (42000) at line 1: Access denied; you need (at least one of) the SUPER privilege(s) for this operation
mysql command failed with rc 1:0!
 at /usr/local/bin/apply_diff_relay_logs line 404.
	main::check() called at /usr/local/bin/apply_diff_relay_logs line 536
	eval {...} called at /usr/local/bin/apply_diff_relay_logs line 514
	main::main() called at /usr/local/bin/apply_diff_relay_logs line 121

yfm01
update user set Super_priv='Y' where User='root'
flush privileges

mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1142 (42000) at line 1: CREATE command denied to user 'root'@'192.168.199.102' for table 'apply_diff_relay_logs_test'
mysql command failed with rc 1:0!
 at /usr/local/bin/apply_diff_relay_logs line 404.
	main::check() called at /usr/local/bin/apply_diff_relay_logs line 536
	eval {...} called at /usr/local/bin/apply_diff_relay_logs line 514
	main::main() called at /usr/local/bin/apply_diff_relay_logs line 121
yfm01
GRANT ALL PRIVILEGES ON  *.* TO 'root'@'192.168.199.102'
GRANT ALL PRIVILEGES ON  *.* TO 'root'@'192.168.199.104'


yfm01节点 添加虚拟ip
ifconfig ens33:1 192.168.199.99/24

删除虚拟ip
ip addr del 192.168.199.99/24 dev ens33

启动mha进程
nohup masterha_manager --conf=/usr/local/mha/mha.cnf > /tmp/mha_manager.log < /dev/null 2>&1 &

检查MHA集群运行状态
[root@yfm01 scripts]# masterha_check_status --conf=/usr/local/mha/mha.cnf
mha (pid:44432) is running(0:PING_OK), master:192.168.199.101

验证测试

我这里搞了一番之后,现在yfm04是master了
模拟master宕机
systemctl stop mysqld
VIP 会漂移到新的MSATER上
[root@yfm04 ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:46:21:69 brd ff:ff:ff:ff:ff:ff
    inet 192.168.199.104/24 brd 192.168.199.255 scope global noprefixroute ens33
       valid_lft forever preferred_lft forever
    inet 192.168.199.99/24 brd 192.168.199.255 scope global secondary ens33:1
       valid_lft forever preferred_lft forever
    inet6 fe80::fca9:7d90:fd6:d63a/64 scope link tentative noprefixroute dadfailed 
       valid_lft forever preferred_lft forever
    inet6 fe80::be4f:8aab:e347:6e66/64 scope link tentative noprefixroute dadfailed 
       valid_lft forever preferred_lft forever
    inet6 fe80::bf02:a10:c932:6672/64 scope link tentative noprefixroute dadfailed 
       valid_lft forever preferred_lft forever
[root@yfm04 ~]# systemctl stop mysqld
[root@yfm04 ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:46:21:69 brd ff:ff:ff:ff:ff:ff
    inet 192.168.199.104/24 brd 192.168.199.255 scope global noprefixroute ens33
       valid_lft forever preferred_lft forever
    inet6 fe80::fca9:7d90:fd6:d63a/64 scope link tentative noprefixroute dadfailed 
       valid_lft forever preferred_lft forever
    inet6 fe80::be4f:8aab:e347:6e66/64 scope link tentative noprefixroute dadfailed 
       valid_lft forever preferred_lft forever
    inet6 fe80::bf02:a10:c932:6672/64 scope link tentative noprefixroute dadfailed 
       valid_lft forever preferred_lft forever

切换过程需要注意的几个问题
1、切换过程会自动把read_only关闭
2.切换之后需要删除手工删除/usr/local/mha/mha.failover.complete,才能进行第二次测试
3、一旦发生切换管理进程将会退出,无法进行再次测试,需将故障数据库加入到MHA环境中来
4.原主节点重新加入到MHA时只能设置为slave
mysql> reset slave;
Query OK, 0 rows affected (0.00 sec)
mysql> CHANGE MASTER TO master_host = '192.168.199.104',master_user = 'root',master_password = '123456',master_log_file = 'mysql-bin.000007',master_log_pos = 154;
Query OK, 0 rows affected, 2 warnings (0.01 sec)
mysql> start slave;

5.关于ip地址的接管有几种方式,这里采用的是MHA自动调用IP别名的方式,好处是在能够保证数据库状态与业务IP切换的一致性。启动管理节点之后 VIP会自动别名到当前主节点上,Keepalived也只能做到对3306的健康检查,但是做不到比如像MySQL复制中的Slave-SQL、Slave-IO进程的检查,容易出现对切换的误判。
6.注意:二级从服务器需要将log_slave_updates打开
7.手工切换需要先定义好master_ip_online_change_script脚本,不然只会切换mysql,IP地址不会绑定上去,可以根据模板来配置该脚本
8.通过设置no_master=1可以让某一个节点永远不成为新的主节点


在线切换master
masterha_master_switch --conf=/usr/local/mha/mha.cnf --master_state=alive --new_master_host=192.168.199.104 --orig_master_is_new_slave --running_updates_limit=10000 --interactive=0

Sat Jan 30 14:28:25 2021 - [error][/usr/local/share/perl5/MHA/MasterRotate.pm, ln143] Getting advisory lock failed on the current master. MHA Monitor runs on the current master. Stop MHA Manager/Monitor and try again.
Sat Jan 30 14:28:25 2021 - [error][/usr/local/share/perl5/MHA/ManagerUtil.pm, ln177] Got ERROR:  at /usr/local/bin/masterha_master_switch line 53.

masterha_stop --conf=/usr/local/mha/mha.cnf

Sat Jan 30 14:33:37 2021 - [info]  ok.
Sat Jan 30 14:33:37 2021 - [info] 192.168.199.104 can be new master.
Sat Jan 30 14:33:37 2021 - [info] 
From:
192.168.199.101(192.168.199.101:3306) (current master)
 +--192.168.199.102(192.168.199.102:3306)
 +--192.168.199.104(192.168.199.104:3306)

To:
192.168.199.104(192.168.199.104:3306) (new master)
 +--192.168.199.102(192.168.199.102:3306)
 +--192.168.199.101(192.168.199.101:3306)
Sat Jan 30 14:33:37 2021 - [info] Checking whether 192.168.199.104(192.168.199.104:3306) is ok for the new master..
Sat Jan 30 14:33:37 2021 - [info]  ok.
Sat Jan 30 14:33:37 2021 - [info] 192.168.199.101(192.168.199.101:3306): SHOW SLAVE STATUS returned empty result. To check replication filtering rules, temporarily executing CHANGE MASTER to a dummy host.

双主 以及 mycat热备
参考:

[MySQL高级7]MySQL主从复制及读写分离实战

Mycat+Mysql的读写分离

Mysql(双主)主主架构配置

MySQL 5.7 多主一从(多源复制)同步配置

实现mysql级联复制

MySQL + Keepalived 双主热备高可用操作记录

MySQL 高可用性—keepalived+mysql双主(有详细步骤和全部配置项解释)

Mysql主从(主从不同步解决办法,常见问题及解决办法,在线对mysql做主从复制)

MySQL mysqldump数据导出详解

MySQL 5.7半同步复制技术

MySQL半同步复制

mysql 5.7 开启半同步复制

Centos7搭建MySQL高可用MHA集群架构

MHA搭建及故障维护

CentOS 7 安装 Percona XtraDB Cluster 5.7

猜你喜欢

转载自blog.csdn.net/yfm081616/article/details/114235108