Implementing Redis Sentinel and simulating a master failure

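1 Redis Sentinel

1.1 Introduction to Redis clustering

A plain master-slave architecture cannot switch the master and slave roles automatically. When the master becomes unusable because the Redis service fails, the host loses power, or a disk is damaged, replication alone will not fail over (promote a slave to be the new master); someone has to change the configuration by hand before a slave Redis server can take over. In addition, when a single Redis server can no longer keep up with the write load, this architecture offers no way to scale write throughput horizontally.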

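Two core problems therefore need to be solved:

  • Seamless switching of the master and slave roles, so that the business is unaware of the change and is not affected
  • Dynamic horizontal scaling of Redis servers, so that multiple servers can accept writes in parallel and sustain higher concurrency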

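Ways to implement a Redis cluster:

  • Client-side sharding: the application decides which Redis server each key is sent to
  • Proxy sharding: a proxy decides which Redis server each key is sent to; examples of such proxies are codis and twemproxy
  • Redis Cluster: focuses on scalability; when a single Redis instance runs out of memory, Cluster shards the data across nodes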
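As a rough illustration of client-side sharding, the sketch below maps each key to one server with CRC32 modulo the server count. The server names are hypothetical, and real clients often use consistent hashing instead so that adding a server moves fewer keys.

```python
import zlib

# Hypothetical shard list for illustration only; not part of this article's setup.
SERVERS = ["redis-a:6379", "redis-b:6379", "redis-c:6379"]

def pick_server(key: str, servers=SERVERS) -> str:
    """Map a key to one server using CRC32 modulo the number of servers."""
    slot = zlib.crc32(key.encode("utf-8")) % len(servers)
    return servers[slot]

for key in ("user:1", "user:2", "order:42"):
    # The same key always lands on the same server.
    print(key, "->", pick_server(key))
```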

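1.2 How Sentinel works

Redis Sentinel focuses on high availability: when the master goes down, it automatically promotes a slave to master so that service continues.

Sentinel failover

(1) Several Sentinels detect and confirm that the master has a problem
(2) One Sentinel is elected as the leader
(3) One slave is chosen as the new master
(4) The remaining slaves are told to replicate from the new master
(5) Clients are notified of the master change
(6) When the old master comes back, it becomes a slave of the new master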

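The Sentinel process monitors the state of the master in a Redis deployment. When the master fails, Sentinel switches the master and slave roles to keep the system highly available. The feature was introduced in Redis 2.6 and became stable as of 2.8, so production environments should use Redis 2.8 or later.

Sentinel is a distributed system: several Sentinel processes can run in one deployment. They use gossip protocols to receive information about whether the master is down, and agreement protocols (voting) to decide whether to perform an automatic failover and which slave to promote to master.

Each Sentinel process periodically sends messages to the other Sentinels, the master, and the slaves to confirm that they are alive. If a peer does not respond within a configurable period, the Sentinel provisionally considers it offline. This is the "subjectively down" state (subjective: a view held by one member on its own, which may or may not match the views of the others), abbreviated SDOWN.

Alongside subjectively down there is objectively down. When enough Sentinels (at least the configured quorum) have judged the master to be SDOWN and have exchanged that view through the SENTINEL is-master-down-by-addr command, the resulting verdict is "objectively down" (objective: a fact that holds regardless of any single observer's view), abbreviated ODOWN.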
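The relationship between SDOWN and ODOWN can be sketched as two small checks. This is a simplified model, not Sentinel's actual code; the function names and the sample votes are made up for illustration.

```python
def is_sdown(ms_since_last_valid_reply: int, down_after_ms: int) -> bool:
    """One Sentinel's own view: no valid reply for longer than the configured period."""
    return ms_since_last_valid_reply > down_after_ms

def is_odown(sdown_votes: dict, quorum: int) -> bool:
    """Shared view: at least `quorum` Sentinels report the master as SDOWN."""
    return sum(1 for down in sdown_votes.values() if down) >= quorum

# Hypothetical sample: two of three Sentinels have lost contact with the master.
votes = {
    "sentinel-1": is_sdown(4500, 3000),
    "sentinel-2": is_sdown(3200, 3000),
    "sentinel-3": is_sdown(800, 3000),
}
print(is_odown(votes, quorum=2))  # True
```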

A voting algorithm then picks one of the remaining slave nodes and promotes it to master, automatically rewrites the relevant configuration, and carries out the failover.
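The text does not spell out how the slave is chosen. As I understand the Redis Sentinel documentation, candidates are ranked by replica priority (lower is better, 0 means never promote), then by replication offset (more data is better), then by the lexicographically smaller run ID. The sketch below models only that ordering; the offsets and run IDs are invented sample values.

```python
from dataclasses import dataclass

@dataclass
class Replica:
    addr: str
    priority: int      # replica-priority; 0 means "never promote"
    repl_offset: int   # replication offset already processed
    run_id: str

def select_new_master(replicas):
    """Pick the best candidate, or None if no replica is eligible."""
    candidates = [r for r in replicas if r.priority > 0]
    if not candidates:
        return None
    # Lower priority first, then higher offset, then smaller run ID.
    return min(candidates, key=lambda r: (r.priority, -r.repl_offset, r.run_id))

# Hypothetical sample values.
replicas = [
    Replica("10.0.0.17:6379", 100, 1112000, "bbb"),
    Replica("10.0.0.27:6379", 100, 1112913, "aaa"),
]
print(select_new_master(replicas).addr)  # 10.0.0.27:6379
```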

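Sentinel solves the automatic switching of master and slave roles, but it does not remove the performance bottleneck of a single master. In this respect it is similar to MHA for MySQL.

A Redis Sentinel deployment should have at least 3 Sentinel nodes, preferably an odd number.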
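A short calculation shows why an odd count is preferred: going from 3 to 4 Sentinels raises the majority needed without tolerating any additional failure. The helper names are illustrative.

```python
def majority(n: int) -> int:
    """Smallest number of Sentinels that is more than half of n."""
    return n // 2 + 1

def tolerated_failures(n: int) -> int:
    """How many Sentinels can fail while a majority is still reachable."""
    return n - majority(n)

for n in (3, 4, 5):
    print(n, majority(n), tolerated_failures(n))
# 3 2 1
# 4 3 1
# 5 3 2
```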

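At initialization the client connects to the set of Sentinel nodes rather than to a specific Redis node. Sentinel, however, is only a configuration center, not a proxy: the client still sends its reads and writes directly to the Redis nodes.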
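In practice a client asks a Sentinel for the current master address with SENTINEL get-master-addr-by-name and then connects to that address itself. The sketch below uses only the standard library to show the request and reply in RESP; `ask_sentinel` is defined but not called here, it assumes the Sentinel has no password, and it reads the reply with a single recv for brevity. Production code would normally use a client library with Sentinel support.

```python
import socket

def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode()
        out.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(out)

def parse_master_addr(reply: bytes):
    """Parse the two-element array reply into (host, port).

    Returns None for any other reply, e.g. a null reply for an unknown name.
    """
    lines = reply.split(b"\r\n")
    if lines[0] != b"*2":
        return None
    # Layout: *2, $<len>, host, $<len>, port
    return lines[2].decode(), int(lines[4])

def ask_sentinel(host: str, port: int = 26379, name: str = "mymaster"):
    """Ask one Sentinel for the current master address."""
    with socket.create_connection((host, port), timeout=2) as sock:
        sock.sendall(encode_command("SENTINEL", "get-master-addr-by-name", name))
        return parse_master_addr(sock.recv(4096))

# Sample reply bytes, as a Sentinel would send them after the failover in this article.
sample = b"*2\r\n$9\r\n10.0.0.27\r\n$4\r\n6379\r\n"
print(parse_master_addr(sample))  # ('10.0.0.27', 6379)
```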

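The Redis data nodes in a Sentinel deployment are no different from ordinary Redis nodes; read/write splitting has to be implemented by the client program.

Before Redis 3.0, production environments generally used Sentinel mode. Redis 3.0 introduced Redis Cluster, which supports larger production deployments.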

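The three periodic tasks in Sentinel

  • Every 10 seconds, each Sentinel runs INFO against the master and the slaves
    to discover slave nodes
    and confirm the master-slave relationships
  • Every 2 seconds, each Sentinel exchanges information through a channel on the master (pub/sub)
    using the __sentinel__:hello channel
    to share its "view" of the nodes and its own information
  • Every 1 second, each Sentinel sends PING to the other Sentinels and to the Redis nodes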

2 Implementing Redis Sentinel

Prepare three machines:

Node                IP          Redis version
master + Sentinel   10.0.0.7    Redis-6.2.4
slave1 + Sentinel   10.0.0.17   Redis-6.2.4
slave2 + Sentinel   10.0.0.27   Redis-6.2.4

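2.1 Prerequisite: a master-slave replication setup

Sentinel requires a working Redis master-slave replication environment. On top of it we build a highly available one-master, two-slave architecture managed by Sentinel.

Note: masterauth must also be set in the master's configuration file, with the same value as on the slaves, because after a failover the old master rejoins as a slave and has to authenticate against the new master.

Key settings in redis.conf on all master and slave nodes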

#Run on all master and slave nodes
[root@master ~]#vim /apps/redis/etc/redis.conf
bind 0.0.0.0
masterauth 123456
requirepass 123456

#Run on all slave nodes
[root@slave1 ~]#vim /apps/redis/etc/redis.conf
replicaof 10.0.0.7 6379

#Run on all master and slave nodes
[root@master ~]#systemctl restart redis.service

Check the status of the master

[root@master ~]#redis-cli
127.0.0.1:6379> AUTH 123456
OK
127.0.0.1:6379> INFO replication
# Replication
role:master
connected_slaves:2
slave0:ip=10.0.0.17,port=6379,state=online,offset=182,lag=0
slave1:ip=10.0.0.27,port=6379,state=online,offset=182,lag=1
master_failover_state:no-failover
master_replid:1067b30c7ff6a77e0b2655096cb057d0ffea30e1
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:182
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:182

Check the status of slave1

[root@slave1 ~]#redis-cli
127.0.0.1:6379> AUTH 123456
OK
127.0.0.1:6379> INFO replication
# Replication
role:slave
master_host:10.0.0.7
master_port:6379
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_repl_offset:308
slave_priority:100
slave_read_only:1
replica_announced:1
connected_slaves:0
master_failover_state:no-failover
master_replid:1067b30c7ff6a77e0b2655096cb057d0ffea30e1
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:308
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:308

Check the status of slave2

[root@slave2 ~]#redis-cli
127.0.0.1:6379> AUTH 123456
OK
127.0.0.1:6379> INFO replication
# Replication
role:slave
master_host:10.0.0.7
master_port:6379
master_link_status:up
master_last_io_seconds_ago:2
master_sync_in_progress:0
slave_repl_offset:406
slave_priority:100
slave_read_only:1
replica_announced:1
connected_slaves:0
master_failover_state:no-failover
master_replid:1067b30c7ff6a77e0b2655096cb057d0ffea30e1
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:406
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:406
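When checking several nodes, the INFO replication output can be verified by a script rather than by eye. The sketch below parses the key:value text into a dictionary and tests the two fields that matter for a slave; the sample text is a shortened copy of the output above, and the function name is illustrative.

```python
def parse_info(text: str) -> dict:
    """Turn INFO output (key:value lines) into a dict, skipping section headers."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        info[key] = value
    return info

sample = """# Replication
role:slave
master_host:10.0.0.7
master_port:6379
master_link_status:up
"""

info = parse_info(sample)
# A slave is replicating correctly when its role is slave and the link is up.
healthy = info["role"] == "slave" and info["master_link_status"] == "up"
print(info["master_host"], healthy)  # 10.0.0.7 True
```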

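2.2 Editing the Sentinel configuration file

Sentinel is in fact a special Redis server: it supports some Redis commands, but many are not available. By default it listens on port 26379/tcp.

Sentinels do not have to run on the same hosts as the Redis servers, but they are usually co-located to save cost.

All Redis nodes use the same configuration file, as in the example below

#For a source build, sentinel.conf is in the source directory; copy it to the install directory, e.g. /apps/redis/etc/sentinel.conf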
[root@master ~]#cd redis-6.2.4/
[root@master redis-6.2.4]#ls
00-RELEASENOTES  CONTRIBUTING  INSTALL    README.md   runtest-cluster    sentinel.conf  TLS.md
BUGS             COPYING       Makefile   redis.conf  runtest-moduleapi  src            utils
CONDUCT          deps          MANIFESTO  runtest     runtest-sentinel   tests

[root@master redis-6.2.4]#cp sentinel.conf /apps/redis/etc/
[root@master redis-6.2.4]#vim /apps/redis/etc/sentinel.conf
bind 0.0.0.0
port 26379
daemonize yes
pidfile /apps/redis/run/redis-sentinel.pid
logfile /apps/redis/log/sentinel_26379.log
dir /tmp

sentinel monitor mymaster 10.0.0.7 6379 2
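#mymaster is the name of the monitored group; this line gives the address and port of the current master in the mymaster group
#2 is the quorum, i.e. how many Sentinels must consider the master down before it is marked ODOWN (objectively down) and a failover is started
#It is usually set to more than half of all Sentinel nodes (the total is usually an odd number >= 3, such as 3, 5, 7)
#For example, with 3 Sentinels, 3/2 = 1.5, rounded up to 2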

sentinel auth-pass mymaster 123456
#Password of the master in the mymaster group; note that this line must come after the sentinel monitor line above

sentinel down-after-milliseconds mymaster 3000
#Time after which a node in the mymaster group is considered subjectively down (SDOWN), in milliseconds; 3000 is suggested

sentinel parallel-syncs mymaster 1
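#After a failover, the number of slaves that may sync from the new master at the same time; a smaller number makes the total sync time longer but reduces the load on the new master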

sentinel failover-timeout mymaster 180000
#Timeout for pointing all slaves at the new master, in milliseconds

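#Forbid changing the notification and reconfig scripts at runtime (keep comments on their own line; the config format does not allow trailing comments)
sentinel deny-scripts-reconfig yes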

[root@master redis-6.2.4]#scp /apps/redis/etc/sentinel.conf 10.0.0.17:/apps/redis/etc/
[root@master redis-6.2.4]#scp /apps/redis/etc/sentinel.conf 10.0.0.27:/apps/redis/etc/

All three Sentinel servers end up with the following configuration

[root@master redis-6.2.4]#grep -vE "^#|^$" /apps/redis/etc/sentinel.conf
bind 0.0.0.0
port 26379
daemonize yes
pidfile /apps/redis/run/redis-sentinel.pid
logfile /apps/redis/log/sentinel_26379.log
dir /tmp
sentinel monitor mymaster 10.0.0.7 6379 2
sentinel auth-pass mymaster 123456
sentinel down-after-milliseconds mymaster 3000
acllog-max-len 128
sentinel parallel-syncs mymaster 1
sentinel failover-timeout mymaster 180000
sentinel deny-scripts-reconfig yes
SENTINEL resolve-hostnames no
SENTINEL announce-hostnames no
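Before copying the file to the other nodes, it can help to check the sentinel monitor line mechanically. The sketch below extracts the directive and confirms that the quorum is between 1 and the number of Sentinels; the function names are illustrative, and the count of 3 Sentinels comes from this article's setup.

```python
def parse_monitor(conf_text: str):
    """Return the fields of the first 'sentinel monitor' line, or None."""
    for line in conf_text.splitlines():
        parts = line.split()
        if len(parts) == 6 and parts[0].lower() == "sentinel" and parts[1] == "monitor":
            return {
                "name": parts[2],
                "host": parts[3],
                "port": int(parts[4]),
                "quorum": int(parts[5]),
            }
    return None

def quorum_is_sane(quorum: int, total_sentinels: int) -> bool:
    """A quorum above the Sentinel count can never be reached."""
    return 1 <= quorum <= total_sentinels

conf = """port 26379
sentinel monitor mymaster 10.0.0.7 6379 2
sentinel auth-pass mymaster 123456
"""
monitor = parse_monitor(conf)
print(monitor["name"], monitor["quorum"], quorum_is_sane(monitor["quorum"], 3))
# mymaster 2 True
```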

2.3 Starting Sentinel

Sentinel must be started on all three servers

#For a source build, create a new service file on every node
[root@master redis-6.2.4]#vim /lib/systemd/system/redis-sentinel.service
[Unit]
Description=Redis Sentinel
After=network.target

[Service]
ExecStart=/apps/redis/bin/redis-sentinel /apps/redis/etc/sentinel.conf --supervised systemd

ExecStop=/bin/kill -s QUIT $MAINPID
User=redis
Group=redis
RuntimeDirectory=redis
RuntimeDirectoryMode=0755

[Install]
WantedBy=multi-user.target

[root@master redis-6.2.4]#scp /lib/systemd/system/redis-sentinel.service 10.0.0.17:/lib/systemd/system/redis-sentinel.service
[root@master redis-6.2.4]#scp /lib/systemd/system/redis-sentinel.service 10.0.0.27:/lib/systemd/system/redis-sentinel.service

#Check the directory ownership on every node, otherwise the service cannot start
[root@master redis-6.2.4]#chown -R redis.redis /apps/redis/
[root@master redis-6.2.4]#ll /apps/redis/etc/sentinel.conf
-rw-r--r-- 1 redis redis 14292 May  5 13:03 /apps/redis/etc/sentinel.conf

[root@slave1 ~]#chown -R redis.redis /apps/redis/
[root@slave1 ~]#ll /apps/redis/etc/sentinel.conf
-rw-r--r-- 1 redis redis 14291 May  5 13:03 /apps/redis/etc/sentinel.conf

[root@slave2 ~]#chown -R redis.redis /apps/redis/
[root@slave2 ~]#ll /apps/redis/etc/sentinel.conf
-rw-r--r-- 1 redis redis 14291 May  5 13:03 /apps/redis/etc/sentinel.conf

#Reload the systemd unit files
[root@master redis-6.2.4]#systemctl daemon-reload
[root@slave1 ~]#systemctl daemon-reload
[root@slave2 ~]#systemctl daemon-reload

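#For a source build, run the following on every Sentinel server to start Sentinel
#(alternatively, start the unit created above with: systemctl enable --now redis-sentinel)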
[root@master redis-6.2.4]#/apps/redis/bin/redis-sentinel /apps/redis/etc/sentinel.conf

2.4 Verifying the ports on the three Sentinel servers

[root@master redis-6.2.4]#ss -ntl
State      Recv-Q Send-Q              Local Address:Port                             Peer Address:Port
LISTEN     0      511                             *:26379                                       *:*
LISTEN     0      511                             *:6379                                        *:*
LISTEN     0      128                             *:22                                          *:*
LISTEN     0      100                     127.0.0.1:25                                          *:*
LISTEN     0      511                         [::1]:6379                                     [::]:*
LISTEN     0      128                          [::]:22                                       [::]:*
LISTEN     0      100                         [::1]:25                                       [::]:*

[root@slave1 ~]#ss -ntl
State      Recv-Q Send-Q              Local Address:Port                             Peer Address:Port
LISTEN     0      511                             *:26379                                       *:*
LISTEN     0      511                             *:6379                                        *:*
LISTEN     0      128                             *:22                                          *:*
LISTEN     0      100                     127.0.0.1:25                                          *:*
LISTEN     0      511                         [::1]:6379                                     [::]:*
LISTEN     0      128                          [::]:22                                       [::]:*
LISTEN     0      100                         [::1]:25                                       [::]:*

[root@slave2 ~]#ss -ntl
State      Recv-Q Send-Q              Local Address:Port                             Peer Address:Port
LISTEN     0      128                             *:22                                          *:*
LISTEN     0      100                     127.0.0.1:25                                          *:*
LISTEN     0      511                             *:26379                                       *:*
LISTEN     0      511                             *:6379                                        *:*
LISTEN     0      128                          [::]:22                                       [::]:*
LISTEN     0      100                         [::1]:25                                       [::]:*
LISTEN     0      511                         [::1]:6379                                     [::]:*

2.5 Checking the Sentinel logs

Sentinel log on master

[root@master ~]#tail -f /apps/redis/log/sentinel_26379.log
2759:X 05 May 2022 11:23:51.222 * Removing the pid file.
2759:X 05 May 2022 11:23:51.223 # Sentinel is now ready to exit, bye bye...
2811:X 05 May 2022 11:35:03.800 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
2811:X 05 May 2022 11:35:03.800 # Redis version=6.2.4, bits=64, commit=00000000, modified=0, pid=2811, just started
2811:X 05 May 2022 11:35:03.800 # Configuration loaded
2811:X 05 May 2022 11:35:03.810 * Increased maximum number of open files to 10032 (it was originally set to 1024).
2811:X 05 May 2022 11:35:03.810 * monotonic clock: POSIX clock_gettime
2811:X 05 May 2022 11:35:03.815 * Running mode=sentinel, port=26379.
2811:X 05 May 2022 11:35:03.817 # Sentinel ID is e432c1d2045d68f260da2f755fb706ed889ee3d9
2811:X 05 May 2022 11:35:03.817 # +monitor master mymaster 10.0.0.7 6379 quorum 2

Sentinel logs on the slaves

[root@slave1 ~]#tail -f /apps/redis/log/sentinel_26379.log
2712:X 05 May 2022 11:35:00.115 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
2712:X 05 May 2022 11:35:00.116 # Redis version=6.2.4, bits=64, commit=00000000, modified=0, pid=2712, just started
2712:X 05 May 2022 11:35:00.116 # Configuration loaded
2712:X 05 May 2022 11:35:00.121 * Increased maximum number of open files to 10032 (it was originally set to 1024).
2712:X 05 May 2022 11:35:00.122 * monotonic clock: POSIX clock_gettime
2712:X 05 May 2022 11:35:00.123 * Running mode=sentinel, port=26379.
2712:X 05 May 2022 11:35:00.124 # Sentinel ID is 19e17c40424e15f994f98592c9d2957171603a10
2712:X 05 May 2022 11:35:00.124 # +monitor master mymaster 10.0.0.7 6379 quorum 2
2712:X 05 May 2022 11:35:03.153 # +sdown sentinel e432c1d2045d68f260da2f755fb706ed889ee3d9 10.0.0.7 26379 @ mymaster 10.0.0.7 6379
2712:X 05 May 2022 11:35:04.369 # -sdown sentinel e432c1d2045d68f260da2f755fb706ed889ee3d9 10.0.0.7 26379 @ mymaster 10.0.0.7 6379

[root@slave2 ~]#tail -f /apps/redis/log/sentinel_26379.log
2752:X 05 May 2022 11:34:50.876 # Configuration loaded
2752:X 05 May 2022 11:34:50.878 * Increased maximum number of open files to 10032 (it was originally set to 1024).
2752:X 05 May 2022 11:34:50.880 * monotonic clock: POSIX clock_gettime
2752:X 05 May 2022 11:34:50.889 * Running mode=sentinel, port=26379.
2752:X 05 May 2022 11:34:50.890 # Sentinel ID is 6e3c75087dcb5e35aceee95db2854f4e1cb18bee
2752:X 05 May 2022 11:34:50.890 # +monitor master mymaster 10.0.0.7 6379 quorum 2
2752:X 05 May 2022 11:34:53.925 # +sdown sentinel 19e17c40424e15f994f98592c9d2957171603a10 10.0.0.17 26379 @ mymaster 10.0.0.7 6379
2752:X 05 May 2022 11:34:53.926 # +sdown sentinel e432c1d2045d68f260da2f755fb706ed889ee3d9 10.0.0.7 26379 @ mymaster 10.0.0.7 6379
2752:X 05 May 2022 11:35:00.252 # -sdown sentinel 19e17c40424e15f994f98592c9d2957171603a10 10.0.0.17 26379 @ mymaster 10.0.0.7 6379
2752:X 05 May 2022 11:35:04.557 # -sdown sentinel e432c1d2045d68f260da2f755fb706ed889ee3d9 10.0.0.7 26379 @ mymaster 10.0.0.7 6379

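2.6 Checking the Sentinel status

In the Sentinel status, pay particular attention to the last line: the master IP, the number of slaves, and the number of sentinels must match the actual servers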

[root@master ~]#redis-cli -p 26379
127.0.0.1:26379> INFO sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=10.0.0.7:6379,slaves=2,sentinels=3

#Two slaves and three Sentinel servers; if the sentinels value does not match, check whether the myid values in sentinel.conf conflict
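The check described above can be automated by parsing the master0 line. This is a small sketch with an illustrative function name; the expected counts (2 slaves, 3 sentinels) are the ones from this setup.

```python
def parse_master_line(line: str) -> dict:
    """Split 'master0:name=...,status=...,...' into a dict of its fields."""
    _, _, fields = line.partition(":")
    return dict(item.split("=", 1) for item in fields.split(","))

line = "master0:name=mymaster,status=ok,address=10.0.0.7:6379,slaves=2,sentinels=3"
m = parse_master_line(line)

# Compare what Sentinel reports with what was actually deployed.
as_expected = m["status"] == "ok" and int(m["slaves"]) == 2 and int(m["sentinels"]) == 3
print(m["address"], as_expected)  # 10.0.0.7:6379 True
```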

2.7 Testing data synchronization

[root@master ~]#redis-cli
127.0.0.1:6379> AUTH 123456
OK
127.0.0.1:6379> set date 20220505
OK

[root@slave1 ~]#redis-cli
127.0.0.1:6379> AUTH 123456
OK
127.0.0.1:6379> get date
"20220505"

[root@slave2 ~]#redis-cli
127.0.0.1:6379> AUTH 123456
OK
127.0.0.1:6379> get date
"20220505"

2.8 Stopping the Redis master node to test failover

[root@master ~]#systemctl stop redis.service

Check the Sentinel information on each node:

[root@master ~]#redis-cli -p 26379
127.0.0.1:26379> INFO sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=10.0.0.27:6379,slaves=2,sentinels=3

#10.0.0.27 (slave2) has automatically been promoted to master

Sentinel log output during the failover:

[root@slave1 ~]#tail -f /apps/redis/log/sentinel_26379.log
2712:X 05 May 2022 13:03:56.875 # +failover-state-reconf-slaves master mymaster 10.0.0.7 6379
2712:X 05 May 2022 13:03:56.940 * +slave-reconf-sent slave 10.0.0.17:6379 10.0.0.17 6379 @ mymaster 10.0.0.7 6379
2712:X 05 May 2022 13:03:57.152 * +slave-reconf-inprog slave 10.0.0.17:6379 10.0.0.17 6379 @ mymaster 10.0.0.7 6379
2712:X 05 May 2022 13:03:57.331 # -odown master mymaster 10.0.0.7 6379
2712:X 05 May 2022 13:03:58.228 * +slave-reconf-done slave 10.0.0.17:6379 10.0.0.17 6379 @ mymaster 10.0.0.7 6379
2712:X 05 May 2022 13:03:58.461 # +failover-end master mymaster 10.0.0.7 6379
2712:X 05 May 2022 13:03:58.461 # +switch-master mymaster 10.0.0.7 6379 10.0.0.27 6379
2712:X 05 May 2022 13:03:58.462 * +slave slave 10.0.0.17:6379 10.0.0.17 6379 @ mymaster 10.0.0.27 6379
2712:X 05 May 2022 13:03:58.462 * +slave slave 10.0.0.7:6379 10.0.0.7 6379 @ mymaster 10.0.0.27 6379
2712:X 05 May 2022 13:04:01.469 # +sdown slave 10.0.0.7:6379 10.0.0.7 6379 @ mymaster 10.0.0.27 6379

[root@master ~]#tail -f /apps/redis/log/sentinel_26379.log
2811:X 05 May 2022 11:35:03.817 # Sentinel ID is e432c1d2045d68f260da2f755fb706ed889ee3d9
2811:X 05 May 2022 11:35:03.817 # +monitor master mymaster 10.0.0.7 6379 quorum 2
2811:X 05 May 2022 13:03:55.957 # +sdown master mymaster 10.0.0.7 6379
2811:X 05 May 2022 13:03:56.184 # +new-epoch 1
2811:X 05 May 2022 13:03:56.245 # +vote-for-leader 19e17c40424e15f994f98592c9d2957171603a10 1
2811:X 05 May 2022 13:03:56.944 # +config-update-from sentinel 19e17c40424e15f994f98592c9d2957171603a10 10.0.0.17 26379 @ mymaster 10.0.0.7 6379
2811:X 05 May 2022 13:03:56.945 # +switch-master mymaster 10.0.0.7 6379 10.0.0.27 6379
2811:X 05 May 2022 13:03:56.945 * +slave slave 10.0.0.17:6379 10.0.0.17 6379 @ mymaster 10.0.0.27 6379
2811:X 05 May 2022 13:03:56.983 * +slave slave 10.0.0.7:6379 10.0.0.7 6379 @ mymaster 10.0.0.27 6379
2811:X 05 May 2022 13:04:00.055 # +sdown slave 10.0.0.7:6379 10.0.0.7 6379 @ mymaster 10.0.0.27 6379

[root@slave2 ~]#tail -f /apps/redis/log/sentinel_26379.log
2752:X 05 May 2022 13:03:56.083 # +new-epoch 1
2752:X 05 May 2022 13:03:56.084 # +try-failover master mymaster 10.0.0.7 6379
2752:X 05 May 2022 13:03:56.132 # +vote-for-leader 6e3c75087dcb5e35aceee95db2854f4e1cb18bee 1
2752:X 05 May 2022 13:03:56.153 # 19e17c40424e15f994f98592c9d2957171603a10 voted for 19e17c40424e15f994f98592c9d2957171603a10 1
2752:X 05 May 2022 13:03:56.244 # e432c1d2045d68f260da2f755fb706ed889ee3d9 voted for 19e17c40424e15f994f98592c9d2957171603a10 1
2752:X 05 May 2022 13:03:56.938 # +config-update-from sentinel 19e17c40424e15f994f98592c9d2957171603a10 10.0.0.17 26379 @ mymaster 10.0.0.7 6379
2752:X 05 May 2022 13:03:56.939 # +switch-master mymaster 10.0.0.7 6379 10.0.0.27 6379
2752:X 05 May 2022 13:03:56.940 * +slave slave 10.0.0.17:6379 10.0.0.17 6379 @ mymaster 10.0.0.27 6379
2752:X 05 May 2022 13:03:56.940 * +slave slave 10.0.0.7:6379 10.0.0.7 6379 @ mymaster 10.0.0.27 6379
2752:X 05 May 2022 13:03:59.963 # +sdown slave 10.0.0.7:6379 10.0.0.7 6379 @ mymaster 10.0.0.27 6379
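In these logs the +switch-master event is the line that records the old and the new master address. The sketch below pulls it out of a list of log lines with a regular expression; the pattern is written against the log format shown above, and the function name is illustrative.

```python
import re

# Matches: +switch-master <name> <old-ip> <old-port> <new-ip> <new-port>
SWITCH_RE = re.compile(
    r"\+switch-master (?P<name>\S+) (?P<old_ip>\S+) (?P<old_port>\d+) "
    r"(?P<new_ip>\S+) (?P<new_port>\d+)"
)

def find_switch(log_lines):
    """Return the fields of the first +switch-master event, or None."""
    for line in log_lines:
        match = SWITCH_RE.search(line)
        if match:
            return match.groupdict()
    return None

log = [
    "2752:X 05 May 2022 13:03:56.083 # +new-epoch 1",
    "2752:X 05 May 2022 13:03:56.939 # +switch-master mymaster 10.0.0.7 6379 10.0.0.27 6379",
]
event = find_switch(log)
print(event["old_ip"], "->", event["new_ip"])  # 10.0.0.7 -> 10.0.0.27
```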

2.9 Configuration files are rewritten automatically after failover

After the failover, the master IP on the replicaof line in redis.conf is changed

[root@slave1 ~]#grep ^replicaof /apps/redis/etc/redis.conf
replicaof 10.0.0.27 6379

The IP on the sentinel monitor line in the Sentinel configuration file is changed as well

[root@slave1 ~]#grep "monitor mymaster" /apps/redis/etc/sentinel.conf
sentinel monitor mymaster 10.0.0.27 6379 2  #this line was rewritten automatically

[root@slave2 ~]#grep "monitor mymaster" /apps/redis/etc/sentinel.conf
sentinel monitor mymaster 10.0.0.27 6379 2

[root@master ~]#grep "monitor mymaster" /apps/redis/etc/sentinel.conf
sentinel monitor mymaster 10.0.0.27 6379 2

2.10 Checking the current Redis status

Status of the new master

[root@slave2 ~]#redis-cli
127.0.0.1:6379> AUTH 123456
OK
127.0.0.1:6379> INFO replication
# Replication
role:master  #promoted to master
connected_slaves:1
slave0:ip=10.0.0.17,port=6379,state=online,offset=1867449,lag=0
master_failover_state:no-failover
master_replid:58f2e7e6203f5dea3021a3ad6b627849cd1a80c6
master_replid2:1067b30c7ff6a77e0b2655096cb057d0ffea30e1
master_repl_offset:1867449
second_repl_offset:1112913
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:818874
repl_backlog_histlen:1048576

[root@slave1 ~]#redis-cli
127.0.0.1:6379> AUTH 123456
OK
127.0.0.1:6379> INFO replication
# Replication
role:slave
master_host:10.0.0.27  #points to the new master
master_port:6379
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_repl_offset:1826043
slave_priority:100
slave_read_only:1
replica_announced:1
connected_slaves:0
master_failover_state:no-failover
master_replid:58f2e7e6203f5dea3021a3ad6b627849cd1a80c6
master_replid2:1067b30c7ff6a77e0b2655096cb057d0ffea30e1
master_repl_offset:1826043
second_repl_offset:1112913
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:777468
repl_backlog_histlen:1048576

2.11 Testing data synchronization

[root@slave2 ~]#redis-cli
127.0.0.1:6379> AUTH 123456
OK
127.0.0.1:6379> set fruit orange
OK

[root@slave1 ~]#redis-cli
127.0.0.1:6379> AUTH 123456
OK
127.0.0.1:6379> get fruit
"orange"

2.12 Bringing the failed former master back into the Redis group

[root@master ~]#systemctl start redis.service
[root@master ~]#grep ^replicaof /apps/redis/etc/redis.conf
replicaof 10.0.0.27 6379

Observe the status on the former master

[root@master ~]#redis-cli
127.0.0.1:6379> AUTH 123456
OK
127.0.0.1:6379> INFO replication
# Replication
role:slave
master_host:10.0.0.27
master_port:6379
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_repl_offset:2398880
slave_priority:100
slave_read_only:1
replica_announced:1
connected_slaves:0
master_failover_state:no-failover
master_replid:58f2e7e6203f5dea3021a3ad6b627849cd1a80c6
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:2398880
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:2325269
repl_backlog_histlen:73612
127.0.0.1:6379> get fruit
"orange"

[root@master ~]#redis-cli -p 26379
127.0.0.1:26379> INFO sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=10.0.0.27:6379,slaves=2,sentinels=3

Observe the status and logs on the new master

[root@slave2 ~]#redis-cli
127.0.0.1:6379> AUTH 123456
OK
127.0.0.1:6379> INFO replication
# Replication
role:master
connected_slaves:2
slave0:ip=10.0.0.17,port=6379,state=online,offset=3034646,lag=0
slave1:ip=10.0.0.7,port=6379,state=online,offset=3034646,lag=1
master_failover_state:no-failover
master_replid:58f2e7e6203f5dea3021a3ad6b627849cd1a80c6
master_replid2:1067b30c7ff6a77e0b2655096cb057d0ffea30e1
master_repl_offset:3034646
second_repl_offset:1112913
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1986071
repl_backlog_histlen:1048576

[root@slave2 ~]#tail -f /apps/redis/log/sentinel_26379.log
2752:X 05 May 2022 13:03:56.939 # +switch-master mymaster 10.0.0.7 6379 10.0.0.27 6379
2752:X 05 May 2022 13:03:56.940 * +slave slave 10.0.0.17:6379 10.0.0.17 6379 @ mymaster 10.0.0.27 6379
2752:X 05 May 2022 13:03:56.940 * +slave slave 10.0.0.7:6379 10.0.0.7 6379 @ mymaster 10.0.0.27 6379
2752:X 05 May 2022 13:03:59.963 # +sdown slave 10.0.0.7:6379 10.0.0.7 6379 @ mymaster 10.0.0.27 6379
2752:X 05 May 2022 14:46:15.590 # -sdown slave 10.0.0.7:6379 10.0.0.7 6379 @ mymaster 10.0.0.27 6379
2752:X 05 May 2022 14:46:25.578 * +convert-to-slave slave 10.0.0.7:6379 10.0.0.7 6379 @ mymaster 10.0.0.27 6379

Reposted from blog.csdn.net/weixin_51867896/article/details/124594015