1. Create a script that builds the cluster: cluster-start.sh
./redis-trib.rb create --replicas 1 192.168.58.101:7000 192.168.58.101:7001 192.168.58.101:7002 192.168.58.102:7000 192.168.58.102:7001 192.168.58.102:7002
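The node list in this command grows quickly as machines or ports are added. As a rough sketch (not part of the original script), the same command could be assembled from a host list and a port list. The names HOSTS, PORTS and build_create_cmd are illustrative assumptions; only the redis-trib.rb invocation itself comes from the post.

```shell
#!/bin/sh
# Sketch: assemble the redis-trib.rb create command from host and port lists.
# HOSTS, PORTS and build_create_cmd are hypothetical names used for illustration.
HOSTS="192.168.58.101 192.168.58.102"
PORTS="7000 7001 7002"

build_create_cmd() {
    cmd="./redis-trib.rb create --replicas 1"
    # Append every host:port pair, host by host, in the same order as the post.
    for host in $HOSTS; do
        for port in $PORTS; do
            cmd="$cmd $host:$port"
        done
    done
    printf '%s\n' "$cmd"
}

# Print the command for review; it is not executed here.
build_create_cmd
```

Printing the command instead of running it keeps the sketch safe to try; piping the output to sh would run it against the listed nodes.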
2. Script that starts the Redis server on each node: servers-start.sh
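# Run from the redis-cluster directory: wipe each node's old state, then start it.
# Assumes each redis.conf sets daemonize yes, so redis-server returns immediately.
cd 7000
rm -rf appendonly.aof
rm -rf dump.rdb
rm -rf nodes.conf
redis-server redis.conf
cd ..
cd 7001
rm -rf appendonly.aof
rm -rf dump.rdb
rm -rf nodes.conf
redis-server redis.conf
cd ..
cd 7002
rm -rf appendonly.aof
rm -rf dump.rdb
rm -rf nodes.conf
redis-server redis.conf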
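The post does not show the redis.conf that sits in each node directory. The following is only a sketch of what a minimal cluster-mode config might contain, inferred from the files the script deletes (appendonly.aof, dump.rdb, nodes.conf). The function name node_conf and the specific values (for example the 5000 ms node timeout) are assumptions, not the author's settings.

```shell
#!/bin/sh
# Sketch: print a minimal cluster-mode redis.conf for one node.
# node_conf is a hypothetical helper; all directive values below are assumptions,
# since the original post does not show its redis.conf.
node_conf() {
    port="$1"
    cat <<EOF
port ${port}
daemonize yes
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000
appendonly yes
EOF
}

# Intended use (commented out so the sketch has no side effects):
# for port in 7000 7001 7002; do
#     mkdir -p "$port" && node_conf "$port" > "$port/redis.conf"
# done
```

daemonize yes matters for the start script: without it, the first redis-server call would stay in the foreground and the later nodes would never start.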
3. Cluster test walkthrough:
[root@localhost redis-cluster]# sh cluster-start.sh
>>> Creating cluster
Connecting to node 192.168.58.101:7000: OK
Connecting to node 192.168.58.101:7001: OK
Connecting to node 192.168.58.101:7002: OK
Connecting to node 192.168.58.102:7000: OK
Connecting to node 192.168.58.102:7001: OK
Connecting to node 192.168.58.102:7002: OK
>>> Performing hash slots allocation on 6 nodes...
Using 3 masters:
192.168.58.102:7000
192.168.58.101:7000
192.168.58.102:7001
Adding replica 192.168.58.101:7001 to 192.168.58.102:7000
Adding replica 192.168.58.102:7002 to 192.168.58.101:7000
Adding replica 192.168.58.101:7002 to 192.168.58.102:7001
M: aec976f33acd4971cf5e087ceaf2b5e606c56f36 192.168.58.101:7000
   slots:5461-10922 (5462 slots) master
S: 514548a9a01d7e125d716fd51d9ffd36165a2647 192.168.58.101:7001
   replicates d5ec4dee922385007f09005d0ef24024f3d513a3
S: 3a1962324921188520896b1e9329e210906c1641 192.168.58.101:7002
   replicates 3c2a3600f9b8ea11f7991c8180ecc24ea4266a6b
M: d5ec4dee922385007f09005d0ef24024f3d513a3 192.168.58.102:7000
   slots:0-5460 (5461 slots) master
M: 3c2a3600f9b8ea11f7991c8180ecc24ea4266a6b 192.168.58.102:7001
   slots:10923-16383 (5461 slots) master
S: 990c7f1b44034646cacb51a7754668ee5ada6005 192.168.58.102:7002
   replicates aec976f33acd4971cf5e087ceaf2b5e606c56f36
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join..
>>> Performing Cluster Check (using node 192.168.58.101:7000)
M: aec976f33acd4971cf5e087ceaf2b5e606c56f36 192.168.58.101:7000
   slots:5461-10922 (5462 slots) master
M: 514548a9a01d7e125d716fd51d9ffd36165a2647 192.168.58.101:7001
   slots: (0 slots) master
   replicates d5ec4dee922385007f09005d0ef24024f3d513a3
M: 3a1962324921188520896b1e9329e210906c1641 192.168.58.101:7002
   slots: (0 slots) master
   replicates 3c2a3600f9b8ea11f7991c8180ecc24ea4266a6b
M: d5ec4dee922385007f09005d0ef24024f3d513a3 192.168.58.102:7000
   slots:0-5460 (5461 slots) master
M: 3c2a3600f9b8ea11f7991c8180ecc24ea4266a6b 192.168.58.102:7001
   slots:10923-16383 (5461 slots) master
M: 990c7f1b44034646cacb51a7754668ee5ada6005 192.168.58.102:7002
   slots: (0 slots) master
   replicates aec976f33acd4971cf5e087ceaf2b5e606c56f36
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

To check the cluster state, run ./redis-trib.rb check 192.168.58.102:7000. The ip:port after check can be any node that belongs to the cluster.

[root@localhost redis-cluster]# ./redis-trib.rb check 192.168.58.102:7000
Connecting to node 192.168.58.102:7000: OK
Connecting to node 192.168.58.101:7002: OK
Connecting to node 192.168.58.101:7001: OK
Connecting to node 192.168.58.102:7001: OK
Connecting to node 192.168.58.101:7000: OK
Connecting to node 192.168.58.102:7002: OK
>>> Performing Cluster Check (using node 192.168.58.102:7000)
M: d5ec4dee922385007f09005d0ef24024f3d513a3 192.168.58.102:7000
   slots:0-5460 (5461 slots) master
   1 additional replica(s)
S: 3a1962324921188520896b1e9329e210906c1641 192.168.58.101:7002
   slots: (0 slots) slave
   replicates 3c2a3600f9b8ea11f7991c8180ecc24ea4266a6b
S: 514548a9a01d7e125d716fd51d9ffd36165a2647 192.168.58.101:7001
   slots: (0 slots) slave
   replicates d5ec4dee922385007f09005d0ef24024f3d513a3
M: 3c2a3600f9b8ea11f7991c8180ecc24ea4266a6b 192.168.58.102:7001
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
M: aec976f33acd4971cf5e087ceaf2b5e606c56f36 192.168.58.101:7000
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
S: 990c7f1b44034646cacb51a7754668ee5ada6005 192.168.58.102:7002
   slots: (0 slots) slave
   replicates aec976f33acd4971cf5e087ceaf2b5e606c56f36
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

I configured 6 nodes in cluster mode across two machines. redis-trib.rb automatically chose 3 of them as masters and attached one slave to each, giving 3 masters and 3 slaves.

Next, simulate a master node going down (I simply killed the node's process with kill).

First test: kill the master 192.168.58.102:7000, id d5ec4dee922385007f09005d0ef24024f3d513a3. Under Redis's master-slave failover mechanism, when a master goes down one of its slaves takes over as master, so its slave 514548a9a01d7e125d716fd51d9ffd36165a2647 should be promoted.

On 192.168.58.102:

[root@localhost redis-cluster]# ps -ef | grep redis
root 2460 1 0 14:00 ? 00:00:03 redis-server *:7000 [cluster]
root 2465 1 0 14:00 ? 00:00:03 redis-server *:7001 [cluster]
root 2474 1 0 14:00 ? 00:00:03 redis-server *:7002 [cluster]
root 2572 2378 0 14:16 pts/0 00:00:00 grep redis
[root@localhost redis-cluster]# kill 2460
[root@localhost redis-cluster]# ./redis-trib.rb check 192.168.58.102:7000
Connecting to node 192.168.58.102:7000: [ERR] Sorry, can't connect to node 192.168.58.102:7000
[root@localhost redis-cluster]# ./redis-trib.rb check 192.168.58.102:7000
Connecting to node 192.168.58.102:7000: [ERR] Sorry, can't connect to node 192.168.58.102:7000
[root@localhost redis-cluster]# ./redis-trib.rb check 192.168.58.102:7002
Connecting to node 192.168.58.102:7002: OK
Connecting to node 192.168.58.101:7002: OK
Connecting to node 192.168.58.101:7000: OK
Connecting to node 192.168.58.101:7001: OK
Connecting to node 192.168.58.102:7001: OK
>>> Performing Cluster Check (using node 192.168.58.102:7002)
S: 990c7f1b44034646cacb51a7754668ee5ada6005 192.168.58.102:7002
   slots: (0 slots) slave
   replicates aec976f33acd4971cf5e087ceaf2b5e606c56f36
S: 3a1962324921188520896b1e9329e210906c1641 192.168.58.101:7002
   slots: (0 slots) slave
   replicates 3c2a3600f9b8ea11f7991c8180ecc24ea4266a6b
M: aec976f33acd4971cf5e087ceaf2b5e606c56f36 192.168.58.101:7000
   slots:5461-10922 (5462 slots) master
   1 additional replica(s)
M: 514548a9a01d7e125d716fd51d9ffd36165a2647 192.168.58.101:7001
   slots:0-5460 (5461 slots) master
   0 additional replica(s)
M: 3c2a3600f9b8ea11f7991c8180ecc24ea4266a6b 192.168.58.102:7001
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

This matches the expectation: the killed node d5ec4dee922385007f09005d0ef24024f3d513a3 no longer appears in the cluster, and its slave 514548a9a01d7e125d716fd51d9ffd36165a2647 has become a master serving slots 0-5460. Note also that once a node has been killed, pointing check directly at it fails with a connection error, so the check has to go through a node that is still alive.

Second test: kill the master 192.168.58.101:7000, id aec976f33acd4971cf5e087ceaf2b5e606c56f36.

On 192.168.58.101:

[root@localhost src]# ps -ef | grep redis
root 2483 1 0 14:00 ? 00:00:13 redis-server *:7000 [cluster]
root 2488 1 0 14:00 ? 00:00:13 redis-server *:7001 [cluster]
root 2497 1 0 14:00 ? 00:00:13 redis-server *:7002 [cluster]
root 3001 2352 0 14:44 pts/0 00:00:00 grep redis
[root@localhost src]# kill 2483
[root@localhost src]# ./redis-trib.rb check 192.168.58.102:7001
Connecting to node 192.168.58.102:7001: OK
Connecting to node 192.168.58.101:7002: OK
Connecting to node 192.168.58.101:7001: OK
Connecting to node 192.168.58.102:7002: OK
>>> Performing Cluster Check (using node 192.168.58.102:7001)
M: 3c2a3600f9b8ea11f7991c8180ecc24ea4266a6b 192.168.58.102:7001
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
S: 3a1962324921188520896b1e9329e210906c1641 192.168.58.101:7002
   slots: (0 slots) slave
   replicates 3c2a3600f9b8ea11f7991c8180ecc24ea4266a6b
M: 514548a9a01d7e125d716fd51d9ffd36165a2647 192.168.58.101:7001
   slots:0-5460 (5461 slots) master
   0 additional replica(s)
M: 990c7f1b44034646cacb51a7754668ee5ada6005 192.168.58.102:7002
   slots:5461-10922 (5462 slots) master
   0 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
As expected, its slave 990c7f1b44034646cacb51a7754668ee5ada6005 has become a master.

Conclusion: if the killed master has a slave, the slave is promoted, all slots stay covered, and the cluster is not affected.

Third test: kill a master that has no slave, 192.168.58.101:7001 (id 514548a9a01d7e125d716fd51d9ffd36165a2647).

On 192.168.58.101:

[root@localhost src]# ps -ef | grep redis
root 2488 1 0 14:00 ? 00:00:15 redis-server *:7001 [cluster]
root 2497 1 0 14:00 ? 00:00:15 redis-server *:7002 [cluster]
root 3029 2352 0 14:49 pts/0 00:00:00 grep redis
[root@localhost src]# kill 2488
[root@localhost src]# ./redis-trib.rb check 192.168.58.102:7001
Connecting to node 192.168.58.102:7001: OK
Connecting to node 192.168.58.101:7002: OK
Connecting to node 192.168.58.102:7002: OK
>>> Performing Cluster Check (using node 192.168.58.102:7001)
M: 3c2a3600f9b8ea11f7991c8180ecc24ea4266a6b 192.168.58.102:7001
   slots:10923-16383 (5461 slots) master
   1 additional replica(s)
S: 3a1962324921188520896b1e9329e210906c1641 192.168.58.101:7002
   slots: (0 slots) slave
   replicates 3c2a3600f9b8ea11f7991c8180ecc24ea4266a6b
M: 990c7f1b44034646cacb51a7754668ee5ada6005 192.168.58.102:7002
   slots:5461-10922 (5462 slots) master
   0 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[ERR] Not all 16384 slots are covered by nodes.
[root@localhost src]# redis-cli -c -h 192.168.58.101 -p 7001
Could not connect to Redis at 192.168.58.101:7001: Connection refused
not connected> set b
[root@localhost src]# redis-cli -c -h 192.168.58.102 -p 7001
192.168.58.102:7001> set b c
(error) CLUSTERDOWN The cluster is down. Use CLUSTER INFO for more information
192.168.58.102:7001>

This time the cluster fails. Slots 0-5460, which 514548a9a01d7e125d716fd51d9ffd36165a2647 was serving, are no longer covered by any node, and a write sent to one of the surviving nodes returns (error) CLUSTERDOWN The cluster is down.

I have not found a good way to recover from this so far. The only thing that worked was to delete everything in each node directory except redis.conf, kill every node process, restart the servers, and create the cluster again.
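The manual recovery described above can be sketched as a small script. This is an illustration under assumptions rather than a tested procedure: the function names coverage_ok and clean_node_dir are hypothetical, the coverage test simply looks for the "[OK] All 16384 slots covered." line that redis-trib.rb check prints in the transcripts above, and the destructive steps are left commented out. Rebuilding this way discards all data in the cluster.

```shell
#!/bin/sh
# Sketch of the manual recovery described in the post. Function names are hypothetical.

# Succeeds only if `redis-trib.rb check` output (read from stdin)
# contains the full-coverage line shown in the transcripts.
coverage_ok() {
    grep -q '^\[OK\] All 16384 slots covered\.'
}

# Delete everything in one node directory except redis.conf.
clean_node_dir() {
    dir="$1"
    for f in "$dir"/*; do
        if [ -e "$f" ] && [ "$(basename "$f")" != "redis.conf" ]; then
            rm -rf "$f"
        fi
    done
}

# Intended use, run from the redis-cluster directory on each machine
# (commented out because it stops every node and wipes its data):
#
# if ! ./redis-trib.rb check 192.168.58.102:7001 | coverage_ok; then
#     pkill redis-server                          # stop every node on this machine
#     for port in 7000 7001 7002; do clean_node_dir "$port"; done
#     sh servers-start.sh                         # restart the nodes
#     sh cluster-start.sh                         # on one machine only: recreate the cluster
# fi
```

Keeping the check and the cleanup as separate functions makes it possible to run the coverage test on its own, for example from a monitoring job, without touching any files.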