How to Simulate Session Expiration (Session Expired) in ZooKeeper

Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/u010096900/article/details/81909498

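Introduction

Sessions are central to how ZooKeeper operates: when a session ends for any reason, the ephemeral nodes created during that session are deleted. In production we have to handle session timeouts caused by network problems, so that once the network recovers the application restores its session automatically and the service stays available. This article explains how to simulate a session timeout so that this behavior can be tested before the application goes to production.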

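Use Case

Requests within a session are executed in FIFO order. Once a client connects to a server, a session is established and a session ID is assigned to the client. The client sends heartbeats at regular intervals to keep the session alive. If the ZooKeeper ensemble receives no heartbeat from the client for longer than the session timeout (negotiated between client and server when the session is established), it considers the client dead and expires the session. When a session ends for any reason, the ephemeral nodes created during that session are deleted as well.

To be robust against network failures, the application must be able to recover its session automatically and recreate its ephemeral nodes. For testing, we need a way to force a session expiration so that this recovery logic can be exercised.

In the scenario below, 10.77.16.40:2181, 10.77.16.60:2181 and 10.77.16.67:2181 form a ZooKeeper cluster. The application is deployed on server 10.23.3.85 and registers its service with ZooKeeper.

After the service has registered successfully, the corresponding node information can be viewed: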

[zk: 10.77.16.40:2181(CONNECTED) 16] ls2 /wg/index_server/vertical_70/shard_0 watch
[search0000000041]
cZxid = 0x206206d5d1c3
ctime = Thu Aug 16 17:00:20 CST 2018
mZxid = 0x206206d5d1c3
mtime = Thu Aug 16 17:00:20 CST 2018
pZxid = 0x2062070de073
cversion = 83
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 1

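Simulating Session Expiration

A fairly simple way to simulate session expiration is to use the local firewall to drop the relevant network packets until the session expires. The author uses CentOS 7: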

[jinguang1@localhost wgis]$ lsb_release -a
LSB Version:	:core-4.1-amd64:core-4.1-noarch:cxx-4.1-amd64:cxx-4.1-noarch:desktop-4.1-amd64:desktop-4.1-noarch:languages-4.1-amd64:languages-4.1-noarch:printing-4.1-amd64:printing-4.1-noarch
Distributor ID:	CentOS
Description:	CentOS Linux release 7.3.1611 (Core)
Release:	7.3.1611
Codename:	Core

On server 10.23.3.85, iptables can be used to drop the packets exchanged with ZooKeeper. The script is as follows:

#!/bin/bash
# Must be run as root on the application server (10.23.3.85).

# Drop outbound packets to port 2181 on each ZooKeeper server.
iptables -A OUTPUT -d 10.77.16.40 -p tcp --dport 2181 -j DROP
iptables -A OUTPUT -d 10.77.16.60 -p tcp --dport 2181 -j DROP
iptables -A OUTPUT -d 10.77.16.67 -p tcp --dport 2181 -j DROP
# Drop inbound packets from port 2181 on each ZooKeeper server.
iptables -A INPUT -s 10.77.16.67 -p tcp --sport 2181 -j DROP
iptables -A INPUT -s 10.77.16.60 -p tcp --sport 2181 -j DROP
iptables -A INPUT -s 10.77.16.40 -p tcp --sport 2181 -j DROP

The script above drops packets sent to and received from ZooKeeper, which causes the session to expire. Once the rules are in place, the ZooKeeper client logs the following:

2018-08-21 16:36:18,961:28937(0x7ffb0affd700):ZOO_ERROR@handle_socket_error_msg@1643: Socket [10.77.16.40:2181] zk retcode=-7, errno=110(Connection timed out): connection to 10.77.16.40:2181 timed out (exceeded timeout by 2ms)
2018-08-21 16:36:22,294:28937(0x7ffb0affd700):ZOO_ERROR@handle_socket_error_msg@1643: Socket [10.77.16.60:2181] zk retcode=-7, errno=110(Connection timed out): connection to 10.77.16.60:2181 timed out (exceeded timeout by 0ms)
2018-08-21 16:36:25,628:28937(0x7ffb0affd700):ZOO_ERROR@handle_socket_error_msg@1643: Socket [10.77.16.67:2181] zk retcode=-7, errno=110(Connection timed out): connection to 10.77.16.67:2181 timed out (exceeded timeout by 0ms)
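The spacing of these timeout messages can be checked with a small script. The following is a minimal sketch, not part of the original setup: the function name timeout_gaps_ms and the file name client.log are hypothetical, and it assumes the log format shown above with all lines falling on the same day.

```shell
#!/bin/bash
# Print the gap in milliseconds between consecutive connection-timeout
# errors in a ZooKeeper C client log.
# Assumes lines of the form "YYYY-MM-DD HH:MM:SS,mmm:pid(...):ZOO_ERROR@..."
# and that all matching lines belong to the same day.
timeout_gaps_ms() {
    grep 'ZOO_ERROR@handle_socket_error_msg' "$1" |
    awk '{
        # Field 2 starts with HH:MM:SS,mmm; split it on ":" and ",".
        split($2, t, /[:,]/)
        ms = ((t[1] * 60 + t[2]) * 60 + t[3]) * 1000 + t[4]
        if (NR > 1) print ms - prev
        prev = ms
    }'
}

# Usage (client.log is a hypothetical file name):
#   timeout_gaps_ms client.log
```

For the three timeout lines above it prints 3333 and 3334, so the client moves on to the next server roughly every 3.3 seconds. That is about two thirds of the sessionTimeout=5000 that appears in the client log further down; the ratio is an observation about this particular log rather than something stated in the article.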

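The corresponding ephemeral node on ZooKeeper has been deleted: the child list is now empty, cversion has changed from 83 to 84, and numChildren has dropped from 1 to 0. This confirms that the session has expired.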

[zk: 10.77.16.40:2181(CONNECTED) 23] ls2 /wg/index_server/vertical_70/shard_0
[]
cZxid = 0x206206d5d1c3
ctime = Thu Aug 16 17:00:20 CST 2018
mZxid = 0x206206d5d1c3
mtime = Thu Aug 16 17:00:20 CST 2018
pZxid = 0x2062070df904
cversion = 84
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 0

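Once the session has expired, how do we restore the network? It is very simple: flush the firewall rules with the command below. Note that this removes every rule in the filter table, not only the ones added above; a more fine-grained approach is to delete the added rules one by one.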

iptables -F
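The rule-by-rule cleanup mentioned above could look roughly like the following sketch. It relies on the fact that iptables -D with the same rule specification removes a rule that was added with -A. The names ZK_SERVERS, ZK_PORT, APPLY, zk_rules and run are illustrative and do not come from the original script; by default the script only prints the commands, so it is safe to try on a host with other firewall rules.

```shell
#!/bin/bash
# Sketch: delete only the DROP rules added by the blocking script,
# leaving any other firewall rules on the host untouched.
# ZK_SERVERS, ZK_PORT, APPLY, zk_rules and run are illustrative names.

ZK_SERVERS="10.77.16.40 10.77.16.60 10.77.16.67"
ZK_PORT=2181

# Dry run by default: print the command instead of executing it.
# Set APPLY=1 (and run as root) to actually change the firewall.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Emit one OUTPUT and one INPUT rule command per ZooKeeper server.
# $1 is the iptables action: -A to add the rules, -D to delete them.
zk_rules() {
    local action="$1" ip
    for ip in $ZK_SERVERS; do
        run iptables "$action" OUTPUT -d "$ip" -p tcp --dport "$ZK_PORT" -j DROP
        run iptables "$action" INPUT -s "$ip" -p tcp --sport "$ZK_PORT" -j DROP
    done
}

# Show the six delete commands that undo the blocking script.
zk_rules -D
```

Calling zk_rules -A instead reproduces the six rules of the blocking script, so the same helper can be used for both halves of the test.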

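After the network is restored, the ZooKeeper client establishes a new session. In the log, the application frees the resources of the old session and initializes a new connection, and the resulting session ID differs from the old one: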

2018-08-21 17:03:56,671:28937(0x7ffb09ffb700):ZOO_INFO@zookeeper_close@2528: Freeing zookeeper resources for sessionId=0x6068ab7b8d1972

I0821 17:03:57.669409 28946 naming_registry.cc:185] GET FROM ZK zk://10.77.16.40:2181,10.77.16.60:2181,10.77.16.67:2181/weigraph/mutation_proxy/*
I0821 17:03:57.669440 28946 naming_registry.cc:380] ZKClient start to reconnect zookeeper
2018-08-21 17:03:57,669:28937(0x7ffb1a595700):ZOO_INFO@log_env@712: Client environment:zookeeper.version=zookeeper C client 3.4.6
2018-08-21 17:03:57,669:28937(0x7ffb1a595700):ZOO_INFO@log_env@716: Client environment:host.name=localhost.localdomain
2018-08-21 17:03:57,669:28937(0x7ffb1a595700):ZOO_INFO@log_env@723: Client environment:os.name=Linux
2018-08-21 17:03:57,669:28937(0x7ffb1a595700):ZOO_INFO@log_env@724: Client environment:os.arch=3.10.0-514.6.2.el7.toa.2.x86_64
2018-08-21 17:03:57,669:28937(0x7ffb1a595700):ZOO_INFO@log_env@725: Client environment:os.version=#1 SMP Tue Oct 31 14:54:31 CST 2017
2018-08-21 17:03:57,669:28937(0x7ffb1a595700):ZOO_INFO@log_env@733: Client environment:user.name=jinguang1
2018-08-21 17:03:57,669:28937(0x7ffb1a595700):ZOO_INFO@log_env@741: Client environment:user.home=/root
2018-08-21 17:03:57,669:28937(0x7ffb1a595700):ZOO_INFO@log_env@753: Client environment:user.dir=/data0/attempt_404_4_2/wgis
2018-08-21 17:03:57,669:28937(0x7ffb1a595700):ZOO_INFO@zookeeper_init@786: Initiating client connection, host=10.77.16.40:2181,10.77.16.60:2181,10.77.16.67:2181 sessionTimeout=5000 watcher=0x7a0520 sessionId=0 sessionPasswd=<null> context=0x7ffb0c0008c0 flags=0
[New Thread 0x7ffb09ffb700 (LWP 32334)]
[New Thread 0x7ffb0affd700 (LWP 32335)]
2018-08-21 17:03:57,674:28937(0x7ffb09ffb700):ZOO_INFO@check_events@1705: initiated connection to server [10.77.16.67:2181]
2018-08-21 17:03:57,678:28937(0x7ffb09ffb700):ZOO_INFO@check_events@1752: session establishment complete on server [10.77.16.67:2181], sessionId=0x26068ab52881a66, negotiated timeout=5000
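To confirm from a longer log that a new session really replaced the old one, the session IDs can be extracted mechanically. The following is a minimal sketch under the assumption that the log uses the sessionId=0x... format shown above; the function name session_ids and the file name client.log are hypothetical.

```shell
#!/bin/bash
# List the distinct session IDs found in a ZooKeeper C client log,
# in order of first appearance. The placeholder "sessionId=0" logged by
# zookeeper_init is skipped because it has no 0x prefix.
session_ids() {
    grep -o 'sessionId=0x[0-9a-f]*' "$1" | awk '!seen[$0]++'
}

# Usage (client.log is a hypothetical file name):
#   session_ids client.log
```

Run against the excerpt above, this lists sessionId=0x6068ab7b8d1972 followed by sessionId=0x26068ab52881a66. Two different IDs mean the old session was not resumed and the application is running on a fresh one, so its ephemeral nodes have to be recreated.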
