k8s_etcd Installation

I. IP Allocation
192.168.19.31 192.168.19.32 192.168.19.33

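II. Creating the CA Certificates
Note: the certificates in that walkthrough were generated for the IPs 242.31, 242.32 and 242.33, so take care that yours list the IPs actually used here.
Link: https://mp.csdn.net/mdeditor/86487168#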

III. Installing etcd
1. Place the certificates in /etc/kubernetes/ssl

# mkdir  -p /etc/kubernetes/ssl
# cp etcd.tar.gz /etc/kubernetes/ssl/
# cd /etc/kubernetes/ssl/
# tar zxvf etcd.tar.gz
# ls
etcd-ca  etcd-cert  etcd-key  etcd.tar.gz
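
Before going further, it can help to confirm that the three files the later configuration points at are really present. The following is a minimal sketch, not part of the original procedure; the helper name check_etcd_certs is hypothetical, and the file names are the ones from the listing above.

```shell
# Hypothetical helper: verify that the CA, certificate and key exist and are
# non-empty in the given directory (default: /etc/kubernetes/ssl).
check_etcd_certs() {
  dir="${1:-/etc/kubernetes/ssl}"
  missing=0
  for f in etcd-ca etcd-cert etcd-key; do
    if [ ! -s "$dir/$f" ]; then
      echo "missing or empty: $dir/$f" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Example use: check_etcd_certs && echo "certificates in place"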

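2. Download the etcd binaries, etcd and etcdctl: https://github.com/etcd-io/etcd/releases/
(1) Download the files

# Download and extract the archive, then copy etcd and etcdctl to /usr/local/bin/
# wget https://github.com/etcd-io/etcd/releases/download/v3.3.11/etcd-v3.3.11-linux-amd64.tar.gz
# tar zxvf etcd-v3.3.11-linux-amd64.tar.gz
# cd etcd-v3.3.11-linux-amd64
# cp etcd etcdctl /usr/local/bin/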

(2) Configure etcd

# mkdir -p /etc/etcd  /var/lib/etcd		# /etc/etcd holds the configuration file, /var/lib/etcd holds the etcd data
# cat /lib/systemd/system/etcd.service
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos

[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/
EnvironmentFile=-/etc/etcd/etcd.conf
ExecStart=/usr/local/bin/etcd
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

# cat /etc/etcd/etcd.conf		# configuration file contents
# [member]	
ETCD_NAME="etcd1"
ETCD_DATA_DIR="/var/lib/etcd"
ETCD_SNAPSHOT_COUNT="100000"
ETCD_HEARTBEAT_INTERVAL="100"
ETCD_ELECTION_TIMEOUT="1000"
ETCD_LISTEN_CLIENT_URLS="https://0.0.0.0:2379"
# local node IP
ETCD_LISTEN_PEER_URLS="https://192.168.19.31:2380"
ETCD_MAX_SNAPSHOTS="5"
ETCD_MAX_WALS="5"

#[cluster]
# URL advertised to the other members of the cluster (local node IP)
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.19.31:2380"

# Initial list of cluster members
ETCD_INITIAL_CLUSTER="etcd1=https://192.168.19.31:2380,etcd2=https://192.168.19.32:2380,etcd3=https://192.168.19.33:2380"

# Initial cluster state: "new" for a new cluster, "existing" to join an existing one
ETCD_INITIAL_CLUSTER_STATE="new"

# Cluster token
ETCD_INITIAL_CLUSTER_TOKEN="K8SETCD"

# URL advertised to external clients (local node IP)
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.19.31:2379"

# [security] 
ETCD_TRUSTED_CA_FILE="/etc/kubernetes/ssl/etcd-ca"
ETCD_CERT_FILE="/etc/kubernetes/ssl/etcd-cert" 
ETCD_KEY_FILE="/etc/kubernetes/ssl/etcd-key" 
ETCD_CLIENT_CERT_AUTH="true" 

ETCD_PEER_TRUSTED_CA_FILE="/etc/kubernetes/ssl/etcd-ca"
ETCD_PEER_CERT_FILE="/etc/kubernetes/ssl/etcd-cert" 
ETCD_PEER_KEY_FILE="/etc/kubernetes/ssl/etcd-key"
ETCD_PEER_CLIENT_CERT_AUTH="true"
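
Only the member name and the URLs carrying the local IP differ from node to node; the rest of the file is identical. As a rough sketch of how those lines could be derived from one list of nodes (the NODES variable and both function names are hypothetical, and the name/IP pairs are taken from the IP allocation at the top of this article):

```shell
# name=IP pairs for the three initial members.
NODES="etcd1=192.168.19.31 etcd2=192.168.19.32 etcd3=192.168.19.33"

# Build the ETCD_INITIAL_CLUSTER value from NODES.
initial_cluster() {
  out=""
  for pair in $NODES; do
    name="${pair%%=*}"
    ip="${pair#*=}"
    out="${out:+$out,}$name=https://$ip:2380"
  done
  echo "$out"
}

# Print the settings that differ per node; $1 is the member name.
print_node_settings() {
  for pair in $NODES; do
    name="${pair%%=*}"
    ip="${pair#*=}"
    if [ "$name" = "$1" ]; then
      echo "ETCD_NAME=\"$name\""
      echo "ETCD_LISTEN_PEER_URLS=\"https://$ip:2380\""
      echo "ETCD_INITIAL_ADVERTISE_PEER_URLS=\"https://$ip:2380\""
      echo "ETCD_ADVERTISE_CLIENT_URLS=\"https://$ip:2379\""
      echo "ETCD_INITIAL_CLUSTER=\"$(initial_cluster)\""
      return 0
    fi
  done
  echo "unknown node: $1" >&2
  return 1
}
```

Running print_node_settings etcd2 prints the five lines to put into /etc/etcd/etcd.conf on 192.168.19.32; the remaining settings can be copied unchanged.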

3. Start etcd and enable it at boot

# systemctl daemon-reload
# systemctl restart etcd	# start
# systemctl enable etcd		# enable at boot

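4. Notes
1. In the configuration file on 192.168.19.32, change "new" to "existing" (the accepted values are "new" and "existing", not "exist"). When all three members are bootstrapped together as a new cluster, leaving "new" on every node should also work.
2. In /etc/etcd/etcd.conf, change every value commented as the local node IP to the IP of the current VM.
3. If error messages like the following appear: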
Jan 14 21:26:34 dy1931 etcd: health check for peer 555488f96f68b5a1 could not connect: dial tcp 192.168.19.32:2380: getsockopt: connection refused
Jan 14 21:26:34 dy1931 etcd: health check for peer 93b51556061d4fa7 could not connect: dial tcp 192.168.19.33:2380: getsockopt: connection refused
Solution: start etcd2 and etcd3 first, then start etcd1.
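
The "connection refused" lines only mean that the peer port is not listening yet. A small sketch of how etcd1 could wait for its peers before starting follows; the helper name wait_for_port is hypothetical, and it assumes bash (for /dev/tcp) and the coreutils timeout command are available.

```shell
# Hypothetical helper: succeed once host:port accepts TCP connections,
# or fail after the given number of attempts (one second apart, default 30).
wait_for_port() {
  host="$1"; port="$2"; tries="${3:-30}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    # bash opens a TCP connection when redirecting to /dev/tcp/HOST/PORT
    if timeout 1 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

Example use on etcd1: wait_for_port 192.168.19.32 2380 && wait_for_port 192.168.19.33 2380 && systemctl restart etcd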

5. Command reference
1 List members

# etcdctl --cert-file=/etc/kubernetes/ssl/etcd-cert --key-file=/etc/kubernetes/ssl/etcd-key --ca-file=/etc/kubernetes/ssl/etcd-ca --endpoints=https://192.168.19.31:2379 member list
555488f96f68b5a1: name=etcd2 peerURLs=https://192.168.19.32:2380 clientURLs=https://192.168.19.32:2379 isLeader=false
93b51556061d4fa7: name=etcd3 peerURLs=https://192.168.19.33:2380 clientURLs=https://192.168.19.33:2379 isLeader=false
e9818d75d2bf50f8: name=etcd1 peerURLs=https://192.168.19.31:2380 clientURLs=https://192.168.19.31:2379 isLeader=true

2 Check cluster health

# etcdctl --cert-file=/etc/kubernetes/ssl/etcd-cert --key-file=/etc/kubernetes/ssl/etcd-key --ca-file=/etc/kubernetes/ssl/etcd-ca --endpoints=https://192.168.19.31:2379 cluster-health
member 555488f96f68b5a1 is healthy: got healthy result from https://192.168.19.32:2379
member 93b51556061d4fa7 is healthy: got healthy result from https://192.168.19.33:2379
member e9818d75d2bf50f8 is healthy: got healthy result from https://192.168.19.31:2379

3 List all keys

# etcdctl ...... ls /
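
The same four TLS and endpoint flags are repeated in the member list and cluster-health commands above. One way to avoid retyping them is a small wrapper function; this is only a convenience sketch, and the names etcdctl_tls, SSL_DIR and ENDPOINT are hypothetical.

```shell
SSL_DIR=/etc/kubernetes/ssl
ENDPOINT=https://192.168.19.31:2379

# Hypothetical wrapper: prepend the TLS and endpoint flags used in this
# article, then pass the remaining arguments through to etcdctl unchanged.
etcdctl_tls() {
  etcdctl --cert-file="$SSL_DIR/etcd-cert" \
          --key-file="$SSL_DIR/etcd-key" \
          --ca-file="$SSL_DIR/etcd-ca" \
          --endpoints="$ENDPOINT" "$@"
}
```

With this in the shell, etcdctl_tls member list and etcdctl_tls cluster-health run the same commands as shown above.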

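IV. Expanding the Cluster
1. Install etcd on the VM 192.168.19.34 first; the steps are the same as above.
2. Run the command below on any one machine of the existing etcd cluster (19.31, 19.32, 19.33).
3. Note: the new node's IP was not included when the CA certificates were created, so the certificates must be regenerated and then distributed to every machine in the cluster.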

# etcdctl --cert-file=/etc/kubernetes/ssl/etcd-cert --key-file=/etc/kubernetes/ssl/etcd-key --ca-file=/etc/kubernetes/ssl/etcd-ca --endpoints=https://192.168.19.31:2379 member add etcd4 https://192.168.19.34:2380
Added member named etcd4 with ID 470f0f1a1e61e3 to cluster

ETCD_NAME="etcd4"
ETCD_INITIAL_CLUSTER="etcd4=https://192.168.19.34:2380,etcd2=https://192.168.19.32:2380,etcd3=https://192.168.19.33:2380,etcd1=https://192.168.19.31:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"

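# Write the three values printed above into /etc/etcd/etcd.conf on the new machine, then start etcd there
# systemctl daemon-reload
# systemctl restart etcd	# start
# systemctl enable etcd		# enable at boot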

# An etcd cluster should have an odd number of members, so one more node (192.168.19.35) is added; the final cluster state is:
# etcdctl --cert-file=/etc/kubernetes/ssl/etcd-cert --key-file=/etc/kubernetes/ssl/etcd-key --ca-file=/etc/kubernetes/ssl/etcd-ca --endpoints=https://192.168.19.31:2379 member list
555488f96f68b5a1: name=etcd2 peerURLs=https://192.168.19.32:2380 clientURLs=https://192.168.19.32:2379 isLeader=true
8feeaaa463df4ddb: name=etcd5 peerURLs=https://192.168.19.35:2380 clientURLs=https://192.168.19.35:2379 isLeader=false
93b51556061d4fa7: name=etcd3 peerURLs=https://192.168.19.33:2380 clientURLs=https://192.168.19.33:2379 isLeader=false
adaa227121dc5e1a: name=etcd4 peerURLs=https://192.168.19.34:2380 clientURLs=https://192.168.19.34:2379 isLeader=false
e9818d75d2bf50f8: name=etcd1 peerURLs=https://192.168.19.31:2380 clientURLs=https://192.168.19.31:2379 isLeader=false
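
The preference for an odd member count follows from the quorum rule: a cluster needs a strict majority of its members to make progress, so going from 3 to 4 members raises the quorum without raising the number of failures that can be tolerated. The arithmetic can be sketched as follows (the function names are illustrative only):

```shell
# Quorum is a strict majority of the members.
quorum() { echo $(( $1 / 2 + 1 )); }

# Tolerated failures = members minus quorum.
fault_tolerance() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 3 4 5; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
# members=3 quorum=2 tolerated_failures=1
# members=4 quorum=3 tolerated_failures=1
# members=5 quorum=3 tolerated_failures=2
```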

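V. Simulating a Server Failure and Replacing the Failed Machine
1. Approach: the etcd cluster IPs are specified in the k8s apiserver configuration, so the IPs of the etcd cluster must not change.
(1) Remove the failed machine (192.168.19.34) from the etcd cluster with etcdctl member remove …, and copy its configuration as a backup
(2) Prepare a replacement VM (192.168.19.34) and install etcd on it
(3) Add the new node to the cluster with etcdctl member add …, then start it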

2. Procedure

# Shut down the VM 192.168.19.34, then check the cluster status
# etcdctl --cert-file=/etc/kubernetes/ssl/etcd-cert --key-file=/etc/kubernetes/ssl/etcd-key --ca-file=/etc/kubernetes/ssl/etcd-ca --endpoints=https://192.168.19.31:2379 cluster-health
member 555488f96f68b5a1 is healthy: got healthy result from https://192.168.19.32:2379
member 8feeaaa463df4ddb is healthy: got healthy result from https://192.168.19.35:2379
member 93b51556061d4fa7 is healthy: got healthy result from https://192.168.19.33:2379
failed to check the health of member adaa227121dc5e1a on https://192.168.19.34:2379: Get https://192.168.19.34:2379/health: dial tcp 192.168.19.34:2379: getsockopt: connection refused
member adaa227121dc5e1a is unreachable: [https://192.168.19.34:2379] are all unreachable
member e9818d75d2bf50f8 is healthy: got healthy result from https://192.168.19.31:2379
cluster is healthy

# (1) Remove the failed node; note that removal is done by etcd member ID
# etcdctl --cert-file=/etc/kubernetes/ssl/etcd-cert --key-file=/etc/kubernetes/ssl/etcd-key --ca-file=/etc/kubernetes/ssl/etcd-ca --endpoints=https://192.168.19.31:2379 member remove adaa227121dc5e1a

# (2) Install etcd on the new machine (192.168.19.34); steps omitted
# (3) Add the node, then restart etcd on it
# etcdctl --cert-file=/etc/kubernetes/ssl/etcd-cert --key-file=/etc/kubernetes/ssl/etcd-key --ca-file=/etc/kubernetes/ssl/etcd-ca --endpoints=https://192.168.19.31:2379 member add etcd4 https://192.168.19.34:2380
# After restarting etcd on 192.168.19.34, check the cluster status again; data is synchronized to the newly added node automatically
# etcdctl --cert-file=/etc/kubernetes/ssl/etcd-cert --key-file=/etc/kubernetes/ssl/etcd-key --ca-file=/etc/kubernetes/ssl/etcd-ca --endpoints=https://192.168.19.31:2379 cluster-health
member 7f4693f7608f1b1 is healthy: got healthy result from https://192.168.19.34:2379
member 555488f96f68b5a1 is healthy: got healthy result from https://192.168.19.32:2379
member 8feeaaa463df4ddb is healthy: got healthy result from https://192.168.19.35:2379
member 93b51556061d4fa7 is healthy: got healthy result from https://192.168.19.33:2379
member e9818d75d2bf50f8 is healthy: got healthy result from https://192.168.19.31:2379
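
Step (1) removes the member by ID, not by name, so the ID has to be looked up in the member list output first. A small sketch of that lookup follows; the helper name member_id_by_name is hypothetical, and it assumes the v2 "ID: name=... peerURLs=..." line format shown in this article.

```shell
# Hypothetical helper: read `member list` output on stdin and print the ID
# of the member whose name matches $1 (nothing is printed if none matches).
member_id_by_name() {
  awk -v want="name=$1" '$2 == want { sub(/:$/, "", $1); print $1 }'
}
```

Example use: pipe the output of the member list command through member_id_by_name etcd4 and pass the printed ID to member remove.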

VI. Data Backup and Restore

# Backup
# etcdctl --cert-file=/etc/kubernetes/ssl/etcd-cert --key-file=/etc/kubernetes/ssl/etcd-key --ca-file=/etc/kubernetes/ssl/etcd-ca --endpoints=https://192.168.19.34:2379 backup --data-dir=/var/lib/etcd/ --backup-dir=/root/

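# Restore: this must be done on a separate, new VM
# etcd --data-dir=/home/etcd_backup/  --force-new-cluster &	# start etcd in the background; the old cluster membership is not carried over to the new machine, and a new cluster is then rebuilt around this machine
# pkill -9 etcd				# kill the etcd process
# systemctl restart etcd	# restart etcd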
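
The backup command above belongs to etcdctl's v2 API. If the data was written through the v3 API, as recent Kubernetes versions do, a v3 snapshot may be the more suitable tool. The sketch below is not part of the original procedure: it only prints the command for review instead of running it, and the helper name snapshot_cmd and the output path are assumptions.

```shell
SSL_DIR=/etc/kubernetes/ssl

# Hypothetical helper: print (not run) a v3 snapshot command.
# $1 = client endpoint, $2 = snapshot file to write.
snapshot_cmd() {
  printf 'ETCDCTL_API=3 etcdctl --cacert=%s/etcd-ca --cert=%s/etcd-cert --key=%s/etcd-key --endpoints=%s snapshot save %s\n' \
    "$SSL_DIR" "$SSL_DIR" "$SSL_DIR" "$1" "$2"
}

snapshot_cmd https://192.168.19.31:2379 /root/etcd-snapshot.db
```

Note that the v3 API uses --cacert, --cert and --key in place of the v2 flags --ca-file, --cert-file and --key-file.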

Reposted from blog.csdn.net/sun_xuegang/article/details/86480696