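Preface:
Deploying a Kubernetes cluster by hand is tedious, but kubeadm makes a very fast deployment possible. Doing it offline speeds things up further, because the network (images that often fail to download or download slowly) is no longer the bottleneck.
Below I walk through using kubeadm to quickly deploy a simple Kubernetes cluster suitable for testing (if you are comfortable with Linux, the whole thing can be done in about 5 minutes).
1. The servers used in this walkthrough and the components to install
The plan is to build the cluster from three VMware virtual machines. Physical machines work exactly the same way, so I will not go into that.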
Server IP | Operating system | Hardware | Kernel version | Components installed |
192.168.217.19 | CentOS Linux release 7.4.1708 (Core) | 2 CPU cores, 4 GB RAM, 100 GB disk | Linux master 5.16.9-1.el7.elrepo.x86_64 | Cluster role: master. kubeadm, kubelet, and Docker CE 20.10.5 |
192.168.217.20 | CentOS Linux release 7.4.1708 (Core) | 2 CPU cores, 4 GB RAM, 100 GB disk | Linux master 5.16.9-1.el7.elrepo.x86_64 | Cluster role: node. kubelet (plus kubeadm, which is needed to run kubeadm join) and Docker CE 20.10.5 |
192.168.217.21 | CentOS Linux release 7.4.1708 (Core) | 2 CPU cores, 4 GB RAM, 100 GB disk | Linux master 5.16.9-1.el7.elrepo.x86_64 | Cluster role: node. kubelet (plus kubeadm, which is needed to run kubeadm join) and Docker CE 20.10.5 |
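2.
Prerequisites:
A. Time server
I won't dwell on what a time server is for. It is an essential component: the clocks of all nodes must agree, for example so that certificate validity checks behave consistently.
Run on all three nodes: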
yum install ntp -y && systemctl enable ntpd
In /etc/ntp.conf on server 19:
server 127.127.1.0 prefer
fudge 127.127.1.0 stratum 10
In /etc/ntp.conf on servers 20 and 21:
server 192.168.217.19
Run on all three servers to restart the time service:
systemctl restart ntpd
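The configuration above makes server 19 the primary time server, and the other nodes synchronize with it. Run the following on a worker node; output like this means time synchronization is working: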
[root@node2 ~]# ntpq -p
remote refid st t when poll reach delay offset jitter
==============================================================================
*master LOCAL(0) 11 u 50 256 377 0.517 0.024 0.076
[root@node2 ~]# ntpstat
synchronised to NTP server (192.168.217.19) at stratum 12
time correct to within 25 ms
polling server every 256 s
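The check above can also be scripted. This is only a minimal sketch, assuming the ntpq -p output format shown above, where the peer currently selected for synchronization is marked with a leading asterisk. The function name has_selected_peer is my own, not part of ntp.

```shell
# Succeed if "ntpq -p" style output on stdin contains a peer line that
# starts with "*" (the peer currently selected for synchronization).
has_selected_peer() {
    grep -q '^\*'
}

# Usage on a worker node (requires ntpq):
#   ntpq -p | has_selected_peer && echo "time sync OK"
```

Right after restarting ntpd it can take a few polling intervals before a peer is selected, so a failure immediately after the restart does not necessarily mean the configuration is wrong.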
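B. Passwordless SSH between the cluster servers
Setting up passwordless SSH between the servers is basic enough that I won't cover it here.
C. Disable the firewall, swap, and SELinux on the cluster servers
Disable the firewall (run on all three servers):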
systemctl disable firewalld && systemctl stop firewalld
Turn off swap:
swapoff -a
I won't go over how to disable SELinux. Just check the result; the following output means it is disabled:
[root@node2 ~]# getenforce
Disabled
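One thing to keep in mind about the swap step: swapoff -a only lasts until the next reboot, because swap entries in /etc/fstab are activated again at boot. A minimal sketch of making it persistent, assuming GNU sed (as on CentOS 7) and a standard fstab layout. The helper name comment_out_swap is hypothetical.

```shell
# Comment out every active swap entry in an fstab-style file ($1) so that
# swap stays off after a reboot. Lines that are already commented are skipped,
# so running it twice does not add a second "#".
comment_out_swap() {
    sed -i -E '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' "$1"
}

# Usage (as root, on each node), after "swapoff -a":
#   comment_out_swap /etc/fstab
```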
D. Fix the hostnames; all three servers use the same hosts file
[root@node2 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.217.19 master k8s-master
192.168.217.20 node1 k8s-node1
192.168.217.21 node2 k8s-node2
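If you would rather script the hosts file than edit it by hand, something like the following could work. It is a minimal sketch: add_host_entry is a hypothetical helper that appends an entry only when no line for that IP exists yet, so it can be run repeatedly.

```shell
# Append "IP NAMES" to a hosts-style file unless a line for that IP exists.
# $1 = file, $2 = IP address, $3 = space-separated host names
add_host_entry() {
    if ! awk -v ip="$2" '$1 == ip { found = 1 } END { exit !found }' "$1"; then
        printf '%s %s\n' "$2" "$3" >> "$1"
    fi
}

# Usage (as root, on each of the three servers):
#   add_host_entry /etc/hosts 192.168.217.19 "master k8s-master"
#   add_host_entry /etc/hosts 192.168.217.20 "node1 k8s-node1"
#   add_host_entry /etc/hosts 192.168.217.21 "node2 k8s-node2"
```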
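F. Local yum repository and base software
For setting up the local repository, see my blog post "Linux的完全本地仓库搭建指南(科普扫盲贴)" (a beginner's guide to building a fully local Linux repository) on 晚风_END's CSDN blog.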
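G. Setting up the Docker environment
For setting up Docker, see my blog post "docker的离线安装以及本地化配置" (offline installation and local configuration of Docker) on 晚风_END's CSDN blog.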
3.
Upload the offline installation packages to the servers. The packages are:
[root@master ~]# ll
total 3016
-rw-r--r-- 1 root root 3069556 Oct 22 21:28 flannel #network plugin
drwxr-xr-x 3 root root 4096 Oct 22 21:10 k8s-offline-rpm #RPM packages; this directory is used as a local yum repository
drwxr-xr-x 2 root root 4096 Oct 22 21:11 kubeadin-offline-image #Docker images
-rw-r--r-- 1 root root 4813 Oct 22 21:18 kube-flannel.yml #flannel deployment manifest
#All of the above are placed in /root. Worker nodes need only the first three and do not need kube-flannel.yml
Enable the local repository:
cat > /etc/yum.repos.d/k8s.repo <<EOF
[k8s]
name=k8s
baseurl=file:///root/k8s-offline-rpm
enabled=1
gpgcheck=0
EOF
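A file:// baseurl only works if the directory contains yum metadata (repodata/repomd.xml), which is normally produced by createrepo. Whether your copy of the RPM directory already ships it is something to verify; a minimal check might look like this. The helper name has_repo_metadata is hypothetical.

```shell
# Succeed if the directory ($1) contains yum repository metadata.
has_repo_metadata() {
    [ -f "$1/repodata/repomd.xml" ]
}

# Usage:
#   has_repo_metadata /root/k8s-offline-rpm \
#       || echo "no repodata found: run createrepo on the directory first"
```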
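Load the offline Docker images (run from /root; absolute paths are used so the working directory does not change for the following commands):
for i in /root/kubeadin-offline-image/*; do docker load -i "$i"; done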
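Before running kubeadm init it can be worth confirming that every image kubeadm expects is actually present locally. The sketch below compares two plain-text lists. It assumes the offline images are tagged under the same repository that is later passed to --image-repository, which I have not verified for every bundle. missing_images is a hypothetical helper name.

```shell
# Print every image listed in file $1 (one per line) that does not appear
# as an identical line in file $2.
# Note: grep exits non-zero when nothing is printed, i.e. when nothing is missing.
missing_images() {
    grep -v -x -F -f "$2" "$1"
}

# Usage on the master (after kubeadm is installed):
#   kubeadm config images list --kubernetes-version v1.22.2 \
#       --image-repository registry.aliyuncs.com/google_containers > required.txt
#   docker images --format '{{.Repository}}:{{.Tag}}' > present.txt
#   missing_images required.txt present.txt
```

Empty output means all required images were found. Any name that is printed would otherwise have to be pulled over the network during init.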
Make the flannel plugin binary executable:
chmod a+x flannel
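Install the required packages (on all three nodes; the worker nodes also need kubeadm in order to run kubeadm join):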
yum install -y kubeadm-1.22.2 kubelet-1.22.2 kubectl-1.22.2 conntrack-tools libseccomp \
libtool-ltdl device-mapper-persistent-data lvm2
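4. Cluster initialization and joining the worker nodes
Method 1: initialize from the command line:
Because the yum step above installed 1.22.2, the version specified here is also 1.22.2. apiserver-advertise-address is the IP of the master server.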
kubeadm init \
--apiserver-advertise-address=192.168.217.19 \
--image-repository registry.aliyuncs.com/google_containers \
--kubernetes-version v1.22.2 \
--service-cidr=10.96.0.0/12 \
--pod-network-cidr=10.244.0.0/16
The init command prints the following:
[init] Using Kubernetes version: v1.22.2
[preflight] Running pre-flight checks
[WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki" #certificates are generated automatically by kubeadm
[certs] Generating "ca" certificate and key #generated automatically by kubeadm
[certs] Generating "apiserver" certificate and key #generated automatically by kubeadm
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local master] and IPs [10.96.0.1 192.168.217.19] #generated automatically by kubeadm; 10.96.0.1 is the ClusterIP of the kubernetes Service (the first address of the service CIDR), not the DNS address
[certs] Generating "apiserver-kubelet-client" certificate and key #generated automatically by kubeadm
[certs] Generating "front-proxy-ca" certificate and key #generated automatically by kubeadm
[certs] Generating "front-proxy-client" certificate and key #generated automatically by kubeadm
[certs] Generating "etcd/ca" certificate and key #generated automatically by kubeadm
[certs] Generating "etcd/server" certificate and key #generated automatically by kubeadm
[certs] etcd/server serving cert is signed for DNS names [localhost master] and IPs [192.168.217.19 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key #generated automatically by kubeadm; etcd certificate
[certs] etcd/peer serving cert is signed for DNS names [localhost master] and IPs [192.168.217.19 127.0.0.1 ::1] #generated automatically by kubeadm; etcd certificate
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes" #kubeconfig files are written to this directory
[kubeconfig] Writing "admin.conf" kubeconfig file #generates the admin kubeconfig
[kubeconfig] Writing "kubelet.conf" kubeconfig file #generates the kubelet kubeconfig
[kubeconfig] Writing "controller-manager.conf" kubeconfig file #generates the controller-manager kubeconfig
[kubeconfig] Writing "scheduler.conf" kubeconfig file #generates the scheduler kubeconfig
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env" #kubelet flags file
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml" #kubelet configuration file
[kubelet-start] Starting the kubelet #starts the kubelet service
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver" #writes the static Pod manifest for kube-apiserver (the kubelet then starts it)
[control-plane] Creating static Pod manifest for "kube-controller-manager" #writes the static Pod manifest for kube-controller-manager
[control-plane] Creating static Pod manifest for "kube-scheduler" #writes the static Pod manifest for kube-scheduler
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests" #writes the static Pod manifest for etcd
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 13.504364 seconds #health check passed
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace #the configuration is stored in a ConfigMap
[kubelet] Creating a ConfigMap "kubelet-config-1.22" in namespace kube-system with the configuration for the kubelets in the cluster #creates a ConfigMap for the kubelet configuration
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node master as control-plane by adding the labels: [node-role.kubernetes.io/master(deprecated) node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers] #labels the node
[mark-control-plane] Marking the node master as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule] #taints the master node
[bootstrap-token] Using token: b1zldq.89t1aea8szja9d7l #the bootstrap token
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes #RBAC rules are set up
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials #RBAC rules are set up
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token #RBAC rules are set up
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace #cluster information is stored in a ConfigMap
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key #sets the kubelet certificate
[addons] Applied essential addon: CoreDNS #deploys the CoreDNS add-on
[addons] Applied essential addon: kube-proxy #deploys the kube-proxy add-on
Your Kubernetes control-plane has initialized successfully! #the control plane is up
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube #persistent setup: the kubeconfig is copied into the user's home directory
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf #temporary: the variable only lasts for the current shell
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
#a Pod network add-on has to be applied at this point (flannel is used later in this article)
Then you can join any number of worker nodes by running the following on each as root:
#command for joining worker nodes
kubeadm join 192.168.217.19:6443 --token b1zldq.89t1aea8szja9d7l \
--discovery-token-ca-cert-hash sha256:6ac4ccaf392e4173b7fd9c09cebfd0e2d7eb5ff5a826f39409701fe012ad2ba4
Method 2: initialize from a configuration file:
The file is named kubeadm-init.yaml. A template can be generated with kubeadm:
kubeadm config print init-defaults > kubeadm-init.yaml
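Eight places in the template need to be changed:
- ttl: 24h0m0s changes to ttl: "0", so the token created at init never expires
- advertiseAddress: 1.2.3.4 changes to advertiseAddress: 192.168.217.19, the IP of the master node
- name: node changes to name: master, i.e. the hostname of the master node (my master node's hostname is master)
- dns: {} changes to dns: type: CoreDNS, which specifies the cluster DNS type. (As far as I know, the kubeadm.k8s.io/v1beta3 API dropped the dns.type field because CoreDNS is the only supported option, so leaving dns: {} unchanged should also work.)
- imageRepository: k8s.gcr.io changes to the Aliyun mirror, registry.aliyuncs.com/google_containers, which improves download speed by using a nearby mirror for the images
- podSubnet: "10.244.0.0/16" is added; the original template does not contain it. It is equivalent to the --pod-network-cidr flag of kubeadm init, and this range is used by Pods.
- serviceSubnet: "" can be left unset; the default is 10.96.0.0/12. This range is used by Service resources.
- kubernetesVersion: 1.22.2 is the version of kubeadm and kubelet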
Checking the version:
[root@master ~]# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.2", GitCommit:"8b5a19147530eaac9476b0ab82980b4088bbc1b2", GitTreeState:"clean", BuildDate:"2021-09-15T21:37:34Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"linux/amd64"}
Example init configuration file, kubeadm-init.yaml:
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: "0"
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.217.19
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  imagePullPolicy: IfNotPresent
  name: master
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.22.2
networking:
  dnsDomain: cluster.local
  podSubnet: "10.244.0.0/16"
  serviceSubnet: ""
scheduler: {}
Initialize the cluster from the configuration file:
kubeadm init --config=kubeadm-init.yaml
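Since kubernetesVersion in the file has to agree with the installed kubeadm, a quick comparison before running init can save a failed attempt. This is a minimal sketch; config_k8s_version is a hypothetical helper, and it assumes the value is written unquoted on a single line as in the example file.

```shell
# Print the kubernetesVersion value from a kubeadm config file ($1),
# with any leading "v" removed. Only the first occurrence is used.
config_k8s_version() {
    awk '$1 == "kubernetesVersion:" { sub(/^v/, "", $2); print $2; exit }' "$1"
}

# Usage (kubeadm version -o short prints e.g. v1.22.2):
#   [ "$(config_k8s_version kubeadm-init.yaml)" = "$(kubeadm version -o short | sed 's/^v//')" ] \
#       || echo "kubernetesVersion does not match the installed kubeadm"
```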
Joining the worker nodes:
Method 1: join from the command line. Run this command on nodes 20 and 21:
kubeadm join 192.168.217.19:6443 --token b1zldq.89t1aea8szja9d7l \
--discovery-token-ca-cert-hash sha256:6ac4ccaf392e4173b7fd9c09cebfd0e2d7eb5ff5a826f39409701fe012ad2ba4
The command prints the following:
[root@node1 ~]# kubeadm join 192.168.217.19:6443 --token b1zldq.89t1aea8szja9d7l \
> --discovery-token-ca-cert-hash sha256:6ac4ccaf392e4173b7fd9c09cebfd0e2d7eb5ff5a826f39409701fe012ad2ba4
[preflight] Running pre-flight checks
[WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
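A join can fail simply because the token was mistyped or has expired. With the command-line init the token is created with kubeadm's default TTL of 24 hours (the 24h0m0s seen in the init-defaults template). The sketch below only checks the token's format; is_bootstrap_token is a hypothetical helper, and the pattern is the documented bootstrap token form of six characters, a dot, and sixteen characters.

```shell
# Succeed if $1 has the bootstrap token format [a-z0-9]{6}.[a-z0-9]{16}.
# This checks the shape only, not whether the token is still valid.
is_bootstrap_token() {
    printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# Usage:
#   is_bootstrap_token b1zldq.89t1aea8szja9d7l || echo "malformed token"
#
# If the token has expired, a fresh join command can be printed on the master:
#   kubeadm token create --print-join-command
```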
Method 2: join the worker nodes with a kubeadm config file
Generate a template file for joining a worker node:
kubeadm config print join-defaults >kubeadm-join.yaml
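Edit kubeadm-join.yaml. The following need to be changed:
- apiServerEndpoint: the address used to reach the apiserver, i.e. the master's API address. Here it is 192.168.217.19:6443. If the control plane is deployed as a multi-master cluster, use the cluster's VIP address instead.
- token and tlsBootstrapToken: the token used to connect to the master. It must match the token configured in InitConfiguration on the master.
- name: the name of the node. If you use a hostname, make sure the master can resolve it; otherwise use the IP address directly.
The kubeadm-join.yaml used in this example
The values of token and tlsBootstrapToken are both the token shown at the end of the init output above. The file is used on the worker node with this command: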
kubeadm join --config=kubeadm-join.yaml
The example file contents:
apiVersion: kubeadm.k8s.io/v1beta3
caCertPath: /etc/kubernetes/pki/ca.crt
kind: JoinConfiguration
discovery:
  bootstrapToken:
    apiServerEndpoint: 192.168.217.19:6443
    token: b1zldq.89t1aea8szja9d7l
    unsafeSkipCAVerification: true
  tlsBootstrapToken: b1zldq.89t1aea8szja9d7l
5. A very simple network plugin deployment
On the master node, run:
kubectl apply -f kube-flannel.yml
chmod a+x flannel
cp flannel /opt/cni/bin/
scp flannel node1:/opt/cni/bin/
scp flannel node2:/opt/cni/bin/
Restart the kubelet service on all three nodes:
systemctl restart kubelet
Now check the node status from the master node:
[root@master ~]# kubectl get no
NAME STATUS ROLES AGE VERSION
master Ready control-plane,master 26m v1.22.2
node1 Ready <none> 25m v1.22.2
node2 Ready <none> 25m v1.22.2
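It can take a little while after the kubelet restart before every node reports Ready. Rather than re-running the command by hand, the output can be counted in a loop. This is a minimal sketch; count_not_ready is a hypothetical helper that treats anything other than an exact "Ready" in the STATUS column (including "Ready,SchedulingDisabled") as not ready.

```shell
# Read "kubectl get nodes --no-headers" output on stdin and print how many
# nodes have a STATUS column that is not exactly "Ready".
count_not_ready() {
    awk '$2 != "Ready" { n++ } END { print n + 0 }'
}

# Usage on the master:
#   until [ "$(kubectl get nodes --no-headers | count_not_ready)" -eq 0 ]; do
#       sleep 5
#   done
```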
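6. Fixing a small cluster bug and testing cluster functionality
Small bug fix: with --port=0 set, the controller-manager and scheduler do not open the insecure ports that kubectl get cs probes, so both are reported as Unhealthy even though they are running.
Edit /etc/kubernetes/manifests/kube-controller-manager.yaml and delete the line - --port=0
Edit /etc/kubernetes/manifests/kube-scheduler.yaml and delete the line - --port=0
Restart the kubelet service: systemctl restart kubelet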
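The two manifest edits can be done with sed instead of an editor. This is a minimal sketch assuming GNU sed and that the flag sits on its own line as a YAML list item; remove_port_zero is a hypothetical helper name.

```shell
# Delete the "- --port=0" argument line from a static Pod manifest ($1).
# Other arguments, such as --secure-port, are left untouched.
remove_port_zero() {
    sed -i '/^[[:space:]]*-[[:space:]]*--port=0[[:space:]]*$/d' "$1"
}

# Usage (as root on the master):
#   remove_port_zero /etc/kubernetes/manifests/kube-controller-manager.yaml
#   remove_port_zero /etc/kubernetes/manifests/kube-scheduler.yaml
#   systemctl restart kubelet
```

If you keep a backup of the original manifests, it is safer to store it outside /etc/kubernetes/manifests, because the kubelet treats files in that directory as static Pod manifests.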
Check the health of the cluster:
[root@master ~]# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
etcd-0 Healthy {"health":"true","reason":""}
scheduler Healthy ok
Check that the cluster's Pods are running and that the Service ClusterIPs look right:
[root@master ~]# kubectl get po,svc -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system pod/coredns-7f6cbbb7b8-dcnpf 1/1 Running 0 109m
kube-system pod/coredns-7f6cbbb7b8-hg5t8 1/1 Running 0 109m
kube-system pod/etcd-master 1/1 Running 0 109m
kube-system pod/kube-apiserver-master 1/1 Running 0 109m
kube-system pod/kube-controller-manager-master 1/1 Running 0 56s
kube-system pod/kube-flannel-ds-22k8b 1/1 Running 0 94m
kube-system pod/kube-flannel-ds-mgvsj 1/1 Running 0 94m
kube-system pod/kube-flannel-ds-v8ml5 1/1 Running 0 94m
kube-system pod/kube-proxy-hstwd 1/1 Running 0 107m
kube-system pod/kube-proxy-sqmfq 1/1 Running 0 107m
kube-system pod/kube-proxy-z2cmx 1/1 Running 0 109m
kube-system pod/kube-scheduler-master 1/1 Running 0 111s
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 109m
kube-system service/kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 109m
Create an nginx Pod and see whether it can be deployed normally:
[root@master ~]# kubectl create deploy nginx --image=nginx:1.20
deployment.apps/nginx created
[root@master ~]# kubectl get po
NAME READY STATUS RESTARTS AGE
nginx-7fb9867-ssqsr 0/1 ContainerCreating 0 9s
[root@master ~]# kubectl get po
NAME READY STATUS RESTARTS AGE
nginx-7fb9867-ssqsr 0/1 ContainerCreating 0 11s
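Summary:
Note that a Kubernetes cluster deployed this way is only suitable for testing, because etcd is a single instance rather than a highly available cluster, and the apiserver is not highly available either. A later post will present a highly available kubeadm-based cluster that can be used in production.
Appendix:
Deploying the cluster with kubeadm installed online: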
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
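The remaining steps are basically the same and need no changes. The online method is just slower, because the images have to be downloaded one by one. The only thing to change is the Kubernetes version in the cluster init command shown earlier. For example:
If yum installs Kubernetes 1.23.9: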
yum install -y kubeadm-1.23.9 kubelet-1.23.9 kubectl-1.23.9 conntrack-tools libseccomp \
libtool-ltdl device-mapper-persistent-data lvm2
then the init command must use that version:
kubeadm init \
--apiserver-advertise-address=192.168.217.19 \
--image-repository registry.aliyuncs.com/google_containers \
--kubernetes-version v1.23.9 \
--service-cidr=10.96.0.0/12 \
--pod-network-cidr=10.244.0.0/16