Kubernetes 1.17.3 Installation and Deployment

The goal of this post is to record the Kubernetes deployment process so that future deployments can be done more quickly.

Reference links: https://blog.csdn.net/subfate/article/details/103774072

        https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/

Most of the steps follow the first blog post.

1. Environment

  Two Ubuntu 16.04 64-bit machines, each with 2 GB of RAM and a dual-core CPU.
  Environment requirements and setup:
  Two hosts, one master and one node. The master's hostname is ubuntu and the node's hostname is node. Make sure the two hostnames are different.

  Note that k8s requires machines with at least two CPU cores.
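
  (You can quickly check the core count with the nproc command; this check is not part of the original steps.)

# nproc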

  All operations are performed as root (in theory a regular user would also work; running as root is simply the crude way to avoid permission issues).

2. Install Docker

# apt install docker.io

  Run the following command to create the /etc/docker/daemon.json file:

cat > /etc/docker/daemon.json <<-EOF
{
  "registry-mirrors": [
    "https://a8qh6yqv.mirror.aliyuncs.com",
    "http://hub-mirror.c.163.com"
  ],
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF

  registry-mirrors lists registry mirror (accelerator) addresses.
  native.cgroupdriver=systemd switches the cgroup driver to systemd (the driver k8s uses); the default is cgroupfs. It is changed on the Docker side because changing the k8s driver via kubeadm.conf did not work.

  Restart Docker and check the cgroup driver:

# systemctl restart docker 
# docker info | grep -i cgroup
Cgroup Driver: systemd

3. Deploy the k8s Master

  3.1 Disable swap

# swapoff -a
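
  swapoff -a only disables swap until the next reboot. To keep swap off permanently, one common approach (a sketch, assuming the swap entry is defined in /etc/fstab) is to comment it out:

# sed -i '/\sswap\s/ s/^/#/' /etc/fstab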

  3.2 Add a domestic (China) k8s apt source (the Aliyun mirror is used here)

# cat <<EOF > /etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF

  3.3 Update the package index

# apt-get update

  After adding the source, apt update will report errors because the corresponding signing keys are missing. Add each missing key in turn with commands like the following (assuming the error reports a missing key E084DAB9):

# gpg --keyserver keyserver.ubuntu.com --recv-keys E084DAB9
# gpg --export --armor E084DAB9 | apt-key add -
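
  Alternatively, the repository key can be added directly (a sketch, assuming curl is installed and the Aliyun mirror publishes its key at this path):

# curl -fsSL https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -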

  Install kubeadm, kubectl, kubelet, kubernetes-cni and related tools:

# apt-get install -y kubeadm kubectl kubelet kubernetes-cni

  Note: installing kubeadm automatically pulls in kubectl, kubelet and kubernetes-cni, so specifying kubeadm alone is also fine.

  3.4 Get the image versions required for deployment and pull them

# kubeadm config images list
W0330 10:07:27.234362   12944 version.go:101] could not fetch a Kubernetes version from the internet: unable to get URL "https://dl.k8s.io/release/stable-1.txt": Get https://dl.k8s.io/release/stable-1.txt: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
W0330 10:07:27.234404   12944 version.go:102] falling back to the local client version: v1.17.3
W0330 10:07:27.234491   12944 validation.go:28] Cannot validate kube-proxy config - no validator is available
W0330 10:07:27.234496   12944 validation.go:28] Cannot validate kubelet config - no validator is available
k8s.gcr.io/kube-apiserver:v1.17.3
k8s.gcr.io/kube-controller-manager:v1.17.3
k8s.gcr.io/kube-scheduler:v1.17.3
k8s.gcr.io/kube-proxy:v1.17.3
k8s.gcr.io/pause:3.1
k8s.gcr.io/etcd:3.4.3-0
k8s.gcr.io/coredns:1.6.5

  Strip the "k8s.gcr.io/" prefix and use the versions reported by kubeadm config images list. Pull each image from the Aliyun mirror, re-tag it as k8s.gcr.io, and remove the mirror tag:

# images=(
    kube-apiserver:v1.17.3
    kube-controller-manager:v1.17.3
    kube-scheduler:v1.17.3
    kube-proxy:v1.17.3
    pause:3.1
    etcd:3.4.3-0
    coredns:1.6.5
)
# for imageName in ${images[@]} ; do
    docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName
    docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName k8s.gcr.io/$imageName
    docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/$imageName
done
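
  To confirm the images are now available under the k8s.gcr.io name, list them (a quick check; the exact output depends on what has been pulled):

# docker images | grep k8s.gcr.io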

  3.5 Pull the flannel image

# docker pull quay.io/coreos/flannel:v0.11.0-amd64

  3.6 Initialize the cluster

# kubeadm init \
  --apiserver-advertise-address=192.168.2.190 \
  --image-repository registry.aliyuncs.com/google_containers \
  --kubernetes-version v1.17.3 \
  --pod-network-cidr=10.244.0.0/16

  Explanation:
    --pod-network-cidr specifies the pod network range, which the network plugin will use later (this post uses flannel, whose default is 10.244.0.0/16).
    --image-repository specifies the image registry; the default is k8s.gcr.io, and here the Aliyun mirror registry.aliyuncs.com/google_containers is used instead. (If omitted, the default is used.)

    --kubernetes-version can be set to a specific version number if a different version is installed. (If omitted, the default is used.)

    All other parameters are left at their defaults.

  Initialization using all defaults:

# kubeadm init --pod-network-cidr=10.244.0.0/16

  The initialization output is as follows:

W0330 09:34:29.486559   10623 validation.go:28] Cannot validate kube-proxy config - no validator is available
W0330 09:34:29.486587   10623 validation.go:28] Cannot validate kubelet config - no validator is available
[init] Using Kubernetes version: v1.17.3
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [dyan-desktop kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.2.190]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [dyan-desktop localhost] and IPs [192.168.2.190 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [dyan-desktop localhost] and IPs [192.168.2.190 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
W0330 09:34:33.826116   10623 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-scheduler"
W0330 09:34:33.827805   10623 manifests.go:214] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 26.526203 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.17" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node dyan-desktop as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node dyan-desktop as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: akhepb.ngrxfnvs04qvjfg6
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.2.190:6443 --token akhepb.ngrxfnvs04qvjfg6 \
    --discovery-token-ca-cert-hash sha256:86ee3f4483db21166b18eb733ff812c2305cbdd63037eb5ba6259824f1ba1d9d

  As the prompt suggests, copy the admin.conf file into the current user's corresponding directory. The admin.conf file will be needed later (it also has to be copied to the node):

$ mkdir -p $HOME/.kube
$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
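
  A quick way to confirm that kubectl can reach the cluster (the master will typically show NotReady until the network plugin is deployed):

$ kubectl get nodes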

  If you later forget the command for joining a node to the cluster, run kubeadm token create --print-join-command to regenerate it, for example:

# kubeadm token create --print-join-command
W0330 11:08:57.802478   15879 validation.go:28] Cannot validate kube-proxy config - no validator is available
W0330 11:08:57.802505   15879 validation.go:28] Cannot validate kubelet config - no validator is available
kubeadm join 192.168.2.190:6443 --token 1vtzxw.9ypai6p4y0jkp1oz     --discovery-token-ca-cert-hash sha256:86ee3f4483db21166b18eb733ff812c2305cbdd63037eb5ba6259824f1ba1d9d

  At this point the pod status is as follows:

# kubectl get pods -n kube-system
NAME                                   READY   STATUS    RESTARTS   AGE
coredns-6955765f44-bt9xj               0/1     Pending   0          38s
coredns-6955765f44-fpp2m               0/1     Pending   0          38s
etcd-dyan-desktop                      1/1     Running   0          51s
kube-apiserver-dyan-desktop            1/1     Running   0          51s
kube-controller-manager-dyan-desktop   1/1     Running   0          51s
kube-proxy-778f2                       1/1     Running   0          38s
kube-scheduler-dyan-desktop            1/1     Running   0          51s

  All pods are running except coredns, which stays Pending because no network plugin has been deployed yet. This post uses flannel.

  3.7 Deploy flannel

  Deploy flannel with the following command:

# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

  This deploys using the kube-flannel.yml file from the flannel repository; see that file for details.
  If the URL is not reachable, manually download https://github.com/coreos/flannel/blob/master/Documentation/kube-flannel.yml into the current directory and then run kubectl apply -f kube-flannel.yml.
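
  For example (a sketch, assuming wget is available and the raw file URL is reachable):

# wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# kubectl apply -f kube-flannel.yml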

  After deploying flannel, check the pods again; coredns now goes into CrashLoopBackOff:

# kubectl get pods --all-namespaces
kube-system   coredns-6955765f44-bt9xj               0/1     CrashLoopBackOff   39         4h57m
kube-system   coredns-6955765f44-fpp2m               0/1     CrashLoopBackOff   39         4h57m
kube-system   etcd-dyan-desktop                      1/1     Running            0          4h57m
kube-system   kube-apiserver-dyan-desktop            1/1     Running            0          4h57m
kube-system   kube-controller-manager-dyan-desktop   1/1     Running            0          4h57m
kube-system   kube-flannel-ds-amd64-v8frb            1/1     Running            0          177m
kube-system   kube-proxy-778f2                       1/1     Running            0          4h57m
kube-system   kube-scheduler-dyan-desktop            1/1     Running            0          4h57m

  Check the pod's log:

# kubectl logs coredns-6955765f44-bt9xj -n kube-system
.:53
[INFO] plugin/reload: Running configuration MD5 = 4e235fcc3696966e76816bcd9034ebc7
CoreDNS-1.6.5
linux/amd64, go1.13.4, c2fd1b2
[FATAL] plugin/loop: Loop (127.0.0.1:51523 -> :53) detected for zone ".", see https://coredns.io/plugins/loop#troubleshooting. Query: "HINFO 7486309701405418795.2886934462876536096."

  The cause is a problem with coredns's upstream DNS resolution: the loop plugin detects that queries it forwards come back to itself. Edit the coredns ConfigMap:

kubectl edit cm coredns -n kube-system

  The edit opens in vim by default; delete the line containing loop (with the dd command), then type :wq to save and exit.
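
  A non-interactive alternative (not from the original post, shown only as a sketch; it assumes the loop directive sits on a line of its own inside the Corefile):

# kubectl get cm coredns -n kube-system -o yaml | sed '/^\s*loop$/d' | kubectl apply -f -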

  The coredns ConfigMap looks like this:

# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        health {
           lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           fallthrough in-addr.arpa ip6.arpa
           ttl 30
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }   
kind: ConfigMap
metadata:
  creationTimestamp: "2020-03-30T01:35:01Z"
  name: coredns
  namespace: kube-system
  resourceVersion: "187"
  selfLink: /api/v1/namespaces/kube-system/configmaps/coredns
  uid: 9cfc6357-83b6-4233-8fb8-4961594f6a6b

  Delete all of the problematic coredns pods:

# kubectl delete pod coredns-6955765f44-bt9xj coredns-6955765f44-fpp2m -n kube-system
pod "coredns-6955765f44-bt9xj" deleted
pod "coredns-6955765f44-fpp2m" deleted

  After deletion, the coredns pods are automatically recreated. Check the pods again:

# kubectl get pods --all-namespaces
NAMESPACE     NAME                                   READY   STATUS    RESTARTS   AGE
kube-system   coredns-6955765f44-jlkft               1/1     Running   0          42s
kube-system   coredns-6955765f44-wkq5r               1/1     Running   0          42s
kube-system   etcd-dyan-desktop                      1/1     Running   0          5h54m
kube-system   kube-apiserver-dyan-desktop            1/1     Running   0          5h54m
kube-system   kube-controller-manager-dyan-desktop   1/1     Running   0          5h54m
kube-system   kube-flannel-ds-amd64-v8frb            1/1     Running   0          3h54m
kube-system   kube-proxy-778f2                       1/1     Running   0          5h54m
kube-system   kube-scheduler-dyan-desktop            1/1     Running   0          5h54m

  All pods are now running.
  Note: you can also modify the ConfigMap first and then deploy flannel.

  At this point, the master node has been deployed successfully.

 

4. Node

  4.1 Prerequisites

    Perform these steps on the node.
    1. Install kubeadm, as described earlier.
    2. Pull the flannel image, as described earlier (if it is not pulled in advance, it will be downloaded automatically when joining the cluster).
    3. Copy the master's /etc/kubernetes/admin.conf file to the node's /etc/kubernetes/ directory. (Note: just use scp from the master; if the directory does not exist on the node, create it yourself.)

$ scp $HOME/.kube/config kevin@192.168.2.69:/home/kevin/kube.config
kevin@192.168.2.69's password: 
config

    The node can then access the cluster through this config, for example:

# kubectl --kubeconfig /home/kevin/kube.config get pods --all-namespaces
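
    Alternatively, export KUBECONFIG so that plain kubectl commands work in that shell (an assumption about how the node's shell is set up):

# export KUBECONFIG=/home/kevin/kube.config
# kubectl get pods --all-namespaces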

  4.2 Join the cluster

    On the master, generate the join command and send it to the node machine:

# kubeadm token create --print-join-command > kube-join-command.token
W0330 15:47:11.326973   17834 validation.go:28] Cannot validate kube-proxy config - no validator is available
W0330 15:47:11.327001   17834 validation.go:28] Cannot validate kubelet config - no validator is available
# scp kube-join-command.token kevin@192.168.2.69:/home/kevin/
kevin@192.168.2.69's password: 
kube-join-command.token

    On the node, the k8s services are not yet running at this point. Run the following to execute the saved command and join the cluster:

# `cat kube-join-command.token`

    If the node previously joined another cluster, it must first be reset with kubeadm reset.
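
    For example, on the node (only needed if it was part of another cluster before), reset and then join:

# kubeadm reset
# `cat kube-join-command.token`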

5. Verification

  On the master, check the node list; the kevin node has joined the cluster:

# kubectl get node
NAME                            STATUS   ROLES    AGE     VERSION
dyan-desktop                    Ready    master   6h26m   v1.17.3
kevin-lenovo-tianyi-310-15ikb   Ready    <none>   3m14s   v1.17.3

  Run a simple pod test with a busybox image. On the master, execute:

# kubectl run -i --tty busybox --image=latelee/busybox --restart=Never -- sh

  After a short wait, you get a busybox shell.

  In another terminal on the master, check the pod status:

# kubectl get pod -o wide
NAME      READY   STATUS    RESTARTS   AGE   IP           NODE                            NOMINATED NODE   READINESS GATES
busybox   1/1     Running   0          91s   10.244.2.2   kevin-lenovo-tianyi-310-15ikb   <none>           <none>

  The pod is in the Running state and is scheduled on the node.
  Check on the node:

# docker ps | grep busybox
8cb2f2a1a24d        latelee/busybox        "sh"                     5 minutes ago       Up 5 minutes                            k8s_busybox_busybox_default_aa01c6a5-c12d-4f68-b014-ef338efcb974_0
c9618eaa84d6        k8s.gcr.io/pause:3.1   "/pause"                 7 minutes ago       Up 7 minutes                            k8s_POD_busybox_default_aa01c6a5-c12d-4f68-b014-ef338efcb974_0

  If you now exit busybox on the master, the pod still exists but is no longer READY, and the busybox container is no longer running on the node.
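
  The finished pod can then be removed (an extra cleanup step, not in the original walkthrough):

# kubectl delete pod busybox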

  Verification passed: the k8s deployment is working.

6. Miscellaneous

  6.1 Reset k8s

# kubeadm reset

    Run the following commands to clean up directories and remove the network devices:

rm -rf $HOME/.kube/config
rm -rf /var/lib/cni/
rm -rf /var/lib/kubelet/*
rm -rf /etc/kubernetes/
rm -rf /etc/cni/
ifconfig cni0 down
ifconfig flannel.1 down
ip link delete cni0
ip link delete flannel.1

  6.2 Remove a node from the cluster

    On the master:
    1. Drain the node:

# kubectl drain kevin-lenovo-tianyi-310-15ikb
node/kevin-lenovo-tianyi-310-15ikb cordoned
evicting pod "busybox"
pod/busybox evicted
node/kevin-lenovo-tianyi-310-15ikb evicted

    Check the nodes again:

# kubectl get node
NAME                            STATUS                     ROLES    AGE     VERSION
dyan-desktop                    Ready                      master   6h52m   v1.17.3
kevin-lenovo-tianyi-310-15ikb   Ready,SchedulingDisabled   <none>   29m     v1.17.3

    The kevin node is now unschedulable but still shows Ready (since it was already in that state). Think of it as the node being disabled for scheduling. Use uncordon to make the node schedulable again:

# kubectl uncordon kevin-lenovo-tianyi-310-15ikb
node/kevin-lenovo-tianyi-310-15ikb uncordoned
# kubectl get node
NAME                            STATUS   ROLES    AGE   VERSION
dyan-desktop                    Ready    master   25h   v1.17.3
kevin-lenovo-tianyi-310-15ikb   Ready    <none>   19h   v1.17.3

    2. Delete the node:

# kubectl delete node kevin-lenovo-tianyi-310-15ikb
node "kevin-lenovo-tianyi-310-15ikb" deleted   

    Checking again, the node is gone.

    At this point, flannel and kube-proxy are no longer running on the node:

# ps aux | grep kube          
root       3269  1.6  4.3 754668 88712 ?        Ssl  Dec20  18:54 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=cgroupfs --network-plugin=cni --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.1
root     124216  0.0  0.0  14228   964 pts/0    R+   00:49   0:00 grep --color=auto kube

    On the node, run:

# kubeadm reset

    Run the following commands to clean up directories and remove the network devices (similar to the master cleanup, but not identical):

ifconfig cni0 down
ip link delete cni0
ifconfig flannel.1 down
ip link delete flannel.1
rm /var/lib/cni/ -rf
rm /etc/kubernetes/ -rf
rm /var/lib/kubelet/ -rf  

  


Reposted from www.cnblogs.com/dyan1024/p/12604665.html