Installing Kubernetes from binaries

Based mainly on https://github.com/opsnull/follow-me-install-kubernetes-cluster, using Flannel and Docker.

System information

Role OS CPU cores Memory Hostname IP Installed components
master 18.04.1-Ubuntu 4 8G master 192.168.0.107 kubectl,kube-apiserver,kube-controller-manager,kube-scheduler,etcd,flanneld
slave 18.04.1-Ubuntu 4 4G slave 192.168.0.114 docker,flanneld,kubelet,kube-proxy,coredns

k8s & docker versions

Software Version
k8s 1.17.2
etcd v3.3.18
coredns 1.6.6 (Docker image)
Flannel v0.11.0
docker 18.09

Pre-installation preparation (run on both the master and the slave node)

  1. Disable swap

    sudo swapoff -a 
    
  2. Configure the package sources
    Add a system.list file under /etc/apt/sources.list.d/ with the following content

    deb http://mirrors.aliyun.com/ubuntu/ bionic main restricted  
    deb http://mirrors.aliyun.com/ubuntu/ bionic-updates main restricted  
    deb http://mirrors.aliyun.com/ubuntu/ bionic universe  
    deb http://mirrors.aliyun.com/ubuntu/ bionic-updates universe  
    deb http://mirrors.aliyun.com/ubuntu/ bionic multiverse  
    deb http://mirrors.aliyun.com/ubuntu/ bionic-updates multiverse  
    deb http://mirrors.aliyun.com/ubuntu/ bionic-backports main restricted universe multiverse
    

    Then run

    sudo apt-get update
    
  3. Create the working directories

    mkdir -p /opt/k8s/{bin,work} /etc/{kubernetes,etcd}/cert
    
  4. Append /opt/k8s/bin to $PATH

    echo 'PATH=/opt/k8s/bin:$PATH' >>/root/.bashrc
    source /root/.bashrc
    
  5. Install the SSH service and allow root login

    apt install openssh-server
    
    # Edit /etc/ssh/sshd_config, add PermitRootLogin yes below the line #PermitRootLogin prohibit-password, then restart the ssh service
    
    systemctl restart ssh.service
    
  6. Install dependency packages

    apt install -y ipvsadm ipset curl jq
    
  7. Set the host names

    cat >> /etc/hosts <<EOF
    192.168.0.107 master
    192.168.0.114 slave
    EOF
    
  8. Set up SSH trust to the slave node (only needs to run on the master node)

    ssh-keygen -t rsa 
    ssh-copy-id [email protected]
    

Create the CA root certificate and key (on the master node)

  1. Install the cfssl toolset

    cd /opt/k8s/work
    
    wget https://github.com/cloudflare/cfssl/releases/download/v1.4.1/cfssl_1.4.1_linux_amd64
    cp cfssl_1.4.1_linux_amd64 /opt/k8s/bin/cfssl
    
    wget https://github.com/cloudflare/cfssl/releases/download/v1.4.1/cfssljson_1.4.1_linux_amd64
    cp cfssljson_1.4.1_linux_amd64 /opt/k8s/bin/cfssljson
    
    wget https://github.com/cloudflare/cfssl/releases/download/v1.4.1/cfssl-certinfo_1.4.1_linux_amd64
    cp cfssl-certinfo_1.4.1_linux_amd64 /opt/k8s/bin/cfssl-certinfo
    
    chmod +x /opt/k8s/bin/*
    
    
  2. Create the CA config file

    cd /opt/k8s/work
    cat > ca-config.json <<EOF
    {
      "signing": {
        "default": {
          "expiry": "87600h"
        },
        "profiles": {
          "kubernetes": {
            "usages": [
                "signing",
                "key encipherment",
                "server auth",
                "client auth"
            ],
            "expiry": "876000h"
          }
        }
      }
    }
    EOF
    • signing: the certificate can be used to sign other certificates (the generated ca.pem has CA=TRUE);
    • server auth: clients can use this CA to verify certificates presented by servers;
    • client auth: servers can use this CA to verify certificates presented by clients;
    • expiry: "87600h": certificate validity of 10 years;
  3. Create the certificate signing request (CSR) file

    cd /opt/k8s/work
    cat > ca-csr.json <<EOF
    {
      "CN": "kubernetes",
      "key": {
        "algo": "rsa",
        "size": 2048
      },
      "names": [
        {
          "C": "CN",
          "ST": "NanJing",
          "L": "NanJing",
          "O": "k8s",
          "OU": "system"
        }
      ],
      "ca": {
        "expiry": "87600h"
     }
    }
    EOF
  4. Generate the certificate

    cd /opt/k8s/work
    cfssl gencert -initca ca-csr.json | cfssljson -bare ca
    ls ca*
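
    Optionally, the generated CA certificate can be inspected with cfssl-certinfo (installed above) to confirm its subject and expiry; this is just a sanity check:

    cfssl-certinfo -cert ca.pem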
  5. Install the certificate

    cd /opt/k8s/work
    
    cp ca*.pem ca-config.json /etc/kubernetes/cert
    
    # Distribute to the slave node
    export node_ip=192.168.0.114
    scp ca*.pem ca-config.json root@${node_ip}:/etc/kubernetes/cert/
    

Deploy etcd (on the master node)

  1. Download etcd

    cd /opt/k8s/work
    wget https://github.com/etcd-io/etcd/releases/download/v3.3.18/etcd-v3.3.18-linux-amd64.tar.gz
    tar -xvf etcd-v3.3.18-linux-amd64.tar.gz
  2. Install etcd

    cd /opt/k8s/work
    
    cp etcd-v3.3.18-linux-amd64/etcd* /opt/k8s/bin/
    chmod +x /opt/k8s/bin/*
    
  3. Create the etcd certificate and private key
    1. Create the certificate signing request file

      
      cd /opt/k8s/work
      cat > etcd-csr.json <<EOF
      {
        "CN": "etcd",
        "hosts": [
          "127.0.0.1",
          "192.168.0.107"
        ],
        "key": {
          "algo": "rsa",
          "size": 2048
        },
        "names": [
          {
            "C": "CN",
            "ST": "NanJing",
            "L": "NanJing",
            "O": "k8s",
            "OU": "system"
          }
        ]
      }
      EOF
      
      • hosts: the list of etcd node IPs authorized to use this certificate
    2. Generate the certificate and private key

      cd /opt/k8s/work
      cfssl gencert -ca=/opt/k8s/work/ca.pem \
          -ca-key=/opt/k8s/work/ca-key.pem \
          -config=/opt/k8s/work/ca-config.json \
          -profile=kubernetes etcd-csr.json | cfssljson -bare etcd
      ls etcd*pem
      
    3. Install the certificate

      cd /opt/k8s/work
      cp etcd*.pem /etc/etcd/cert/
  4. Create the etcd systemd unit file

    cat> /etc/systemd/system/etcd.service<< EOF
    [Unit]
    Description=Etcd Server
    After=network.target
    After=network-online.target
    Wants=network-online.target
    Documentation=https://github.com/coreos
    
    [Service]
    Type=notify
    WorkingDirectory=/data/k8s/etcd/data
    ExecStart=/opt/k8s/bin/etcd \\
      --data-dir=/etc/etcd/cfg/etcd \\
      --name=etcd-chengf \\
      --cert-file=/etc/etcd/cert/etcd.pem \\
      --key-file=/etc/etcd/cert/etcd-key.pem \\
      --trusted-ca-file=/etc/kubernetes/cert/ca.pem \\
      --peer-cert-file=/etc/etcd/cert/etcd.pem \\
      --peer-key-file=/etc/etcd/cert/etcd-key.pem \\
      --peer-trusted-ca-file=/etc/kubernetes/cert/ca.pem \\
      --peer-client-cert-auth \\
      --client-cert-auth \\
      --listen-peer-urls=https://192.168.0.107:2380 \\
      --initial-advertise-peer-urls=https://192.168.0.107:2380 \\
      --listen-client-urls=https://192.168.0.107:2379,http://127.0.0.1:2379 \\
      --advertise-client-urls=https://192.168.0.107:2379 \\
      --initial-cluster-token=etcd-cluster-0\\
      --initial-cluster=etcd-chengf=https://192.168.0.107:2380 \\
      --initial-cluster-state=new \\
      --auto-compaction-mode=periodic \\
      --auto-compaction-retention=1 \\
      --max-request-bytes=33554432 \\
      --quota-backend-bytes=6442450944 \\
      --heartbeat-interval=250 \\
      --election-timeout=2000
    Restart=on-failure
    RestartSec=5
    LimitNOFILE=65536
    
    [Install]
    WantedBy=multi-user.target
    EOF
    
    • WorkingDirectory, --data-dir: the working directory and data directory; they must be created before starting the service;
    • --name: the node name; when --initial-cluster-state is new, the value of --name must appear in the --initial-cluster list;
    • --cert-file, --key-file: certificate and private key used by the etcd server when talking to clients;
    • --trusted-ca-file: the CA certificate that signed the client certificates, used to verify client certificates;
    • --peer-cert-file, --peer-key-file: certificate and private key used by etcd for peer communication;
    • --peer-trusted-ca-file: the CA certificate that signed the peer certificates, used to verify peer certificates;
  5. Create the etcd data directory

    mkdir -p /data/k8s/etcd/data
  6. Start the etcd service

    systemctl enable etcd && systemctl start etcd
    
  7. Check the startup result

    systemctl status etcd|grep Active
    • Make sure the status is active (running); otherwise check the logs to find the cause
    • If anything looks wrong, check the logs with

      journalctl -u etcd
  8. Verify the service status

    export ETCD_ENDPOINTS=https://192.168.0.107:2379
    
    etcdctl \
    --endpoints=${ETCD_ENDPOINTS} \
    --ca-file=/etc/kubernetes/cert/ca.pem \
    --cert-file=/etc/etcd/cert/etcd.pem \
    --key-file=/etc/etcd/cert/etcd-key.pem cluster-health
    etcdctl \
    --endpoints=${ETCD_ENDPOINTS} \
    --ca-file=/etc/kubernetes/cert/ca.pem \
    --cert-file=/etc/etcd/cert/etcd.pem \
    --key-file=/etc/etcd/cert/etcd-key.pem member list

    Output

    root@master:/opt/k8s/work# etcdctl     --endpoints=${ETCD_ENDPOINTS}     --ca-file=/etc/kubernetes/cert/ca.pem     --cert-file=/etc/etcd/cert/etcd.pem     --key-file=/etc/etcd/cert/etcd-key.pem cluster-health
    member c0d3b56a9878e38f is healthy: got healthy result from https://192.168.0.107:2379
    cluster is healthy
    root@master:/opt/k8s/work# etcdctl     --endpoints=${ETCD_ENDPOINTS}     --ca-file=/etc/kubernetes/cert/ca.pem     --cert-file=/etc/etcd/cert/etcd.pem     --key-file=/etc/etcd/cert/etcd-key.pem member list
    c0d3b56a9878e38f: name=etcd-chengf peerURLs=https://192.168.0.107:2380 clientURLs=https://192.168.0.107:2379 isLeader=true
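
    The same check can also be done through the v3 API; a sketch (note that the v3 client uses different flag names: --cacert/--cert/--key):

    export ETCD_ENDPOINTS=https://192.168.0.107:2379
    ETCDCTL_API=3 etcdctl \
      --endpoints=${ETCD_ENDPOINTS} \
      --cacert=/etc/kubernetes/cert/ca.pem \
      --cert=/etc/etcd/cert/etcd.pem \
      --key=/etc/etcd/cert/etcd-key.pem endpoint health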

Deploy the flannel network (on the master node)

The kubelet service depends on docker, and docker's network needs flannel to configure the IP address of the docker0 bridge, so the flannel network component must be installed first.

flannel uses vxlan to create a Pod network that spans all nodes, using UDP port 8472 (this port must be opened, e.g. on public clouds such as AWS).

The first time flanneld starts, it reads the configured Pod network from etcd, allocates an unused subnet for the local node, and then creates the flannel.1 network interface (the name may differ, e.g. flannel1).

flannel writes the Pod subnet allocated to the node into the /run/flannel/docker file; docker later uses the environment variables in this file to configure the docker0 bridge, so that all Pod containers on the node get their IPs from this subnet.

  1. Download and install the flanneld binaries

    
    cd /opt/k8s/work
    mkdir flannel
    wget https://github.com/coreos/flannel/releases/download/v0.11.0/flannel-v0.11.0-linux-amd64.tar.gz
    tar -xzvf flannel-v0.11.0-linux-amd64.tar.gz -C flannel
    
    cp flannel/{flanneld,mk-docker-opts.sh} /opt/k8s/bin/
    
    export node_ip=192.168.0.114
    scp flannel/{flanneld,mk-docker-opts.sh} root@${node_ip}:/opt/k8s/bin/
  2. Create the flanneld certificate and private key

    flanneld reads and writes subnet allocation information in the etcd cluster, and etcd has mutual x509 certificate authentication enabled, so flanneld needs its own certificate and private key.

    1. Create the certificate signing request

      cd /opt/k8s/work
      cat > flanneld-csr.json <<EOF
      {
        "CN": "flanneld",
        "hosts": [],
        "key": {
          "algo": "rsa",
          "size": 2048
        },
        "names": [
          {
            "C": "CN",
            "ST": "NanJing",
            "L": "NanJing",
            "O": "k8s",
            "OU": "system"
          }
        ]
      }
      EOF
      
    2. Generate the certificate and private key

      cfssl gencert -ca=/opt/k8s/work/ca.pem \
        -ca-key=/opt/k8s/work/ca-key.pem \
        -config=/opt/k8s/work/ca-config.json \
        -profile=kubernetes flanneld-csr.json | cfssljson -bare flanneld
      ls flanneld*pem
    3. Distribute the generated certificate and private key to all nodes

      cd /opt/k8s/work
      mkdir -p /etc/flanneld/cert
      cp flanneld*.pem /etc/flanneld/cert
      
      export node_ip=192.168.0.114
      ssh root@${node_ip} "mkdir -p /etc/flanneld/cert"
      scp flanneld*.pem root@${node_ip}:/etc/flanneld/cert
      
  3. Write the cluster Pod network configuration into etcd

    cd /opt/k8s/work
    
    export FLANNEL_ETCD_PREFIX="/kubernetes/network"
    export ETCD_ENDPOINTS="https://192.168.0.107:2379"
    
    etcdctl \
      --endpoints=${ETCD_ENDPOINTS} \
      --ca-file=/opt/k8s/work/ca.pem \
      --cert-file=/opt/k8s/work/flanneld.pem \
      --key-file=/opt/k8s/work/flanneld-key.pem \
      mk ${FLANNEL_ETCD_PREFIX}/config '{"Network":"172.30.0.0/16", "SubnetLen": 24, "Backend": {"Type": "vxlan"}}'
    
    • The prefix length of the Network written here (e.g. /16) must be smaller than the SubnetLen value (e.g. 24)
  4. Create the flanneld systemd unit file

    
    cd /opt/k8s/work
    export FLANNEL_ETCD_PREFIX="/kubernetes/network"
    export ETCD_ENDPOINTS="https://192.168.0.107:2379"
    
    cat > flanneld.service << EOF
    [Unit]
    Description=Flanneld overlay address etcd agent
    After=network.target
    After=network-online.target
    Wants=network-online.target
    After=etcd.service
    Before=docker.service
    
    [Service]
    Type=notify
    ExecStart=/opt/k8s/bin/flanneld \\
      -etcd-cafile=/etc/kubernetes/cert/ca.pem \\
      -etcd-certfile=/etc/flanneld/cert/flanneld.pem \\
      -etcd-keyfile=/etc/flanneld/cert/flanneld-key.pem \\
      -etcd-endpoints=${ETCD_ENDPOINTS} \\
      -etcd-prefix=${FLANNEL_ETCD_PREFIX} \\
      -ip-masq
    ExecStartPost=/opt/k8s/bin/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker
    Restart=always
    RestartSec=5
    StartLimitInterval=0
    
    [Install]
    WantedBy=multi-user.target
    RequiredBy=docker.service
    EOF
    
    • The mk-docker-opts.sh script writes the Pod subnet allocated to flanneld into the file given by -d (/run/flannel/docker); docker later reads the environment variables in this file to configure the docker0 bridge. The -k parameter controls the name of the variable in the generated file, and docker uses that variable when it starts (see the docker section below);
    • flanneld uses the interface of the system default route to talk to other nodes; on nodes with multiple interfaces (e.g. internal and public), the -iface parameter can select the interface to use;
    • -ip-masq: flanneld sets up SNAT rules for traffic leaving the Pod network and sets the --ip-masq variable passed to Docker (in /run/flannel/docker) to false, so Docker no longer creates its own SNAT rules. Docker's SNAT rule (with --ip-masq=true) is rather blunt: it SNATs every request from local Pods to any non-docker0 interface, so requests to Pods on other nodes appear to come from the flannel.1 interface IP and the destination Pod cannot see the real source Pod IP. flanneld's SNAT rule is gentler: it only SNATs traffic that leaves the Pod network (see the check sketched below)
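
    Once flanneld is running (step 6 below), the resulting SNAT rules can be inspected; a verification sketch (the exact rule text varies by flannel and docker version):

      iptables -t nat -S POSTROUTING | grep MASQUERADE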
  5. Distribute the flanneld service file

    cd /opt/k8s/work
    
    cp flanneld.service /etc/systemd/system/
    
    export node_ip=192.168.0.114
    scp flanneld.service root@${node_ip}:/etc/systemd/system/
    
  6. Start the flanneld service

    systemctl daemon-reload && systemctl enable flanneld && systemctl restart flanneld
    
    export node_ip=192.168.0.114
    ssh root@${node_ip} "systemctl daemon-reload && systemctl enable flanneld && systemctl restart flanneld"
    
  7. Check the startup result

    systemctl status flanneld|grep Active
    
    export node_ip=192.168.0.114
    ssh root@${node_ip} "systemctl status flanneld|grep Active"
    • Make sure the status is active (running); otherwise check the logs to find the cause
    • If anything looks wrong, check the logs with

      journalctl -u flanneld
  8. Check the Pod network configuration written to etcd

    export FLANNEL_ETCD_PREFIX="/kubernetes/network"
    export ETCD_ENDPOINTS="https://192.168.0.107:2379"
    
    
    etcdctl \
      --endpoints=${ETCD_ENDPOINTS} \
      --ca-file=/etc/kubernetes/cert/ca.pem \
      --cert-file=/etc/flanneld/cert/flanneld.pem \
      --key-file=/etc/flanneld/cert/flanneld-key.pem \
      get ${FLANNEL_ETCD_PREFIX}/config

    Output

    {"Network":"172.30.0.0/16", "SubnetLen": 24, "Backend": {"Type": "vxlan"}}
  9. List the allocated Pod subnets

    export FLANNEL_ETCD_PREFIX="/kubernetes/network"
    export ETCD_ENDPOINTS="https://192.168.0.107:2379"
    
    etcdctl \
      --endpoints=${ETCD_ENDPOINTS} \
      --ca-file=/etc/kubernetes/cert/ca.pem \
      --cert-file=/etc/flanneld/cert/flanneld.pem \
      --key-file=/etc/flanneld/cert/flanneld-key.pem \
      ls ${FLANNEL_ETCD_PREFIX}/subnets

    Output

    /kubernetes/network/subnets/172.30.22.0-24
    /kubernetes/network/subnets/172.30.78.0-24
  10. Check the flannel network interface on the node

    root@master:/opt/k8s/work# ip addr show
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host
           valid_lft forever preferred_lft forever
    2: enp2s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN group default qlen 1000
        link/ether 04:92:26:13:92:2b brd ff:ff:ff:ff:ff:ff
    3: wlp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
        link/ether d0:c5:d3:57:73:01 brd ff:ff:ff:ff:ff:ff
        inet 192.168.0.107/24 brd 192.168.0.255 scope global dynamic noprefixroute wlp3s0
           valid_lft 6385sec preferred_lft 6385sec
        inet6 fe80::1fda:e90a:207a:67e4/64 scope link noprefixroute
           valid_lft forever preferred_lft forever
    4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
        link/ether 12:cb:66:43:de:36 brd ff:ff:ff:ff:ff:ff
        inet 172.30.22.0/32 scope global flannel.1
           valid_lft forever preferred_lft forever
        inet6 fe80::10cb:66ff:fe43:de36/64 scope link
           valid_lft forever preferred_lft forever
    
    root@master:/opt/k8s/work# ip route show |grep flannel.1
    172.30.78.0/24 via 172.30.78.0 dev flannel.1 onlink 
    
  11. Verify that nodes can reach each other over the Pod network

    root@master:/opt/k8s/work# ip addr show flannel.1 |grep -w inet
        inet 172.30.22.0/32 scope global flannel.1
    root@master:/opt/k8s/work# ssh 192.168.0.114 "/sbin/ip addr show flannel.1|grep -w inet"
        inet 172.30.78.0/32 scope global flannel.1
    root@master:/opt/k8s/work# ping -c 1 172.30.78.0
    PING 172.30.78.0 (172.30.78.0) 56(84) bytes of data.
    64 bytes from 172.30.78.0: icmp_seq=1 ttl=64 time=80.7 ms
    
    --- 172.30.78.0 ping statistics ---
    1 packets transmitted, 1 received, 0% packet loss, time 0ms
    rtt min/avg/max/mdev = 80.707/80.707/80.707/0.000 ms
    root@master:/opt/k8s/work# ssh 192.168.0.114 "ping -c 1 172.30.22.0"
    PING 172.30.22.0 (172.30.22.0) 56(84) bytes of data.
    64 bytes from 172.30.22.0: icmp_seq=1 ttl=64 time=4.09 ms
    
    --- 172.30.22.0 ping statistics ---
    1 packets transmitted, 1 received, 0% packet loss, time 0ms
    rtt min/avg/max/mdev = 4.094/4.094/4.094/0.000 ms
    
    
  12. Generated files

    root@master:/opt/k8s/work# cat /run/flannel/subnet.env
    FLANNEL_NETWORK=172.30.0.0/16
    FLANNEL_SUBNET=172.30.22.1/24
    FLANNEL_MTU=1450
    FLANNEL_IPMASQ=true
    root@master:/opt/k8s/work# cat /run/flannel/docker
    DOCKER_OPT_BIP="--bip=172.30.22.1/24"
    DOCKER_OPT_IPMASQ="--ip-masq=false"
    DOCKER_OPT_MTU="--mtu=1450"
    DOCKER_NETWORK_OPTIONS=" --bip=172.30.22.1/24 --ip-masq=false --mtu=1450"

Deploy the docker service (on the master node)

  1. Download the docker binaries

    cd /opt/k8s/work
    wget https://download.docker.com/linux/static/stable/x86_64/docker-18.09.6.tgz
    tar -xvf docker-18.09.6.tgz
  2. Distribute the binaries to all worker nodes

    cd /opt/k8s/work
    export node_ip=192.168.0.114
    scp docker/*  root@${node_ip}:/opt/k8s/bin/
    ssh root@${node_ip} "chmod +x /opt/k8s/bin/*"
  3. Create the docker systemd unit file

    cd /opt/k8s/work
    cat > docker.service <<"EOF"
    [Unit]
    Description=Docker Application Container Engine
    Documentation=http://docs.docker.io
    
    [Service]
    WorkingDirectory=/data/k8s/docker
    Environment="PATH=/opt/k8s/bin:/bin:/sbin:/usr/bin:/usr/sbin"
    EnvironmentFile=-/run/flannel/docker
    ExecStart=/opt/k8s/bin/dockerd $DOCKER_NETWORK_OPTIONS
    ExecReload=/bin/kill -s HUP $MAINPID
    Restart=on-failure
    RestartSec=5
    LimitNOFILE=infinity
    LimitNPROC=infinity
    LimitCORE=infinity
    Delegate=yes
    KillMode=process
    
    [Install]
    WantedBy=multi-user.target
    EOF
    • The EOF marker is quoted so that bash does not expand variables inside the here-document, such as $DOCKER_NETWORK_OPTIONS (these variables are substituted by systemd);

    • dockerd calls other docker commands at runtime, such as docker-proxy, so the directory containing the docker binaries must be on PATH;

    • When flanneld starts it writes the network configuration to /run/flannel/docker; dockerd reads the DOCKER_NETWORK_OPTIONS environment variable from that file before starting and uses it to set the docker0 bridge subnet;

    • Since version 1.13, docker may set the default policy of the iptables FORWARD chain to DROP, which breaks pinging Pod IPs on other nodes; if that happens, set the policy to ACCEPT manually:

      export node_ip=192.168.0.114
      ssh root@${node_ip}  "/sbin/iptables -P FORWARD ACCEPT"
  4. Distribute the docker.service file to all worker machines:

    cd /opt/k8s/work
    export node_ip=192.168.0.114
    scp docker.service root@${node_ip}:/etc/systemd/system/
  5. Create and distribute the docker configuration file

    Use registry mirrors inside China to speed up image pulls and raise the download concurrency (dockerd must be restarted for this to take effect):

    cd /opt/k8s/work
    cat > docker-daemon.json <<EOF
    {
        "registry-mirrors": ["https://docker.mirrors.ustc.edu.cn","https://hub-mirror.c.163.com"],
        "max-concurrent-downloads": 20,
        "live-restore": true,
        "max-concurrent-uploads": 10,
        "data-root": "/data/k8s/docker/data",
        "log-opts": {
          "max-size": "100m",
          "max-file": "5"
        }
    }
    EOF
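
    Since jq was installed during the preparation step, the JSON can be sanity-checked before distributing it (optional):

    jq . docker-daemon.json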
    
  6. Distribute the docker configuration file to all worker nodes:

    cd /opt/k8s/work
    
    export node_ip=192.168.0.114
    ssh root@${node_ip} "mkdir -p  /etc/docker/ /data/k8s/docker/data"
    scp docker-daemon.json root@${node_ip}:/etc/docker/daemon.json
  7. Start the docker service

    export node_ip=192.168.0.114
    ssh root@${node_ip} "systemctl daemon-reload && systemctl enable docker && systemctl restart docker"
  8. Check the service status

    export node_ip=192.168.0.114
    ssh root@${node_ip} "systemctl status docker|grep Active"
    • Make sure the status is active (running); otherwise check the logs to find the cause
    • If anything looks wrong, check the logs with

      journalctl -u docker
  9. Check the docker0 bridge

    export node_ip=192.168.0.114
    ssh root@${node_ip} "/sbin/ip addr show flannel.1 && /sbin/ip addr show docker0"
    • Confirm that on each worker node the docker0 bridge and the flannel.1 interface have IPs in the same subnet

      Output

      export node_ip=192.168.0.114
      root@master:/opt/k8s/work# ssh root@${node_ip} "/sbin/ip addr show flannel.1 && /sbin/ip addr show docker0"
      4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
          link/ether f2:fc:0f:7e:98:e4 brd ff:ff:ff:ff:ff:ff
          inet 172.30.78.0/32 scope global flannel.1
             valid_lft forever preferred_lft forever
          inet6 fe80::f0fc:fff:fe7e:98e4/64 scope link
             valid_lft forever preferred_lft forever
      5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
          link/ether 02:42:fd:1f:8f:d8 brd ff:ff:ff:ff:ff:ff
          inet 172.30.78.1/24 brd 172.30.78.255 scope global docker0
             valid_lft forever preferred_lft forever
      
    • Note: if the services were installed in the wrong order or the machine has a more complex history (docker installed before flanneld), the docker0 bridge and the flannel.1 interface on a worker node may end up in different subnets. In that case stop the docker service, delete the docker0 interface by hand, and restart docker to fix it

      systemctl stop docker
      ip link delete docker0
      systemctl start docker
  10. Check docker status information

    root@slave:/opt/k8s/work# docker info
    Containers: 0
     Running: 0
     Paused: 0
     Stopped: 0
    Images: 0
    Server Version: 18.09.6
    Storage Driver: overlay2
     Backing Filesystem: extfs
     Supports d_type: true
     Native Overlay Diff: true
    Logging Driver: json-file
    Cgroup Driver: cgroupfs
    Plugins:
     Volume: local
     Network: bridge host macvlan null overlay
     Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
    Swarm: inactive
    Runtimes: runc
    Default Runtime: runc
    Init Binary: docker-init
    containerd version: bb71b10fd8f58240ca47fbb579b9d1028eea7c84
    runc version: 2b18fe1d885ee5083ef9f0838fee39b62d653e30
    init version: fec3683
    Security Options:
     apparmor
     seccomp
      Profile: default
    Kernel Version: 5.0.0-23-generic
    Operating System: Ubuntu 18.04.3 LTS
    OSType: linux
    Architecture: x86_64
    CPUs: 4
    Total Memory: 3.741GiB
    Name: slave
    ID: IDMG:7A6F:UNTP:IWVM:ZBK5:VHJ4:STC5:UXZX:HQT6:UUNE:YDOC:I27L
    Docker Root Dir: /data/k8s/docker/data
    Debug Mode (client): false
    Debug Mode (server): false
    Registry: https://index.docker.io/v1/
    Labels:
    Experimental: false
    Insecure Registries:
     127.0.0.0/8
    Registry Mirrors:
     https://docker.mirrors.ustc.edu.cn/
     https://hub-mirror.c.163.com/
    Live Restore Enabled: true
    Product License: Community Engine
    
    WARNING: No swap limit support

Deploy the master node (on the master node)

  1. Download the latest release binaries

    cd /opt/k8s/work
    
    wget https://dl.k8s.io/v1.17.2/kubernetes-server-linux-amd64.tar.gz # currently not directly downloadable from mainland China; a proxy is required
    tar -xzvf kubernetes-server-linux-amd64.tar.gz
    
  2. Install the k8s binaries

    cd /opt/k8s/work
    cp kubernetes/server/bin/{apiextensions-apiserver,kubeadm,kube-apiserver,kube-controller-manager,kubectl,kubelet,kube-proxy,kube-scheduler,mounter} /opt/k8s/bin/
    
    # Distribute kubelet and kube-proxy to the worker node
    export node_ip=192.168.0.114
    scp kubernetes/server/bin/{kubelet,kube-proxy} root@${node_ip}:/opt/k8s/bin/

Configure kubectl

kubectl talks to kube-apiserver securely over https; kube-apiserver authenticates and authorizes the certificate carried in kubectl's requests.

kubectl will be used for cluster administration, so an admin certificate with the highest privileges is created here.

  1. Create the admin certificate and private key
    1. Create the certificate signing request file

      
      cd /opt/k8s/work
      cat > admin-csr.json <<EOF
      {
        "CN": "admin",
        "hosts": [],
        "key": {
          "algo": "rsa",
          "size": 2048
        },
        "names": [
          {
            "C": "CN",
            "ST": "NanJing",
            "L": "NanJing",
            "O": "system:masters",
            "OU": "system"
          }
        ]
      }
      EOF
      
      • O: system:masters: when kube-apiserver receives a request using this certificate, it adds the group (Group) system:masters to the request's identity;
      • the predefined ClusterRoleBinding cluster-admin binds Group system:masters to the ClusterRole cluster-admin, which grants the highest privileges needed to operate the cluster;
      • the certificate is only used by kubectl as a client certificate, so the hosts field is empty;
    2. Generate the certificate and private key

      cd /opt/k8s/work
      cfssl gencert -ca=/opt/k8s/work/ca.pem \
        -ca-key=/opt/k8s/work/ca-key.pem \
        -config=/opt/k8s/work/ca-config.json \
        -profile=kubernetes admin-csr.json | cfssljson -bare admin
      ls admin*
    3. Install the certificate

      cd /opt/k8s/work
      cp admin*.pem /etc/kubernetes/cert
  2. Create the kubeconfig file

    cd /opt/k8s/work
    
    export KUBE_APISERVER=https://192.168.0.107:6443
    
    # Set cluster parameters
    kubectl config set-cluster kubernetes \
      --certificate-authority=/etc/kubernetes/cert/ca.pem \
      --embed-certs=true \
      --server=${KUBE_APISERVER} \
      --kubeconfig=kubectl.kubeconfig
    
    # Set client authentication parameters
    kubectl config set-credentials admin \
      --client-certificate=/etc/kubernetes/cert/admin.pem \
      --client-key=/etc/kubernetes/cert/admin-key.pem \
      --embed-certs=true \
      --kubeconfig=kubectl.kubeconfig
    
    # Set context parameters
    kubectl config set-context kubernetes \
      --cluster=kubernetes \
      --user=admin \
      --kubeconfig=kubectl.kubeconfig
    
    # Set the default context
    kubectl config use-context kubernetes --kubeconfig=kubectl.kubeconfig
    
    • --certificate-authority: the root certificate used to verify the kube-apiserver certificate;
    • --client-certificate, --client-key: the admin certificate and private key just generated, used for https communication with kube-apiserver;
    • --embed-certs=true: embed the contents of ca.pem and admin.pem into the generated kubectl.kubeconfig file (the result can be inspected as shown below);
    • --server: the address of kube-apiserver;
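
    The generated file can be inspected before it is distributed (an optional check; the embedded certificates and keys are not printed in clear text):

    kubectl config view --kubeconfig=kubectl.kubeconfig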
  3. Distribute the kubeconfig file (any other user who wants to access kubernetes also needs this file copied into their own ~/.kube directory)

    cd /opt/k8s/work
    mkdir -p ~/.kube
    cp kubectl.kubeconfig ~/.kube/config
    
  4. Configure kubectl auto-completion

    root@master:/opt/k8s/work# apt install -y bash-completion
    root@master:/opt/k8s/work# locate bash_completion
    /usr/share/bash-completion/bash_completion
    root@master:/opt/k8s/work# source /usr/share/bash-completion/bash_completion
    root@master:/opt/k8s/work# source <(kubectl completion bash)
    

Configure kube-apiserver

  1. Create the kubernetes-api certificate and private key

    1. Create the certificate signing request file

      
      cd /opt/k8s/work
      cat > kubernetes-csr.json <<EOF
      {
        "CN": "kubernetes-api",
        "hosts": [
          "127.0.0.1",
          "192.168.0.107",
          "10.254.0.1",
          "kubernetes",
          "kubernetes.default",
          "kubernetes.default.svc",
          "kubernetes.default.svc.cluster",
          "kubernetes.default.svc.cluster.local."
        ],
        "key": {
          "algo": "rsa",
          "size": 2048
        },
        "names": [
          {
            "C": "CN",
            "ST": "NanJing",
            "L": "NanJing",
            "O": "k8s",
            "OU": "system"
          }
        ]
      }
      EOF
      
    2. Generate the certificate and private key

      cd /opt/k8s/work
      cfssl gencert -ca=/opt/k8s/work/ca.pem \
        -ca-key=/opt/k8s/work/ca-key.pem \
        -config=/opt/k8s/work/ca-config.json \
        -profile=kubernetes kubernetes-csr.json | cfssljson -bare kubernetes
      ls kubernetes*
    3. Install the certificate

      cd /opt/k8s/work
      cp kubernetes*.pem /etc/kubernetes/cert/
      
  2. Create the kube-apiserver systemd unit file

    export ETCD_ENDPOINTS="https://192.168.0.107:2379"
    export SERVICE_CIDR="10.254.0.0/16"
    export NODE_PORT_RANGE=80-32767
    
    cat > /etc/systemd/system/kube-apiserver.service <<EOF
    [Unit]
    Description=Kubernetes API Server
    Documentation=https://github.com/GoogleCloudPlatform/kubernetes
    After=network.target
    
    [Service]
    WorkingDirectory=/data/k8s/k8s/kube-apiserver
    ExecStart=/opt/k8s/bin/kube-apiserver \\
      --advertise-address=192.168.0.107 \\
      --etcd-cafile=/etc/kubernetes/cert/ca.pem \\
      --etcd-certfile=/etc/kubernetes/cert/kubernetes.pem \\
      --etcd-keyfile=/etc/kubernetes/cert/kubernetes-key.pem \\
      --etcd-servers=${ETCD_ENDPOINTS} \\
      --bind-address=192.168.0.107 \\
      --secure-port=6443 \\
      --tls-cert-file=/etc/kubernetes/cert/kubernetes.pem \\
      --tls-private-key-file=/etc/kubernetes/cert/kubernetes-key.pem \\
      --audit-log-maxage=15 \\
      --audit-log-maxbackup=3 \\
      --audit-log-maxsize=100 \\
      --audit-log-truncate-enabled \\
      --audit-log-path=/data/k8s/k8s/kube-apiserver/audit.log \\
      --profiling \\
      --anonymous-auth=false \\
      --client-ca-file=/etc/kubernetes/cert/ca.pem \\
      --enable-bootstrap-token-auth \\
      --service-account-key-file=/etc/kubernetes/cert/ca-key.pem \\
      --authorization-mode=Node,RBAC \\
      --runtime-config=api/all=true \\
      --allow-privileged=true \\
      --event-ttl=168h \\
      --kubelet-certificate-authority=/etc/kubernetes/cert/ca.pem \\
      --kubelet-client-certificate=/etc/kubernetes/cert/kubernetes.pem \\
      --kubelet-client-key=/etc/kubernetes/cert/kubernetes-key.pem \\
      --kubelet-https=true \\
      --kubelet-timeout=10s \\
      --service-cluster-ip-range=${SERVICE_CIDR} \\
      --service-node-port-range=${NODE_PORT_RANGE} \\
      --logtostderr=true \\
      --v=2
    Restart=on-failure
    RestartSec=10
    Type=notify
    LimitNOFILE=65536
    
    [Install]
    WantedBy=multi-user.target
    EOF
  3. Create the kube-apiserver working directory

    mkdir -p /data/k8s/k8s/kube-apiserver
  4. Start the kube-apiserver service

    systemctl daemon-reload && systemctl enable kube-apiserver && systemctl restart kube-apiserver
  5. Check the startup result

    systemctl status kube-apiserver |grep Active
    • Make sure the status is active (running); otherwise check the logs to find the cause
    • If anything looks wrong, check the logs with

      journalctl -u kube-apiserver
  6. Check that kube-apiserver is working

    root@master:/opt/k8s/work# kubectl cluster-info
    Kubernetes master is running at https://192.168.0.107:6443
    
    To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
    
    root@master:/opt/k8s/work# kubectl get all --all-namespaces
    NAMESPACE   NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
    default     service/kubernetes   ClusterIP   10.254.0.1   <none>        443/TCP   2m30s
    
    root@master:/opt/k8s/work# kubectl get componentstatuses
    NAME                 STATUS      MESSAGE                                                                                     ERROR
    scheduler            Unhealthy   Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: connect: connection refused
    controller-manager   Unhealthy   Get http://127.0.0.1:10252/healthz: dial tcp 127.0.0.1:10252: connect: connection refused
    etcd-0               Healthy     {"health":"true"}                                                                      

Configure kube-controller-manager

  1. Create the kube-controller-manager certificate and private key
    1. Create the certificate signing request file

      cd /opt/k8s/work
      cat > kube-controller-manager-csr.json <<EOF
      {
          "CN": "system:kube-controller-manager",
          "key": {
              "algo": "rsa",
              "size": 2048
          },
          "hosts": [
            "127.0.0.1",
            "192.168.0.107"
          ],
          "names": [
            {
              "C": "CN",
              "ST": "NanJing",
              "L": "NanJing",
              "O": "system:kube-controller-manager",
              "OU": "system"
            }
          ]
      }
      EOF
      • CN and O are both system:kube-controller-manager; the built-in ClusterRoleBinding system:kube-controller-manager grants kube-controller-manager the permissions it needs (the binding can be checked as shown below).
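
      Since kubectl is already configured, the built-in binding can be inspected (an optional check):

      kubectl get clusterrolebinding system:kube-controller-manager -o yaml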
    2. Generate the certificate and private key

      cd /opt/k8s/work
      cfssl gencert -ca=/opt/k8s/work/ca.pem \
        -ca-key=/opt/k8s/work/ca-key.pem \
        -config=/opt/k8s/work/ca-config.json \
        -profile=kubernetes kube-controller-manager-csr.json | cfssljson -bare kube-controller-manager
      ls kube-controller-manager*pem
    3. Install the certificate

      cd /opt/k8s/work
      cp kube-controller-manager*.pem /etc/kubernetes/cert/
  2. Create the kubeconfig file

    • kube-controller-manager uses this file to access the apiserver; it contains the apiserver address, the embedded CA certificate, the kube-controller-manager certificate, and related information
    cd /opt/k8s/work
    export KUBE_APISERVER=https://192.168.0.107:6443
    
    kubectl config set-cluster kubernetes \
      --certificate-authority=/opt/k8s/work/ca.pem \
      --embed-certs=true \
      --server="${KUBE_APISERVER}" \
      --kubeconfig=kube-controller-manager.kubeconfig
    
    kubectl config set-credentials system:kube-controller-manager \
      --client-certificate=kube-controller-manager.pem \
      --client-key=kube-controller-manager-key.pem \
      --embed-certs=true \
      --kubeconfig=kube-controller-manager.kubeconfig
    
    kubectl config set-context system:kube-controller-manager \
      --cluster=kubernetes \
      --user=system:kube-controller-manager \
      --kubeconfig=kube-controller-manager.kubeconfig
    
    kubectl config use-context system:kube-controller-manager --kubeconfig=kube-controller-manager.kubeconfig
    
  3. Distribute the kubeconfig

    cd /opt/k8s/work
    cp kube-controller-manager.kubeconfig /etc/kubernetes/kube-controller-manager.kubeconfig
    
  4. Create the kube-controller-manager systemd unit file

    export SERVICE_CIDR="10.254.0.0/16"
    
    cat > /etc/systemd/system/kube-controller-manager.service <<EOF
    [Unit]
    Description=Kubernetes Controller Manager
    Documentation=https://github.com/GoogleCloudPlatform/kubernetes
    
    [Service]
    WorkingDirectory=/data/k8s/k8s/kube-controller-manager
    ExecStart=/opt/k8s/bin/kube-controller-manager \\
      --profiling \\
      --cluster-name=kubernetes \\
      --kube-api-qps=1000 \\
      --kube-api-burst=2000 \\
      --leader-elect \\
      --use-service-account-credentials\\
      --concurrent-service-syncs=2 \\
      --bind-address=192.168.0.107 \\
      --secure-port=10252 \\
      --tls-cert-file=/etc/kubernetes/cert/kube-controller-manager.pem \\
      --tls-private-key-file=/etc/kubernetes/cert/kube-controller-manager-key.pem \\
      --port=0 \\
      --authentication-kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig \\
      --client-ca-file=/etc/kubernetes/cert/ca.pem \\
      --authorization-kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig \\
      --cluster-signing-cert-file=/etc/kubernetes/cert/ca.pem \\
      --cluster-signing-key-file=/etc/kubernetes/cert/ca-key.pem \\
      --experimental-cluster-signing-duration=87600h \\
      --horizontal-pod-autoscaler-sync-period=10s \\
      --concurrent-deployment-syncs=10 \\
      --concurrent-gc-syncs=30 \\
      --node-cidr-mask-size=24 \\
      --service-cluster-ip-range=${SERVICE_CIDR} \\
      --pod-eviction-timeout=6m \\
      --terminated-pod-gc-threshold=10000 \\
      --root-ca-file=/etc/kubernetes/cert/ca.pem \\
      --service-account-private-key-file=/etc/kubernetes/cert/ca-key.pem \\
      --kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig \\
      --logtostderr=true \\
      --v=2
    Restart=on-failure
    RestartSec=5
    
    [Install]
    WantedBy=multi-user.target
    EOF
  5. Create the kube-controller-manager working directory

    mkdir -p /data/k8s/k8s/kube-controller-manager
  6. Start the kube-controller-manager service

    systemctl daemon-reload && systemctl enable kube-controller-manager && systemctl restart kube-controller-manager
  7. Check the startup result

    systemctl status kube-controller-manager  |grep Active
    • Make sure the status is active (running); otherwise check the logs to find the cause
    • If anything looks wrong, check the logs with

      journalctl -u kube-controller-manager
      
  8. Check that kube-controller-manager is working

    root@master:/opt/k8s/work# kubectl get endpoints kube-controller-manager --namespace=kube-system  -o yaml
    apiVersion: v1
    kind: Endpoints
    metadata:
      annotations:
        control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"master_6e2dfb91-8eaa-42d0-ba83-be669b99801f","leaseDurationSeconds":15,"acquireTime":"2020-02-09T13:37:08Z","renewTime":"2020-02-09T13:38:02Z","leaderTransitions":0}'
      creationTimestamp: "2020-02-09T13:37:08Z"
      name: kube-controller-manager
      namespace: kube-system
      resourceVersion: "888"
      selfLink: /api/v1/namespaces/kube-system/endpoints/kube-controller-manager
      uid: 5aa2c4a1-5ded-4870-900e-63dfd212c912
    
    root@master:/opt/k8s/work# curl -s --cacert /opt/k8s/work/ca.pem --cert /opt/k8s/work/admin.pem --key /opt/k8s/work/admin-key.pem https://192.168.0.107:10252/healthz
    ok
    

Configure kube-scheduler

  1. Create the kube-scheduler certificate and private key
    1. Create the certificate signing request file

      cd /opt/k8s/work
      cat > kube-scheduler-csr.json <<EOF
      {
          "CN": "system:kube-scheduler",
          "key": {
              "algo": "rsa",
              "size": 2048
          },
          "hosts": [
            "127.0.0.1",
            "192.168.0.107"
          ],
          "names": [
            {
              "C": "CN",
              "ST": "NanJing",
              "L": "NanJing",
              "O": "system:kube-scheduler",
              "OU": "system"
            }
          ]
      }
      EOF
      
      • CN and O are both system:kube-scheduler; the built-in ClusterRoleBinding system:kube-scheduler grants kube-scheduler the permissions it needs.
    2. Generate the certificate and private key

      cd /opt/k8s/work
      cfssl gencert -ca=/opt/k8s/work/ca.pem \
        -ca-key=/opt/k8s/work/ca-key.pem \
        -config=/opt/k8s/work/ca-config.json \
        -profile=kubernetes kube-scheduler-csr.json | cfssljson -bare kube-scheduler
      ls kube-scheduler*pem
      
    3. Install the certificate

      cd /opt/k8s/work
      cp kube-scheduler*.pem /etc/kubernetes/cert/
      
  2. Create the kubeconfig file

    • kube-scheduler uses this file to access the apiserver; it contains the apiserver address, the embedded CA certificate, the kube-scheduler certificate, and related information
    cd /opt/k8s/work
    export KUBE_APISERVER=https://192.168.0.107:6443
    
    kubectl config set-cluster kubernetes \
      --certificate-authority=/opt/k8s/work/ca.pem \
      --embed-certs=true \
      --server="${KUBE_APISERVER}" \
      --kubeconfig=kube-scheduler.kubeconfig
    
    kubectl config set-credentials system:kube-scheduler \
      --client-certificate=kube-scheduler.pem \
      --client-key=kube-scheduler-key.pem \
      --embed-certs=true \
      --kubeconfig=kube-scheduler.kubeconfig
    
    kubectl config set-context system:kube-scheduler \
      --cluster=kubernetes \
      --user=system:kube-scheduler \
      --kubeconfig=kube-scheduler.kubeconfig
    
    kubectl config use-context system:kube-scheduler --kubeconfig=kube-scheduler.kubeconfig
    
  3. Distribute the kubeconfig

    cd /opt/k8s/work
    cp kube-scheduler.kubeconfig /etc/kubernetes/kube-scheduler.kubeconfig
    
  4. Create the kube-scheduler configuration file

    cd /opt/k8s/work
    cat >kube-scheduler.yaml <<EOF
    apiVersion: kubescheduler.config.k8s.io/v1alpha1
    kind: KubeSchedulerConfiguration
    bindTimeoutSeconds: 600
    clientConnection:
      burst: 200
      kubeconfig: "/etc/kubernetes/kube-scheduler.kubeconfig"
      qps: 100
    enableContentionProfiling: false
    enableProfiling: true
    hardPodAffinitySymmetricWeight: 1
    healthzBindAddress: 192.168.0.107:10251
    leaderElection:
      leaderElect: true
    metricsBindAddress: 192.168.0.107:10251
    EOF
    
    cp kube-scheduler.yaml /etc/kubernetes/kube-scheduler.yaml
  5. Create the kube-scheduler systemd unit file

    cat > /etc/systemd/system/kube-scheduler.service <<EOF
    [Unit]
    Description=Kubernetes Scheduler
    Documentation=https://github.com/GoogleCloudPlatform/kubernetes
    
    [Service]
    WorkingDirectory=/data/k8s/k8s/kube-scheduler
    ExecStart=/opt/k8s/bin/kube-scheduler \\
      --config=/etc/kubernetes/kube-scheduler.yaml \\
      --bind-address=192.168.0.107 \\
      --secure-port=10259 \\
      --port=0 \\
      --tls-cert-file=/etc/kubernetes/cert/kube-scheduler.pem \\
      --tls-private-key-file=/etc/kubernetes/cert/kube-scheduler-key.pem \\
      --authentication-kubeconfig=/etc/kubernetes/kube-scheduler.kubeconfig \\
      --client-ca-file=/etc/kubernetes/cert/ca.pem \\
      --authorization-kubeconfig=/etc/kubernetes/kube-scheduler.kubeconfig \\
      --logtostderr=true \\
      --v=2
    Restart=always
    RestartSec=5
    StartLimitInterval=0
    
    [Install]
    WantedBy=multi-user.target
    EOF
  6. Create the kube-scheduler working directory

    mkdir -p /data/k8s/k8s/kube-scheduler
  7. Start the kube-scheduler service

    systemctl daemon-reload && systemctl enable kube-scheduler && systemctl restart kube-scheduler
  8. Check the startup result

    systemctl status kube-scheduler  |grep Active
    • Make sure the status is active (running); otherwise check the logs to find the cause
    • If anything looks wrong, check the logs with

      journalctl -u kube-scheduler
      
  9. Check that kube-scheduler is working

    root@master:/opt/k8s/work# kubectl get endpoints kube-scheduler --namespace=kube-system  -o yaml
    apiVersion: v1
    kind: Endpoints
    metadata:
      annotations:
        control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"master_383054c4-58d8-4c24-a766-551a92492219","leaseDurationSeconds":15,"acquireTime":"2020-02-10T02:17:40Z","renewTime":"2020-02-10T02:18:09Z","leaderTransitions":0}'
      creationTimestamp: "2020-02-10T02:17:41Z"
      name: kube-scheduler
      namespace: kube-system
      resourceVersion: "50203"
      selfLink: /api/v1/namespaces/kube-system/endpoints/kube-scheduler
      uid: 39821272-40a1-4b3a-95bd-a4f09af09231
    
    root@master:/opt/k8s/work# curl -s --cacert /opt/k8s/work/ca.pem --cert /opt/k8s/work/admin.pem --key /opt/k8s/work/admin-key.pem https://192.168.0.107:10259/healthz
    ok
    
    root@master:/opt/k8s/work# curl  http://192.168.0.107:10251/healthz
    ok
    

Deploy the worker node (run on the master node)

Configure kubelet

kubelet runs on every worker node; it receives requests from kube-apiserver, manages Pod containers, and executes interactive commands such as exec, run, and logs.

On startup, kubelet automatically registers the node with kube-apiserver; the built-in cadvisor collects and monitors the node's resource usage.

For security, this deployment closes kubelet's insecure http port and authenticates and authorizes every request, rejecting unauthorized access (e.g. requests from apiserver or heapster that cannot authenticate); a quick check of this behaviour is sketched below.
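
Once kubelet is running (later in this section), the effect can be verified from the master: an unauthenticated request to the secure port should be rejected. A sketch, assuming the worker IP used throughout this guide:

    curl -sk https://192.168.0.114:10250/metrics
    # expected response body: Unauthorized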

  1. Create the kubelet bootstrap kubeconfig file

    
    cd /opt/k8s/work
    
    export KUBE_APISERVER=https://192.168.0.107:6443
    export node_name=slave
    
    export BOOTSTRAP_TOKEN=$(kubeadm token create \
      --description kubelet-bootstrap-token \
      --groups system:bootstrappers:${node_name} \
      --kubeconfig ~/.kube/config)
    
    # Set cluster parameters
    kubectl config set-cluster kubernetes \
      --certificate-authority=/etc/kubernetes/cert/ca.pem \
      --embed-certs=true \
      --server=${KUBE_APISERVER} \
      --kubeconfig=kubelet-bootstrap.kubeconfig
    
    # Set client authentication parameters
    kubectl config set-credentials kubelet-bootstrap \
      --token=${BOOTSTRAP_TOKEN} \
      --kubeconfig=kubelet-bootstrap.kubeconfig
    
    # Set context parameters
    kubectl config set-context default \
      --cluster=kubernetes \
      --user=kubelet-bootstrap \
      --kubeconfig=kubelet-bootstrap.kubeconfig
    
    # Set the default context
    kubectl config use-context default --kubeconfig=kubelet-bootstrap.kubeconfig
    
    • What is written into the kubeconfig is the token; after bootstrap finishes, kube-controller-manager creates the client and server certificates for the kubelet
    • When kube-apiserver receives the kubelet's bootstrap token, it sets the request's user to system:bootstrap:<token ID> and the group to system:bootstrappers; a ClusterRoleBinding for this group is created later (the tokens created above can be listed as shown below)
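
    The bootstrap tokens created with kubeadm above can be listed at any time (an optional check):

    kubeadm token list --kubeconfig ~/.kube/config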
  2. Distribute the bootstrap kubeconfig file to all worker nodes

    cd /opt/k8s/work
    export node_ip=192.168.0.114
    scp kubelet-bootstrap.kubeconfig root@${node_ip}:/etc/kubernetes/kubelet-bootstrap.kubeconfig
  3. Create and distribute the kubelet configuration file

    Since v1.10, some kubelet parameters must be set in a configuration file; kubelet --help points this out

    cd /opt/k8s/work
    
    export CLUSTER_CIDR="172.30.0.0/16"
    export NODE_IP=192.168.0.114
    export CLUSTER_DNS_SVC_IP="10.254.0.2"
    
    
    cat > kubelet-config.yaml <<EOF
    kind: KubeletConfiguration
    apiVersion: kubelet.config.k8s.io/v1beta1
    address: 192.168.0.114
    staticPodPath: ""
    syncFrequency: 1m
    fileCheckFrequency: 20s
    httpCheckFrequency: 20s
    staticPodURL: ""
    port: 10250
    readOnlyPort: 0
    rotateCertificates: true
    serverTLSBootstrap: true
    authentication:
      anonymous:
        enabled: false
      webhook:
        enabled: true
      x509:
        clientCAFile: "/etc/kubernetes/cert/ca.pem"
    authorization:
      mode: Webhook
    registryPullQPS: 0
    registryBurst: 20
    eventRecordQPS: 0
    eventBurst: 20
    enableDebuggingHandlers: true
    enableContentionProfiling: true
    healthzPort: 10248
    healthzBindAddress: ${NODE_IP}
    clusterDomain: "cluster.local"
    clusterDNS:
      - "${CLUSTER_DNS_SVC_IP}"
    nodeStatusUpdateFrequency: 10s
    nodeStatusReportFrequency: 1m
    imageMinimumGCAge: 2m
    imageGCHighThresholdPercent: 85
    imageGCLowThresholdPercent: 80
    volumeStatsAggPeriod: 1m
    kubeletCgroups: ""
    systemCgroups: ""
    cgroupRoot: ""
    cgroupsPerQOS: true
    cgroupDriver: cgroupfs
    runtimeRequestTimeout: 10m
    hairpinMode: promiscuous-bridge
    maxPods: 220
    podCIDR: "${CLUSTER_CIDR}"
    podPidsLimit: -1
    resolvConf: /run/systemd/resolve/resolv.conf
    maxOpenFiles: 1000000
    kubeAPIQPS: 1000
    kubeAPIBurst: 2000
    serializeImagePulls: false
    evictionHard:
      memory.available:  "100Mi"
      nodefs.available:  "10%"
      nodefs.inodesFree: "5%"
      imagefs.available: "15%"
    evictionSoft: {}
    enableControllerAttachDetach: true
    failSwapOn: true
    containerLogMaxSize: 20Mi
    containerLogMaxFiles: 10
    systemReserved: {}
    kubeReserved: {}
    systemReservedCgroup: ""
    kubeReservedCgroup: ""
    enforceNodeAllocatable: ["pods"]
    EOF
    • address: the address the kubelet secure port (https, 10250) listens on; it must not be 127.0.0.1, otherwise kube-apiserver, heapster, etc. cannot call the kubelet API;
    • readOnlyPort=0: disables the read-only port (which defaults to 10255);
    • authentication.anonymous.enabled: set to false, so anonymous access to port 10250 is not allowed;
    • authentication.x509.clientCAFile: the CA certificate that signed the client certificates, enabling HTTPS certificate authentication;
    • authentication.webhook.enabled=true: enables HTTPS bearer token authentication;
      requests that pass neither x509 certificate nor webhook authentication (from kube-apiserver or other clients) are rejected with Unauthorized;
    • authorization.mode=Webhook: kubelet uses the SubjectAccessReview API to ask kube-apiserver whether a given user or group has permission to operate on a resource (RBAC);
    • featureGates.RotateKubeletClientCertificate, featureGates.RotateKubeletServerCertificate: rotate certificates automatically; the certificate lifetime is determined by kube-controller-manager's --experimental-cluster-signing-duration parameter
  4. Distribute the kubelet configuration file to each node

    cd /opt/k8s/work
    export node_ip=192.168.0.114
    scp kubelet-config.yaml root@${node_ip}:/etc/kubernetes/kubelet-config.yaml
    
  5. Create and distribute the kubelet systemd unit file

    cd /opt/k8s/work
    export K8S_DIR=/data/k8s/k8s
    cat > kubelet.service <<EOF
    [Unit]
    Description=Kubernetes Kubelet
    Documentation=https://github.com/GoogleCloudPlatform/kubernetes
    After=docker.service
    Requires=docker.service
    
    [Service]
    WorkingDirectory=${K8S_DIR}/kubelet
    ExecStart=/opt/k8s/bin/kubelet \\
      --bootstrap-kubeconfig=/etc/kubernetes/kubelet-bootstrap.kubeconfig \\
      --cert-dir=/etc/kubernetes/cert \\
      --root-dir=${K8S_DIR}/kubelet \\
      --kubeconfig=/etc/kubernetes/kubelet.kubeconfig \\
      --config=/etc/kubernetes/kubelet-config.yaml \\
      --hostname-override=slave \\
      --image-pull-progress-deadline=15m \\
      --volume-plugin-dir=${K8S_DIR}/kubelet/kubelet-plugins/volume/exec/ \\
      --logtostderr=true \\
      --v=2
    Restart=always
    RestartSec=5
    StartLimitInterval=0
    
    [Install]
    WantedBy=multi-user.target
    EOF
    
    • If --hostname-override is set here, kube-proxy must be set to the same value, otherwise the Node may not be found;
    • --bootstrap-kubeconfig: points to the bootstrap kubeconfig file; kubelet uses the username and token in this file to send a TLS Bootstrapping request to kube-apiserver;
    • after K8S approves the kubelet's CSR, it creates the certificate and private key in the --cert-dir directory and then writes the --kubeconfig file
  6. Distribute the kubelet service file

    cd /opt/k8s/work
    export node_ip=192.168.0.114
    scp kubelet.service root@${node_ip}:/etc/systemd/system/kubelet.service
  7. Grant kube-apiserver access to the kubelet API

    When commands such as kubectl exec, run, or logs are executed, the apiserver forwards the request to the kubelet's https port. The RBAC rule below authorizes the username of the certificate used by the apiserver (kubernetes.pem, CN: kubernetes-api) to access the kubelet API:

    kubectl create clusterrolebinding kube-apiserver:kubelet-apis --clusterrole=system:kubelet-api-admin --user kubernetes-api
    
  8. Bootstrap Token Auth and granting permissions
    On startup, kubelet checks whether the file given by --kubeconfig exists; if it does not, kubelet uses the kubeconfig specified by --bootstrap-kubeconfig to send a certificate signing request (CSR) to kube-apiserver.

    When kube-apiserver receives the CSR, it authenticates the token in it; if that succeeds, it sets the request's user to system:bootstrap:<token ID> and the group to system:bootstrappers. This process is called Bootstrap Token Auth.

    By default this user and group have no permission to create CSRs, so a clusterrolebinding is needed that binds the group system:bootstrappers to the clusterrole system:node-bootstrapper:

    kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --group=system:bootstrappers
    
  9. Start the kubelet service

    export K8S_DIR=/data/k8s/k8s
    
    export node_ip=192.168.0.114
    ssh root@${node_ip} "mkdir -p ${K8S_DIR}/kubelet/kubelet-plugins/volume/exec/"
    
    ssh root@${node_ip} "systemctl daemon-reload && systemctl enable kubelet && systemctl restart kubelet"
    
    • After starting, kubelet uses --bootstrap-kubeconfig to send a CSR to kube-apiserver; once the CSR is approved, kube-controller-manager creates the TLS client certificate and private key for the kubelet and the --kubeconfig file is written.

    • Note: kube-controller-manager must be configured with the --cluster-signing-cert-file and --cluster-signing-key-file parameters, otherwise no certificate and private key are created for the TLS Bootstrap.

  10. Problems encountered

    1. After starting kubelet, kubectl get csr showed nothing, and the kubelet log contained an error

      journalctl -u kubelet -a |grep -A 2 'certificate_manager.go' 
      
      Failed while requesting a signed certificate from the master: cannot create certificate signing request: Unauthorized 
      

      Check the kube-apiserver log

      root@master:/opt/k8s/work# journalctl -eu kube-apiserver
      
      Unable to authenticate the request due to an error: invalid bearer token

      Cause: the following option was missing from the kube-apiserver unit file

      --enable-bootstrap-token-auth \\

      Adding it back and restarting kube-apiserver fixed the problem

    2. After starting, kubelet kept generating CSRs, even after they were manually approved
      The cause was that the kube-controller-manager service had stopped; restarting it fixed the problem

      • After a kubelet problem like this, delete /etc/kubernetes/kubelet.kubeconfig and /etc/kubernetes/cert/kubelet-client-current*.pem on the affected node, then restart kubelet
  11. Check the kubelet CSRs

    root@master:/opt/k8s/work# kubectl get csr
    NAME        AGE   REQUESTOR                 CONDITION
    csr-kl5mg   49s   system:bootstrap:5t989l   Pending
    csr-mrmkf   2m1s  system:bootstrap:5t989l   Pending
    csr-ql68g   13s   system:bootstrap:5t989l   Pending
    csr-rvl2v   84s   system:bootstrap:5t989l   Pending
    
    • New CSRs keep being added until they are manually approved
  12. Manually approve the CSRs

    root@master:/opt/k8s/work# kubectl get csr | grep Pending | awk '{print $1}' | xargs kubectl certificate approve
    certificatesigningrequest.certificates.k8s.io/csr-kl5mg approved
    certificatesigningrequest.certificates.k8s.io/csr-mrmkf approved
    certificatesigningrequest.certificates.k8s.io/csr-ql68g approved
    certificatesigningrequest.certificates.k8s.io/csr-rvl2v approved
    
    root@master:/opt/k8s/work# kubectl get csr | grep Pending | awk '{print $1}' | xargs kubectl certificate approve
    certificatesigningrequest.certificates.k8s.io/csr-f4smx approved
    
  13. Check node information

    root@master:/opt/k8s/work# kubectl get nodes
    NAME    STATUS   ROLES    AGE   VERSION
    slave   Ready    <none>   10m   v1.17.2
    
  14. Check kubelet service status

    export node_ip=192.168.0.114
    root@master:/opt/k8s/work# ssh root@${node_ip} "systemctl status kubelet.service"
    ● kubelet.service - Kubernetes Kubelet
       Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: enabled)
       Active: active (running) since Mon 2020-02-10 22:48:41 CST; 12min ago
         Docs: https://github.com/GoogleCloudPlatform/kubernetes
     Main PID: 15529 (kubelet)
        Tasks: 19 (limit: 4541)
       CGroup: /system.slice/kubelet.service
               └─15529 /opt/k8s/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/kubelet-bootstrap.kubeconfig --cert-dir=/etc/kubernetes/cert --root-dir=/data/k8s/k8s/kubelet --kubeconfig=/etc/kubernetes/kubelet.kubeconfig --config=/etc/kubernetes/kubelet-config.yaml --hostname-override=slave --image-pull-progress-deadline=15m --volume-plugin-dir=/data/k8s/k8s/kubelet/kubelet-plugins/volume/exec/ --logtostderr=true --v=2
    
    2月 10 22:49:04 slave kubelet[15529]: I0210 22:49:04.846285   15529 kubelet_node_status.go:73] Successfully registered node slave
    2月 10 22:49:04 slave kubelet[15529]: I0210 22:49:04.930745   15529 certificate_manager.go:402] Rotating certificates
    2月 10 22:49:14 slave kubelet[15529]: I0210 22:49:14.966351   15529 kubelet_node_status.go:486] Recording NodeReady event message for node slave
    2月 10 22:49:29 slave kubelet[15529]: I0210 22:49:29.580410   15529 certificate_manager.go:531] Certificate expiration is 2030-02-06 04:19:00 +0000 UTC, rotation deadline is 2029-01-21 13:08:18.850930128 +0000 UTC
    2月 10 22:49:29 slave kubelet[15529]: I0210 22:49:29.580484   15529 certificate_manager.go:281] Waiting 78430h18m49.270459727s for next certificate rotation
    2月 10 22:49:30 slave kubelet[15529]: I0210 22:49:30.580981   15529 certificate_manager.go:531] Certificate expiration is 2030-02-06 04:19:00 +0000 UTC, rotation deadline is 2027-07-14 16:09:26.990162158 +0000 UTC
    2月 10 22:49:30 slave kubelet[15529]: I0210 22:49:30.581096   15529 certificate_manager.go:281] Waiting 65065h19m56.409078053s for next certificate rotation
    2月 10 22:53:44 slave kubelet[15529]: I0210 22:53:44.911705   15529 kubelet.go:1312] Image garbage collection succeeded
    2月 10 22:53:45 slave kubelet[15529]: I0210 22:53:45.053792   15529 container_manager_linux.go:469] [ContainerManager]: Discovered runtime cgroups name: /system.slice/docker.service
    2月 10 22:58:45 slave kubelet[15529]: I0210 22:58:45.054225   15529 container_manager_linux.go:469] [ContainerManager]: Discovered runtime cgroups name: /system.slice/docker.servic
    

Configure the kube-proxy component

  1. Create the kube-proxy certificate and private key
    1. Create the certificate signing request file

      cd /opt/k8s/work
      cat > kube-proxy-csr.json <<EOF
      {
          "CN": "system:kube-proxy",
          "key": {
              "algo": "rsa",
              "size": 2048
          },
          "names": [
            {
              "C": "CN",
              "ST": "NanJing",
              "L": "NanJing",
              "O": "system:kube-proxy",
              "OU": "system"
            }
          ]
      }
      EOF
      
      • CN: sets the User of this certificate to system:kube-proxy;
      • the predefined ClusterRoleBinding system:node-proxier binds User system:kube-proxy to the ClusterRole system:node-proxier, which grants permission to call the kube-apiserver Proxy-related APIs.
    2. Generate the certificate and private key

      cd /opt/k8s/work
      cfssl gencert -ca=/opt/k8s/work/ca.pem \
        -ca-key=/opt/k8s/work/ca-key.pem \
        -config=/opt/k8s/work/ca-config.json \
        -profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy
      
      ls kube-proxy*pem
      
    3. Install the certificate

      cd /opt/k8s/work
      export node_ip=192.168.0.114
      scp kube-proxy*.pem root@${node_ip}:/etc/kubernetes/cert/
      
  2. Create the kubeconfig file

    • kube-proxy uses this file to access the apiserver; it contains the apiserver address, the embedded CA certificate, the kube-proxy certificate, and related information
    cd /opt/k8s/work
    
    export KUBE_APISERVER=https://192.168.0.107:6443
    
    kubectl config set-cluster kubernetes \
      --certificate-authority=/opt/k8s/work/ca.pem \
      --embed-certs=true \
      --server=${KUBE_APISERVER}  \
      --kubeconfig=kube-proxy.kubeconfig
    
    kubectl config set-credentials kube-proxy \
      --client-certificate=kube-proxy.pem \
      --client-key=kube-proxy-key.pem \
      --embed-certs=true \
      --kubeconfig=kube-proxy.kubeconfig
    
    kubectl config set-context default \
      --cluster=kubernetes \
      --user=kube-proxy \
      --kubeconfig=kube-proxy.kubeconfig
    
    kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig
    
  3. Distribute the kubeconfig

    cd /opt/k8s/work
    export node_ip=192.168.0.114
    scp kube-proxy.kubeconfig root@${node_ip}:/etc/kubernetes/kube-proxy.kubeconfig
    
  4. Create the kube-proxy configuration file

    cd /opt/k8s/work
    
    export CLUSTER_CIDR="172.30.0.0/16"
    
    export NODE_IP=192.168.0.114
    
    export NODE_NAME=slave
    
    cat > kube-proxy-config.yaml <<EOF
    kind: KubeProxyConfiguration
    apiVersion: kubeproxy.config.k8s.io/v1alpha1
    clientConnection:
      burst: 200
      kubeconfig: "/etc/kubernetes/kube-proxy.kubeconfig"
      qps: 100
    bindAddress: ${NODE_IP}
    healthzBindAddress: ${NODE_IP}:10256
    metricsBindAddress: ${NODE_IP}:10249
    enableProfiling: true
    clusterCIDR: ${CLUSTER_CIDR}
    hostnameOverride: ${NODE_NAME}
    mode: "ipvs"
    portRange: ""
    iptables:
      masqueradeAll: false
    ipvs:
      scheduler: rr
      excludeCIDRs: []
    EOF 
    
    • bindAddress: the listen address;
    • clientConnection.kubeconfig: the kubeconfig file used to connect to the apiserver;
    • clusterCIDR: kube-proxy uses this to distinguish traffic inside and outside the cluster; only when --cluster-cidr or --masquerade-all is set does kube-proxy SNAT requests to Service IPs;
    • hostnameOverride: must match the value used by kubelet, otherwise kube-proxy will not find the Node after starting and will not create any ipvs rules;
    • mode: use ipvs mode (the required kernel modules can be checked as shown below);
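
    A quick way to confirm that the ipvs kernel modules are available on the worker node (an optional check; step 8 below also loads ip_vs_rr explicitly):

    export node_ip=192.168.0.114
    ssh root@${node_ip} "lsmod | grep -e ip_vs -e nf_conntrack"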
  5. Distribute the kube-proxy configuration file

    cd /opt/k8s/work
    export node_ip=192.168.0.114
    scp kube-proxy-config.yaml root@${node_ip}:/etc/kubernetes/kube-proxy-config.yaml
    
  6. Create the kube-proxy systemd unit file

    cd /opt/k8s/work
    export K8S_DIR=/data/k8s/k8s
    
    cat > kube-proxy.service <<EOF
    [Unit]
    Description=Kubernetes Kube-Proxy Server
    Documentation=https://github.com/GoogleCloudPlatform/kubernetes
    After=network.target
    
    [Service]
    WorkingDirectory=${K8S_DIR}/kube-proxy
    ExecStart=/opt/k8s/bin/kube-proxy \\
      --config=/etc/kubernetes/kube-proxy-config.yaml \\
      --logtostderr=true \\
      --v=2
    Restart=on-failure
    RestartSec=5
    LimitNOFILE=65536
    
    [Install]
    WantedBy=multi-user.target
    EOF
    
  7. Distribute the kube-proxy service file:

    export node_ip=192.168.0.114
    scp kube-proxy.service root@${node_ip}:/etc/systemd/system/
    
  8. Start the kube-proxy service

    export node_ip=192.168.0.114
    export K8S_DIR=/data/k8s/k8s
    
    ssh root@${node_ip} "mkdir -p ${K8S_DIR}/kube-proxy"
    ssh root@${node_ip} "modprobe ip_vs_rr"
    ssh root@${node_ip} "systemctl daemon-reload && systemctl enable kube-proxy && systemctl restart kube-proxy"
    
  9. Check the startup result

    export node_ip=192.168.0.114
    ssh root@${node_ip} "systemctl status kube-proxy  |grep Active"
    • Make sure the status is active (running); otherwise check the logs to find out why.
    • If anything looks wrong, inspect the logs with:

      journalctl -u kube-proxy
      
  10. Check listening ports and ipvs rules

    
    root@slave:~# netstat -lnpt|grep kube-prox
    tcp        0      0 192.168.0.114:10256     0.0.0.0:*               LISTEN      23078/kube-proxy
    tcp        0      0 192.168.0.114:10249     0.0.0.0:*               LISTEN      23078/kube-proxy
    root@slave:~# ipvsadm -ln
    IP Virtual Server version 1.2.1 (size=4096)
    Prot LocalAddress:Port Scheduler Flags
      -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
    TCP  10.254.0.1:443 rr
      -> 192.168.0.107:6443           Masq    1      0          0         
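
    The virtual server 10.254.0.1:443 above is the default kubernetes Service, forwarded to the apiserver on the master (192.168.0.107:6443). As an extra check, kube-proxy's health endpoint on the node should return HTTP 200:

    curl http://192.168.0.114:10256/healthz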
    

Verify cluster functionality (run on the master node)

Use an nginx Service and Deployment to verify that the cluster works.

  1. Create the manifest file

    mkdir /opt/k8s/yml
    
    cd /opt/k8s/yml
    
    cat > nginx.yml << EOF
    apiVersion: v1
    kind: Service
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      type: NodePort
      selector:
        app: nginx
      ports:
      - name: http
        port: 80
        targetPort: 80
        nodePort: 8080
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-deployment
    spec:
      selector:
        matchLabels:
          app: nginx
      replicas: 1
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx:1.9.1
            ports:
            - containerPort: 80
    EOF
    
  2. Start the service

    kubectl create -f nginx.yml
    
    • On first startup the node needs to pull the k8s.gcr.io/pause:3.1 image, which cannot be downloaded directly from inside China, so the Pod fails to start; work around this on the node as follows (an alternative is sketched after these commands)

      docker pull kubeimage/pause:3.1
      docker tag kubeimage/pause:3.1 k8s.gcr.io/pause:3.1 
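
      Alternatively (not what this walkthrough does), kubelet can be pointed at a reachable pause image directly; a minimal sketch, assuming you are willing to add a flag to the kubelet startup options from the earlier kubelet section and then restart kubelet:

      # add to the kubelet startup flags, then: systemctl daemon-reload && systemctl restart kubelet
      --pod-infra-container-image=kubeimage/pause:3.1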
      
  3. Observe the service status

    
    root@master:/opt/k8s/yml# kubectl get service -o wide
    NAME         TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)       AGE   SELECTOR
    kubernetes   ClusterIP   10.254.0.1    <none>        443/TCP       41h   <none>
    nginx        NodePort    10.254.8.25   <none>        80:8080/TCP   30m   app=nginx
    root@master:/opt/k8s/yml# kubectl get pod -o wide
    NAME                                READY   STATUS    RESTARTS   AGE   IP            NODE    NOMINATED NODE   READINESS GATES
    nginx-deployment-56f8998dbc-955gf   1/1     Running   0          30m   172.30.78.2   slave   <none>           <none>
    root@master:/opt/k8s/yml# curl http://192.168.0.114:8080
    <!DOCTYPE html>
    <html>
    <head>
    <title>Welcome to nginx!</title>
    <style>
        body {
            width: 35em;
            margin: 0 auto;
            font-family: Tahoma, Verdana, Arial, sans-serif;
        }
    </style>
    </head>
    <body>
    <h1>Welcome to nginx!</h1>
    <p>If you see this page, the nginx web server is successfully installed and
    working. Further configuration is required.</p>
    
    <p>For online documentation and support please refer to
    <a href="http://nginx.org/">nginx.org</a>.<br/>
    Commercial support is available at
    <a href="http://nginx.com/">nginx.com</a>.</p>
    
    <p><em>Thank you for using nginx.</em></p>
    </body>
    </html>
    

Deploy the coredns add-on (run on the master node)

  1. Download and configure coredns

    cd /opt/k8s/work
    git clone https://github.com/coredns/deployment.git
    mv deployment coredns
    
  2. Deploy coredns

    cd /opt/k8s/work/coredns/kubernetes
    
    export CLUSTER_DNS_SVC_IP="10.254.0.2"
    export CLUSTER_DNS_DOMAIN="cluster.local"
    
    ./deploy.sh -i ${CLUSTER_DNS_SVC_IP} -d ${CLUSTER_DNS_DOMAIN} | kubectl apply -f -
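
    If you prefer to review the generated manifest before applying it, the same script can write it to a file first:

    ./deploy.sh -i ${CLUSTER_DNS_SVC_IP} -d ${CLUSTER_DNS_DOMAIN} > coredns.yaml
    kubectl apply -f coredns.yaml
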
  3. Troubleshooting

    After deploying coredns, the Pod is stuck in CrashLoopBackOff

    root@master:/opt/k8s/work/coredns/kubernetes# kubectl get pod -n kube-system -l k8s-app=kube-dns
    NAME                      READY   STATUS             RESTARTS   AGE
    coredns-76b74f549-99bxd   0/1     CrashLoopBackOff   5          4m45s

    The coredns Pod's log shows the following error

    root@master:/opt/k8s/work/coredns/kubernetes# kubectl -n kube-system logs coredns-76b74f549-99bxd
    .:53
    [INFO] plugin/reload: Running configuration MD5 = 8b19e11d5b2a72fb8e63383b064116a1
    CoreDNS-1.6.6
    linux/amd64, go1.13.5, 6a7a75e
    [FATAL] plugin/loop: Loop (127.0.0.1:60429 -> :53) detected for zone ".", see https://coredns.io/plugins/loop#troubleshooting. Query: "HINFO 6292641803451309721.7599235642583168995."
    
    

    Following the hint in the log, the page https://coredns.io/plugins/loop#troubleshooting explains:

    When a CoreDNS Pod deployed in Kubernetes detects a loop, the CoreDNS Pod will start to “CrashLoopBackOff”. This is because Kubernetes will try to restart the Pod every time CoreDNS detects the loop and exits.
    A common cause of forwarding loops in Kubernetes clusters is an interaction with a local DNS cache on the host node (e.g. systemd-resolved). For example, in certain configurations systemd-resolved will put the loopback address 127.0.0.53 as a nameserver into /etc/resolv.conf. Kubernetes (via kubelet) by default will pass this /etc/resolv.conf file to all Pods using the default dnsPolicy rendering them unable to make DNS lookups (this includes CoreDNS Pods). CoreDNS uses this /etc/resolv.conf as a list of upstreams to forward requests to. Since it contains a loopback address, CoreDNS ends up forwarding requests to itself.
    There are many ways to work around this issue, some are listed here:
    • Add the following to your kubelet config yaml: resolvConf: (or via command line flag --resolv-conf deprecated in 1.10). Your “real” resolv.conf is the one that contains the actual IPs of your upstream servers, and no local/loopback address. This flag tells kubelet to pass an alternate resolv.conf to Pods. For systems using systemd-resolved, /run/systemd/resolve/resolv.conf is typically the location of the “real” resolv.conf, although this can be different depending on your distribution.
    • Disable the local DNS cache on host nodes, and restore /etc/resolv.conf to the original.
    • A quick and dirty fix is to edit your Corefile, replacing forward . /etc/resolv.conf with the IP address of your upstream DNS, for example forward . 8.8.8.8. But this only fixes the issue for CoreDNS, kubelet will continue to forward the invalid resolv.conf to all default dnsPolicy Pods, leaving them unable to resolve DNS.

    Following the first suggested fix, change resolvConf in the kubelet configuration file kubelet-config.yaml to /run/systemd/resolve/resolv.conf; the relevant fragment looks like this

    ...
    
    podPidsLimit: -1
    resolvConf: /run/systemd/resolve/resolv.conf
    maxOpenFiles: 1000000  
    
    ...
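
    Before restarting kubelet, it can be worth confirming that this "real" resolv.conf on the node actually lists upstream nameservers rather than the 127.0.0.53 stub:

    # run on the slave node, where kubelet runs
    cat /run/systemd/resolve/resolv.conf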
    

    Restart the kubelet service

    systemctl daemon-reload
    systemctl restart kubelet

    Then redeploy coredns

    
    root@master:/opt/k8s/work/coredns/kubernetes# ./deploy.sh -i ${CLUSTER_DNS_SVC_IP} -d ${CLUSTER_DNS_DOMAIN} | kubectl apply -f -
    serviceaccount/coredns created
    clusterrole.rbac.authorization.k8s.io/system:coredns created
    clusterrolebinding.rbac.authorization.k8s.io/system:coredns created
    configmap/coredns created
    deployment.apps/coredns created
    service/kube-dns created
    
    root@master:/opt/k8s/work/coredns/kubernetes# kubectl get pod -A
    NAMESPACE     NAME                      READY   STATUS    RESTARTS   AGE
    kube-system   coredns-76b74f549-j5t9c   1/1     Running   0          12s
    
    root@master:/opt/k8s/work/coredns/kubernetes# kubectl get all -n kube-system  -l k8s-app=kube-dns
    NAME                          READY   STATUS    RESTARTS   AGE
    pod/coredns-76b74f549-j5t9c   1/1     Running   0          2m8s
    
    NAME               TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
    service/kube-dns   ClusterIP   10.254.0.2   <none>        53/UDP,53/TCP,9153/TCP   2m8s
    
    NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
    deployment.apps/coredns   1/1     1            1           2m8s
    
    NAME                                DESIRED   CURRENT   READY   AGE
    replicaset.apps/coredns-76b74f549   1         1         1       2m8s
    
  4. Start a busybox Pod together with the nginx service from the cluster-verification section above, then access the nginx service from busybox by service name

    cd /opt/k8s/yml
    cat > busybox.yml << EOF
    apiVersion: v1
    kind: Pod
    metadata:
      name: busybox
    spec:
      containers:
      - name: busybox
        image: busybox
        command:
          - sleep
          - "3600"
    EOF
    
    
    kubectl create -f busybox.yml
    
    kubectl create -f nginx.yml
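
    Optionally wait for both workloads to become ready before exec'ing into busybox:

    kubectl wait --for=condition=Ready pod/busybox --timeout=120s
    kubectl wait --for=condition=Available deployment/nginx-deployment --timeout=120s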
    
  5. Access nginx from inside the busybox Pod

    root@master:/opt/k8s/yml# kubectl exec -it busybox  sh
    / # cat /etc/resolv.conf
    nameserver 10.254.0.2
    search default.svc.cluster.local svc.cluster.local cluster.local
    options ndots:5 
    
    
    / # nslookup www.baidu.com
    Server:         10.254.0.2
    Address:        10.254.0.2:53
    
    Non-authoritative answer:
    www.baidu.com   canonical name = www.a.shifen.com
    Name:   www.a.shifen.com
    Address: 183.232.231.174
    Name:   www.a.shifen.com
    Address: 183.232.231.172 
    
    
    / # nslookup kubernetes
    Server:         10.254.0.2
    Address:        10.254.0.2:53
    
    Name:   kubernetes.default.svc.cluster.local
    Address: 10.254.0.1
    
    
    
    / # nslookup nginx
    Server:         10.254.0.2
    Address:        10.254.0.2:53
    
    Name:   nginx.default.svc.cluster.local
    Address: 10.254.19.32
    
    / # ping -c 1 nginx
    PING nginx (10.254.19.32): 56 data bytes
    64 bytes from 10.254.19.32: seq=0 ttl=64 time=0.155 ms
    
    --- nginx ping statistics ---
    1 packets transmitted, 1 packets received, 0% packet loss
    round-trip min/avg/max = 0.155/0.155/0.155 ms
    

Reposted from www.cnblogs.com/gaofeng-henu/p/12296097.html