A Casual Look at Kubernetes Vertical Pod Autoscaling (VPA)

scofield 菜鸟运维杂谈

Introduction to VPA

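VPA stands for Vertical Pod Autoscaler. It automatically sets the CPU and memory requests of containers based on their resource usage, so that Pods are scheduled onto suitable nodes and each Pod gets appropriate resources.
It can shrink containers that over-request resources, and it can raise the requests of under-provisioned containers at any time based on their usage.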
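PS: VPA manages requests. The examples in this article set no limits; when a container does define limits, recent VPA releases scale them in proportion to the requests by default rather than leaving them unchanged, so check the behavior of the version you deploy.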

Without further ado, here is a diagram of the VPA workflow:

[Figure: VPA workflow]

Now let's get hands-on.

Deploying metrics-server

1. Download the deployment manifest


wget  https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.7/components.yaml

2. Edit components.yaml

I changed the image address from gcr.io to my own registry.
I also changed the metrics-server startup args; otherwise it fails with the error unable to fully scrape metrics from source kubelet_summary…


      - name: metrics-server
        image: scofield/metrics-server:v0.3.7
        imagePullPolicy: IfNotPresent
        args:
          - --cert-dir=/tmp
          - --secure-port=4443
          - /metrics-server
          - --kubelet-insecure-tls
          - --kubelet-preferred-address-types=InternalIP

3. Deploy


kubectl  apply -f components.yaml

4. Verify


[root@k8s-node001 metrics-server]# kubectl  get po -n kube-system
NAME                                       READY   STATUS    RESTARTS   AGE
metrics-server-7947cb98b6-xw6b8            1/1     Running   0          10m

If kubectl top returns data, the deployment succeeded:
[root@k8s-node001 metrics-server]# kubectl  top nodes
NAME          CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
k8s-node001   618m         7%       4796Mi           15%
k8s-node003   551m         6%       5522Mi           17%
k8s-node004   308m         3%       5830Mi           18%
k8s-node005   526m         6%       5997Mi           38%
k8s-node002   591m         7%       5306Mi           33%
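
For scripting the same check, one option is to query the Metrics API registration instead of reading the kubectl top output by eye. The sketch below assumes metrics-server registers the APIService v1beta1.metrics.k8s.io and that kubectl already points at the cluster. The function name metrics_api_ready is made up for illustration.

```shell
# Hypothetical helper: succeeds when the Metrics API reports Available=True.
# Assumes the APIService is named v1beta1.metrics.k8s.io.
metrics_api_ready() {
  status="$(kubectl get apiservice v1beta1.metrics.k8s.io \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].status}' 2>/dev/null)"
  [ "$status" = "True" ]
}
```

A possible use is `metrics_api_ready && kubectl top nodes`, which only asks for metrics once the API is available.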

Deploying vertical-pod-autoscaler

1. Clone the autoscaler project


git clone https://github.com/kubernetes/autoscaler.git

2. Edit the deployment files, replacing the gcr registry with my own


admission-controller-deployment.yaml
us.gcr.io/k8s-artifacts-prod/autoscaling/vpa-admission-controller:0.8.0
change to
scofield/vpa-admission-controller:0.8.0

recommender-deployment.yaml
us.gcr.io/k8s-artifacts-prod/autoscaling/vpa-recommender:0.8.0
change to
scofield/vpa-recommender:0.8.0

updater-deployment.yaml
us.gcr.io/k8s-artifacts-prod/autoscaling/vpa-updater:0.8.0
change to
scofield/vpa-updater:0.8.0
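
The three replacements above can also be scripted. This is a minimal sketch, assuming the manifests still reference us.gcr.io/k8s-artifacts-prod/autoscaling/ exactly as shown. The function name rewrite_vpa_images and its arguments are hypothetical.

```shell
# Hypothetical helper: rewrite the VPA image registry in the three deployment manifests.
# $1 = directory holding the manifests, $2 = replacement registry prefix (e.g. scofield)
rewrite_vpa_images() {
  dir="$1"
  registry="$2"
  for f in admission-controller-deployment.yaml \
           recommender-deployment.yaml \
           updater-deployment.yaml; do
    # -i.bak works with both GNU and BSD sed; the backup file is removed afterwards.
    sed -i.bak "s|us.gcr.io/k8s-artifacts-prod/autoscaling/|${registry}/|g" "$dir/$f" \
      && rm -f "$dir/$f.bak"
  done
}
```

For example, `rewrite_vpa_images deploy scofield`, assuming the manifests sit in a deploy directory under vertical-pod-autoscaler. Check the result with git diff before deploying.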

3. Deploy


[root@k8s-node001 vertical-pod-autoscaler]#  cd autoscaler/vertical-pod-autoscaler
[root@k8s-node001 vertical-pod-autoscaler]#  ./hack/vpa-up.sh
Warning: apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
customresourcedefinition.apiextensions.k8s.io/verticalpodautoscalers.autoscaling.k8s.io created
customresourcedefinition.apiextensions.k8s.io/verticalpodautoscalercheckpoints.autoscaling.k8s.io created
clusterrole.rbac.authorization.k8s.io/system:metrics-reader created
clusterrole.rbac.authorization.k8s.io/system:vpa-actor created
clusterrole.rbac.authorization.k8s.io/system:vpa-checkpoint-actor created
clusterrole.rbac.authorization.k8s.io/system:evictioner created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-reader created
clusterrolebinding.rbac.authorization.k8s.io/system:vpa-actor created
clusterrolebinding.rbac.authorization.k8s.io/system:vpa-checkpoint-actor created
clusterrole.rbac.authorization.k8s.io/system:vpa-target-reader created
clusterrolebinding.rbac.authorization.k8s.io/system:vpa-target-reader-binding created
clusterrolebinding.rbac.authorization.k8s.io/system:vpa-evictionter-binding created
serviceaccount/vpa-admission-controller created
clusterrole.rbac.authorization.k8s.io/system:vpa-admission-controller created
clusterrolebinding.rbac.authorization.k8s.io/system:vpa-admission-controller created
clusterrole.rbac.authorization.k8s.io/system:vpa-status-reader created
clusterrolebinding.rbac.authorization.k8s.io/system:vpa-status-reader-binding created
serviceaccount/vpa-updater created
deployment.apps/vpa-updater created
serviceaccount/vpa-recommender created
deployment.apps/vpa-recommender created
Generating certs for the VPA Admission Controller in /tmp/vpa-certs.
Generating RSA private key, 2048 bit long modulus (2 primes)
............................................................................+++++
.+++++
e is 65537 (0x010001)
Generating RSA private key, 2048 bit long modulus (2 primes)
............+++++
...........................................................................+++++
e is 65537 (0x010001)
Signature ok
subject=CN = vpa-webhook.kube-system.svc
Getting CA Private Key
Uploading certs to the cluster.
secret/vpa-tls-certs created
Deleting /tmp/vpa-certs.
deployment.apps/vpa-admission-controller created
service/vpa-webhook created

4. Check the result: metrics-server and the VPA components are all running normally


[root@k8s-node001 autoscaler-master]# kubectl  get po -n kube-system
NAME                                        READY   STATUS    RESTARTS   AGE
metrics-server-7947cb98b6-xw6b8             1/1     Running   0          46m
vpa-admission-controller-7d87559549-g77h9   1/1     Running   0          10m
vpa-recommender-84bf7fb9db-65669            1/1     Running   0          10m
vpa-updater-79cc46c7bb-5p889                1/1     Running   0          10m

Example 1: updateMode: "Off"

1. First, deploy an nginx service into namespace: vpa


apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
  namespace: vpa
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx
        name: nginx
        resources:
          requests:
            cpu: 100m
            memory: 250Mi
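
The manifest above targets the vpa namespace, and the article does not show how that namespace was created. A minimal sketch for creating it only when it is missing follows; the helper name ensure_namespace is hypothetical.

```shell
# Hypothetical helper: create the namespace only if it does not exist yet.
ensure_namespace() {
  ns="$1"
  if kubectl get namespace "$ns" >/dev/null 2>&1; then
    echo "namespace $ns already exists"
  else
    kubectl create namespace "$ns"
  fi
}
```

Running `ensure_namespace vpa` before applying the Deployment avoids a "namespace not found" error on a fresh cluster.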

Check the result: 2 Pods are running normally


[root@k8s-node001 examples]# kubectl  get po -n vpa
NAME                         READY   STATUS    RESTARTS   AGE
pod/nginx-6884b849f7-fswx5   1/1     Running   0          5m54s
pod/nginx-6884b849f7-wz6b8   1/1     Running   0          5m54s

2. To make load testing easier, create a NodePort Service


[root@k8s-node001 examples]# cat  nginx-vpa-ingress.yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx
  namespace: vpa
spec:
  type: NodePort
  ports:
  - port: 80
    targetPort: 80
  selector:
    app: nginx

[root@k8s-node001 examples]# kubectl  get svc -n vpa
NAME    TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
nginx   NodePort   10.97.250.131   <none>        80:32621/TCP   55s

[root@k8s-node001 examples]# curl -I   192.168.100.185:32621
HTTP/1.1 200 OK

3. Create the VPA
Start with updateMode: "Off". In this mode VPA only produces resource recommendations and does not update Pods.


[root@k8s-node001 examples]# cat   nginx-vpa-demo.yaml
apiVersion: autoscaling.k8s.io/v1beta2
kind: VerticalPodAutoscaler
metadata:
  name: nginx-vpa
  namespace: vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: nginx
  updatePolicy:
    updateMode: "Off"
  resourcePolicy:
    containerPolicies:
    - containerName: "nginx"
      minAllowed:
        cpu: "250m"
        memory: "100Mi"
      maxAllowed:
        cpu: "2000m"
        memory: "2048Mi"

4. Check the result


[root@k8s-node001 examples]# kubectl  get vpa -n vpa
NAME        AGE
nginx-vpa   2m34s

5. Use describe to view the VPA details, focusing on Container Recommendations


[root@k8s-node001 examples]# kubectl  describe  vpa nginx-vpa   -n vpa
Name:         nginx-vpa
Namespace:    vpa
... (output omitted) ...
  Update Policy:
    Update Mode:  Off
Status:
  Conditions:
    Last Transition Time:  2020-09-28T04:04:25Z
    Status:                True
    Type:                  RecommendationProvided
  Recommendation:
    Container Recommendations:
      Container Name:  nginx
      Lower Bound:
        Cpu:     250m
        Memory:  262144k
      Target:
        Cpu:     250m
        Memory:  262144k
      Uncapped Target:
        Cpu:     25m
        Memory:  262144k
      Upper Bound:
        Cpu:     803m
        Memory:  840190575
Events:          <none>

Where:


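Lower Bound: the lower limit of the recommendation
Target: the recommended value
Upper Bound: the upper limit of the recommendation
Uncapped Target: the recommendation that would apply if no minAllowed or maxAllowed bounds were set on the VPA

The output above shows a recommended CPU request of 250m for the Pod (the uncapped value of 25m is raised to the minAllowed of 250m) and a recommended memory request of 262144k, which equals 250Mi.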
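
The relationship between Uncapped Target and Target can be sketched as a simple clamp against minAllowed and maxAllowed. This is a simplified illustration, not the recommender's actual code. The function name clamp_millicores is hypothetical and all values are plain integers in millicores.

```shell
# Hypothetical helper: clamp an uncapped CPU recommendation (millicores)
# to the minAllowed/maxAllowed bounds configured on the VPA.
clamp_millicores() {
  value="$1"; min="$2"; max="$3"
  if [ "$value" -lt "$min" ]; then
    echo "$min"
  elif [ "$value" -gt "$max" ]; then
    echo "$max"
  else
    echo "$value"
  fi
}

clamp_millicores 25 250 2000   # uncapped 25m, minAllowed 250m -> prints 250
echo $((250 * 1024 * 1024))    # bytes in 250Mi   -> prints 262144000
echo $((262144 * 1000))        # bytes in 262144k -> prints 262144000
```

The last two lines show why the memory figure 262144k in the describe output equals 250Mi.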

6. Now load test nginx
Run the load test command:


[root@k8s-node001 examples]# ab -c 100 -n 10000000 http://192.168.100.185:32621/
This is ApacheBench, Version 2.3 <$Revision: 1843412 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 192.168.100.185 (be patient)
Completed 1000000 requests
Completed 2000000 requests
Completed 3000000 requests

7. After a few minutes, check how the VPA Recommendation has changed


[root@k8s-node001 ~]# kubectl  describe  vpa nginx-vpa   -n vpa |tail -n 20 
  Conditions:
    Last Transition Time:  2020-09-28T04:04:25Z
    Status:                True
    Type:                  RecommendationProvided
  Recommendation:
    Container Recommendations:
      Container Name:  nginx
      Lower Bound:
        Cpu:     250m
        Memory:  262144k
      Target:
        Cpu:     476m
        Memory:  262144k
      Uncapped Target:
        Cpu:     476m
        Memory:  262144k
      Upper Bound:
        Cpu:     2
        Memory:  387578728
Events:          <none>

The output shows that VPA now recommends Cpu: 476m for the Pod. Because updateMode is set to "Off" here, the Pods are not updated.
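
When only the recommended value is needed, for example in a script, it can be read straight from the VPA object instead of being picked out of the describe output. The sketch below assumes the recommendation sits under status.recommendation.containerRecommendations, matching the describe output above, and that the nginx container is the first entry. The function name vpa_target is hypothetical.

```shell
# Hypothetical helper: print the Target recommendation of the first container.
# $1 = VPA name, $2 = namespace, $3 = resource (cpu or memory)
vpa_target() {
  kubectl get vpa "$1" -n "$2" \
    -o jsonpath="{.status.recommendation.containerRecommendations[0].target.$3}"
}
```

For example, `vpa_target nginx-vpa vpa cpu` prints the current CPU target of the VPA created in this example.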

Example 2: updateMode: "Auto"

1. Now I set updateMode: "Auto" to see what VPA does
Here I change the resources to memory: 50Mi, cpu: 100m


[root@k8s-node001 examples]# kubectl  apply -f nginx-vpa.yaml
deployment.apps/nginx created
[root@k8s-node001 examples]# cat nginx-vpa.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
  namespace: vpa
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx
        name: nginx
        resources:
          requests:
            cpu: 100m
            memory: 50Mi

[root@k8s-node001 examples]# kubectl  get po  -n vpa
NAME                     READY   STATUS    RESTARTS   AGE
nginx-7ff65f974c-f4vgl   1/1     Running   0          114s
nginx-7ff65f974c-v9ccx   1/1     Running   0          114s

2. Deploy the VPA again. The VPA manifest nginx-vpa-demo.yaml only changes updateMode: "Auto" and name: nginx-vpa-2


[root@k8s-node001 examples]# cat  nginx-vpa-demo.yaml
apiVersion: autoscaling.k8s.io/v1beta2
kind: VerticalPodAutoscaler
metadata:
  name: nginx-vpa-2
  namespace: vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: nginx
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: "nginx"
      minAllowed:
        cpu: "250m"
        memory: "100Mi"
      maxAllowed:
        cpu: "2000m"
        memory: "2048Mi"

[root@k8s-node001 examples]# kubectl apply -f nginx-vpa-demo.yaml
verticalpodautoscaler.autoscaling.k8s.io/nginx-vpa created

[root@k8s-node001 examples]# kubectl  get vpa -n vpa
NAME        AGE
nginx-vpa-2   9s

3. Load test again


ab -c 1000 -n 100000000 http://192.168.100.185:32621/

4. After a few minutes, use describe to view the VPA details, again focusing on Container Recommendations


[root@k8s-node001 ~]# kubectl  describe  vpa nginx-vpa-2    -n vpa |tail -n 30
      Min Allowed:
        Cpu:     250m
        Memory:  100Mi
  Target Ref:
    API Version:  apps/v1
    Kind:         Deployment
    Name:         nginx
  Update Policy:
    Update Mode:  Auto
Status:
  Conditions:
    Last Transition Time:  2020-09-28T04:48:25Z
    Status:                True
    Type:                  RecommendationProvided
  Recommendation:
    Container Recommendations:
      Container Name:  nginx
      Lower Bound:
        Cpu:     250m
        Memory:  262144k
      Target:
        Cpu:     476m
        Memory:  262144k
      Uncapped Target:
        Cpu:     476m
        Memory:  262144k
      Upper Bound:
        Cpu:     2
        Memory:  262144k
Events:          <none>

Target is now Cpu: 476m, Memory: 262144k

5. Look at the events


[root@k8s-node001 ~]# kubectl  get event -n vpa
LAST SEEN   TYPE      REASON              OBJECT                        MESSAGE
33m         Normal    Pulling             pod/nginx-7ff65f974c-f4vgl    Pulling image "nginx"
33m         Normal    Pulled              pod/nginx-7ff65f974c-f4vgl    Successfully pulled image "nginx" in 15.880996269s
33m         Normal    Created             pod/nginx-7ff65f974c-f4vgl    Created container nginx
33m         Normal    Started             pod/nginx-7ff65f974c-f4vgl    Started container nginx
26m         Normal    EvictedByVPA        pod/nginx-7ff65f974c-f4vgl    Pod was evicted by VPA Updater to apply resource recommendation.
26m         Normal    Killing             pod/nginx-7ff65f974c-f4vgl    Stopping container nginx
35m         Normal    Scheduled           pod/nginx-7ff65f974c-hnzr5    Successfully assigned vpa/nginx-7ff65f974c-hnzr5 to k8s-node005
35m         Normal    Pulling             pod/nginx-7ff65f974c-hnzr5    Pulling image "nginx"
34m         Normal    Pulled              pod/nginx-7ff65f974c-hnzr5    Successfully pulled image "nginx" in 40.750855715s
34m         Normal    Scheduled           pod/nginx-7ff65f974c-v9ccx    Successfully assigned vpa/nginx-7ff65f974c-v9ccx to k8s-node004
33m         Normal    Pulling             pod/nginx-7ff65f974c-v9ccx    Pulling image "nginx"
33m         Normal    Pulled              pod/nginx-7ff65f974c-v9ccx    Successfully pulled image "nginx" in 15.495315629s
33m         Normal    Created             pod/nginx-7ff65f974c-v9ccx    Created container nginx
33m         Normal    Started             pod/nginx-7ff65f974c-v9ccx    Started container nginx

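The events show that VPA performed EvictedByVPA: it stopped nginx automatically, and a new nginx Pod was then started with the resources VPA recommended. Describing an nginx Pod confirms this: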


[root@k8s-node001 ~]# kubectl  describe po nginx-7ff65f974c-2m9zl -n vpa
Name:         nginx-7ff65f974c-2m9zl
Namespace:    vpa
Priority:     0
Node:         k8s-node004/192.168.100.184
Start Time:   Mon, 28 Sep 2020 00:46:19 -0400
Labels:       app=nginx
              pod-template-hash=7ff65f974c
Annotations:  cni.projectcalico.org/podIP: 100.67.191.53/32
              vpaObservedContainers: nginx
              vpaUpdates: Pod resources updated by nginx-vpa: container 0: cpu request, memory request
Status:       Running
IP:           100.67.191.53
IPs:
  IP:           100.67.191.53
Controlled By:  ReplicaSet/nginx-7ff65f974c
Containers:
  nginx:
    Container ID:   docker://c96bcd07f35409d47232a0bf862a76a56352bd84ef10a95de8b2e3f6681df43d
    Image:          nginx
    Image ID:       docker-pullable://nginx@sha256:c628b67d21744fce822d22fdcc0389f6bd763daac23a6b77147d0712ea7102d0
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Mon, 28 Sep 2020 00:46:38 -0400
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:        476m
      memory:     262144k

The key part is Requests: cpu: 476m, memory: 262144k
Compare that with the deployment manifest:

      requests:
        cpu: 100m
        memory: 50Mi

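Now you can see what VPA did. Of course, as the service's load changes, VPA's recommendations keep changing too. When the resources of the currently running Pods fall short of VPA's recommendation, VPA evicts the Pods and new ones are created with sufficient resources.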
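
The eviction decision can be modeled roughly as a range check against the recommendation. This is a simplified model and an assumption on my part: it treats a Pod as a candidate for eviction when its request lies outside the range between Lower Bound and Upper Bound, whereas the real updater applies further conditions. The function name outside_recommended_range is hypothetical.

```shell
# Simplified model (an assumption, not the updater's full logic): a Pod is a
# candidate for eviction when its request lies outside [Lower Bound, Upper Bound].
# All arguments are CPU values in millicores.
outside_recommended_range() {
  request="$1"; lower="$2"; upper="$3"
  [ "$request" -lt "$lower" ] || [ "$request" -gt "$upper" ]
}

# Values from Example 2: request 100m, Lower Bound 250m, Upper Bound 2 cores (2000m).
if outside_recommended_range 100 250 2000; then
  echo "candidate for eviction"
else
  echo "left running"
fi
```

With the values from Example 2 this prints "candidate for eviction", which matches the EvictedByVPA event seen above.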

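Limitations of VPA

VPA should not be used together with HPA (Horizontal Pod Autoscaler) on the same CPU or memory metrics.
Pods must be managed by a controller, for example a Deployment or StatefulSet.

Benefits of VPA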

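Pods use just the resources they need, so cluster nodes are used efficiently.
Pods are scheduled onto nodes that have enough available resources.
You don't have to run benchmarks to determine suitable CPU and memory requests.
VPA can adjust CPU and memory requests at any time without manual intervention, which reduces maintenance time.

Finally, VPA is a relatively new Kubernetes feature that has not yet been exercised at scale in production. I don't recommend the automatic update mode for production environments, but the recommendation mode helps you understand your services' resource usage better.
See the official documentation for more information.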

PS: Future articles will also be published on dev.kubeops.net

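Note: the images in this article come from the internet. If any of them infringes your rights, please contact me and I will remove it promptly.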