Scheduling strategy
Predicate (filtering) -> Priority (scoring) -> Select
Predicate strategies
- CheckNodeCondition: whether the node is in a healthy state
- GeneralPredicates
  - HostName: checks whether the Pod object defines pod.spec.hostname
  - PodFitsHostPorts: pods.spec.containers.ports.hostPort
  - MatchNodeSelector: pods.spec.nodeSelector
  - PodFitsResources: checks whether the node can satisfy the Pod's resource requests
- NoDiskConflict: checks whether the storage volumes the Pod depends on can be satisfied
- PodToleratesNodeTaints: checks whether the taints tolerated by the Pod's spec.tolerations fully cover the node's taints
- PodToleratesNodeNoExecuteTaints (disabled by default): whenever a node taint appears that the Pod cannot tolerate, the Pod is evicted
- CheckNodeLabelPresence (disabled by default): checks for the presence of labels
- CheckServiceAffinity (disabled by default): service affinity; places related Pods together
- MaxEBSVolumeCount
- MaxGCEPDVolumeCount
- MaxAzureDiskVolumeCount
- CheckVolumeBinding: checks whether PVCs are bound
- NoVolumeZoneConflict
- CheckNodeMemoryPressure: checks memory resources
- CheckNodePIDPressure: checks PID resources
- CheckNodeDiskPressure: checks disk resources
- MatchInterPodAffinity: pod affinity or anti-affinity
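The filtering phase above can be sketched in Python: a node remains a candidate only if every enabled predicate passes. This is an illustrative simplification, not the scheduler's actual code; the dict shapes and the two sample predicates are hypothetical stand-ins for PodFitsResources and MatchNodeSelector.

```python
def pod_fits_resources(pod, node):
    # PodFitsResources (simplified): the node's free resources must
    # cover the Pod's requests.
    return (pod["cpu_request"] <= node["cpu_free"]
            and pod["mem_request"] <= node["mem_free"])

def match_node_selector(pod, node):
    # MatchNodeSelector (simplified): node labels must include every
    # key/value pair from pod.spec.nodeSelector.
    return all(node["labels"].get(k) == v
               for k, v in pod.get("node_selector", {}).items())

# Hypothetical subset of the predicate chain.
PREDICATES = [pod_fits_resources, match_node_selector]

def filter_nodes(pod, nodes):
    # A node survives the filter phase only if every predicate returns True.
    return [n for n in nodes if all(p(pod, n) for p in PREDICATES)]
```

For example, a Pod requesting 1 CPU with nodeSelector disktype=ssd would only keep nodes that both have the label and the spare CPU.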
Priority functions
https://github.com/kubernetes/kubernetes/tree/master/pkg/scheduler/algorithm/priorities
The score of each function is summed per node; the node with the highest total wins.
- LeastRequested: least requested (most idle) resources
  (cpu((capacity-sum(requested))*10/capacity) + memory((capacity-sum(requested))*10/capacity)) / 2
- BalancedResourceAllocation: the node whose CPU and memory utilization rates are closest to each other wins
- NodePreferAvoidPods: based on the node annotation "scheduler.alpha.kubernetes.io/preferAvoidPods"
- TaintToleration: checks how the Pod's spec.tolerations list matches the node's taints list; the more matching entries, the lower the score
- SelectorSpreading: label selector spreading (spread Pods across more nodes)
- InterPodAffinity: the node with higher affinity wins
- NodeAffinity: node affinity scheduling
- MostRequested (disabled by default)
- NodeLabel (disabled by default): the node with more matching labels wins
- ImageLocality (disabled by default): scores by the total size of the already-present images that the Pod needs
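The LeastRequested formula and the sum-the-scores rule above can be worked through in a short sketch. The helper names and numbers are my own, not scheduler APIs:

```python
def least_requested_score(cpu_capacity, cpu_requested, mem_capacity, mem_requested):
    # LeastRequested, per the formula above:
    # (cpu((capacity-sum(requested))*10/capacity)
    #  + memory((capacity-sum(requested))*10/capacity)) / 2
    cpu_score = (cpu_capacity - cpu_requested) * 10 / cpu_capacity
    mem_score = (mem_capacity - mem_requested) * 10 / mem_capacity
    return (cpu_score + mem_score) / 2

def pick_node(total_score_per_node):
    # Each priority function contributes a score; totals are summed per
    # node and the highest total wins.
    return max(total_score_per_node, key=total_score_per_node.get)
```

With 2 of 4 CPUs and 2 of 8 memory units requested, the score is (5 + 7.5) / 2 = 6.25.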
Node selectors: nodeSelector, nodeName
Node affinity scheduling: nodeAffinity
kubectl explain pod.spec.nodeSelector
kubectl explain pod.spec.affinity
kubectl explain pod.spec.affinity.nodeAffinity
kubectl explain pod.spec.affinity.nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution.nodeSelectorTerms.matchFields
pod-daemon.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-demo
  namespace: default
  labels:
    app: myapp
    tier: frontend
  annotations:
    wuxing.com/created-by: "cluster admin"
spec:
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
  nodeSelector:
    disktype: ssd
Label a node
kubectl label nodes 10.0.0.12 disktype=harddisk --overwrite
kubectl explain pod.spec.affinity.nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution.nodeSelectorTerms.matchExpressions
KIND: Pod
VERSION: v1
RESOURCE: matchExpressions <[]Object>
DESCRIPTION:
A list of node selector requirements by node's labels.
A node selector requirement is a selector that contains values, a key, and
an operator that relates the key and values.
FIELDS:
key <string> -required-
The label key that the selector applies to.
operator <string> -required-
Represents a key's relationship to a set of values. Valid operators are In,
NotIn, Exists, DoesNotExist. Gt, and Lt.
values <[]string>
An array of string values. If the operator is In or NotIn, the values array
must be non-empty. If the operator is Exists or DoesNotExist, the values
array must be empty. If the operator is Gt or Lt, the values array must
have a single element, which will be interpreted as an integer. This array
is replaced during a strategic merge patch.
pod-nodeaffinity-demo.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-node-affinity-demo
  labels:
    app: myapp
    tier: frontend
  annotations:
    wuxing.com/created-by: "cluster admin"
spec:
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: zone
            operator: In
            values:
            - foo
            - bar
[root@k8s-master1 schedule]# kubectl explain pod.spec.affinity.nodeAffinity.preferredDuringSchedulingIgnoredDuringExecution
KIND: Pod
VERSION: v1
RESOURCE: preferredDuringSchedulingIgnoredDuringExecution <[]Object>
DESCRIPTION:
The scheduler will prefer to schedule pods to nodes that satisfy the
affinity expressions specified by this field, but it may choose a node that
violates one or more of the expressions. The node that is most preferred is
the one with the greatest sum of weights, i.e. for each node that meets all
of the scheduling requirements (resource request, requiredDuringScheduling
affinity expressions, etc.), compute a sum by iterating through the
elements of this field and adding "weight" to the sum if the node matches
the corresponding matchExpressions; the node(s) with the highest sum are
the most preferred.
An empty preferred scheduling term matches all objects with implicit weight
0 (i.e. it's a no-op). A null preferred scheduling term matches no objects
(i.e. is also a no-op).
FIELDS:
preference <Object> -required-
A node selector term, associated with the corresponding weight.
weight <integer> -required-
Weight associated with matching the corresponding nodeSelectorTerm, in the
range 1-100.
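The sum-of-weights rule described in this output can be sketched as follows. It is a simplified model (only the In operator is handled) with hypothetical helper names:

```python
def preferred_score(node_labels, preferred_terms):
    # For each node that already meets the hard requirements, add a
    # term's weight to the node's sum when the node satisfies every
    # matchExpression of that term; the highest sum is most preferred.
    total = 0
    for term in preferred_terms:
        exprs = term["preference"]["matchExpressions"]
        if all(node_labels.get(e["key"]) in e["values"] for e in exprs):
            total += term["weight"]
    return total
```

A node labeled zone=foo would score 60 against a single weight-60 preference for zone In (foo, bar); a node without the label scores 0 but can still be chosen.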
[root@k8s-master1 schedule]# kubectl explain pod.spec.affinity.nodeAffinity.preferredDuringSchedulingIgnoredDuringExecution.preference
KIND: Pod
VERSION: v1
RESOURCE: preference <Object>
DESCRIPTION:
A node selector term, associated with the corresponding weight.
A null or empty node selector term matches no objects. The requirements of
them are ANDed. The TopologySelectorTerm type implements a subset of the
NodeSelectorTerm.
FIELDS:
matchExpressions <[]Object>
A list of node selector requirements by node's labels.
matchFields <[]Object>
A list of node selector requirements by node's fields.
pod-nodeaffinity-demo-2.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-node-affinity-demo-2
  labels:
    app: myapp
    tier: frontend
  annotations:
    wuxing.com/created-by: "cluster admin"
spec:
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - preference:
          matchExpressions:
          - key: zone
            operator: In
            values:
            - foo
            - bar
        weight: 60
kubectl explain pods.spec.affinity.podAffinity
Run Pods on the same node
pod-required-affinity-demo.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-first
  labels:
    app: myapp
    tier: frontend
spec:
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-second
  labels:
    app: backend
    tier: db
spec:
  containers:
  - name: busybox
    image: busybox:latest
    imagePullPolicy: IfNotPresent
    command: ["sh","-c","sleep 3600"]
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - {key: app, operator: In, values: ["myapp"]}
        topologyKey: kubernetes.io/hostname
kubectl label nodes 10.0.0.12 zone=foo
kubectl label nodes 10.0.0.13 zone=foo
pod-required-anti-affinity-demo.yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-first
  labels:
    app: myapp
    tier: frontend
spec:
  containers:
  - name: myapp
    image: ikubernetes/myapp:v1
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-second
  labels:
    app: backend
    tier: db
spec:
  containers:
  - name: busybox
    image: busybox:latest
    imagePullPolicy: IfNotPresent
    command: ["sh","-c","sleep 3600"]
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - {key: app, operator: In, values: ["myapp"]}
        topologyKey: zone
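topologyKey decides which nodes count as "the same place": all nodes carrying the same value of that label form one topology domain. With both nodes labeled zone=foo, required anti-affinity on topologyKey zone leaves no domain free of Pods matching app=myapp, so pod-second stays Pending. A rough sketch with hypothetical helpers:

```python
def nodes_in_same_domain(nodes, topology_key, value):
    # All nodes sharing one value of the topology key form a single domain.
    return [n["name"] for n in nodes
            if n["labels"].get(topology_key) == value]

def anti_affinity_candidates(nodes, topology_key, occupied_values):
    # Required anti-affinity excludes every node in any domain that
    # already holds a Pod matching the labelSelector.
    return [n["name"] for n in nodes
            if n["labels"].get(topology_key) not in occupied_values]
```

With zone as the key and zone=foo already occupied, the candidate list is empty; with topologyKey kubernetes.io/hostname, only the one node hosting pod-first would be excluded.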
Taints
Applied to nodes; they repel Pods (reject any Pod that cannot tolerate the node's taints).
A taint's effect defines how Pods are repelled:
- NoSchedule: affects only the scheduling process; existing Pod objects are untouched
- NoExecute: affects both scheduling and existing Pod objects; Pods that cannot tolerate it are evicted
- PreferNoSchedule: prefer not to schedule onto this node, but allow it if necessary
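How a toleration matches a taint can be sketched roughly as below. This is a simplified model of Equal/Exists matching, not the scheduler's actual code, and it ignores the empty-key special case:

```python
def tolerates(toleration, taint):
    # A toleration matches a taint when the keys match, the effects match
    # (an empty toleration effect matches any effect), and the operator
    # check passes: "Exists" ignores the value, "Equal" compares values.
    if toleration["key"] != taint["key"]:
        return False
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    if toleration.get("operator", "Equal") == "Exists":
        return True
    return toleration.get("value") == taint["value"]

def pod_tolerates_node(tolerations, taints):
    # PodToleratesNodeTaints: every taint on the node must be matched by
    # at least one toleration on the Pod.
    return all(any(tolerates(tol, t) for tol in tolerations) for t in taints)
```

This mirrors the manifests below: Equal/production/NoSchedule tolerates exactly that taint, while Exists with an empty effect tolerates the key regardless of value or effect.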
kubectl get nodes 10.0.0.12 -o yaml
apiVersion: v1
kind: Node
metadata:
  annotations:
    node.alpha.kubernetes.io/ttl: "0"
    volumes.kubernetes.io/controller-managed-attach-detach: "true"
  creationTimestamp: "2019-05-31T04:01:08Z"
  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/os: linux
    disktype: harddisk
    kubernetes.io/hostname: 10.0.0.12
    zone: foo
  name: 10.0.0.12
  resourceVersion: "988627"
  selfLink: /api/v1/nodes/10.0.0.12
  uid: b873807a-8358-11e9-8680-000c29b4d624
spec: {}
status:
  addresses:
  - address: 10.0.0.12
    type: InternalIP
  - address: 10.0.0.12
    type: Hostname
  allocatable:
    cpu: "2"
    ephemeral-storage: "46362057447"
    hugepages-1Gi: "0"
    hugepages-2Mi: "0"
    memory: 3926032Ki
    pods: "110"
  capacity:
    cpu: "2"
    ephemeral-storage: 50306052Ki
    hugepages-1Gi: "0"
    hugepages-2Mi: "0"
    memory: 4028432Ki
    pods: "110"
  conditions:
  - lastHeartbeatTime: "2019-12-28T03:32:33Z"
    lastTransitionTime: "2019-06-02T06:09:48Z"
    message: kubelet has sufficient memory available
    reason: KubeletHasSufficientMemory
    status: "False"
    type: MemoryPressure
  - lastHeartbeatTime: "2019-12-28T03:32:33Z"
    lastTransitionTime: "2019-06-02T06:09:48Z"
    message: kubelet has no disk pressure
    reason: KubeletHasNoDiskPressure
    status: "False"
    type: DiskPressure
  - lastHeartbeatTime: "2019-12-28T03:32:33Z"
    lastTransitionTime: "2019-06-02T06:09:48Z"
    message: kubelet has sufficient PID available
    reason: KubeletHasSufficientPID
    status: "False"
    type: PIDPressure
  - lastHeartbeatTime: "2019-12-28T03:32:33Z"
    lastTransitionTime: "2019-06-02T06:09:48Z"
    message: kubelet is posting ready status
    reason: KubeletReady
    status: "True"
    type: Ready
  - lastHeartbeatTime: "2019-05-31T04:01:08Z"
    lastTransitionTime: "2019-05-31T04:34:48Z"
    message: Kubelet never posted node status.
    reason: NodeStatusNeverUpdated
    status: Unknown
    type: OutOfDisk
  daemonEndpoints:
    kubeletEndpoint:
      Port: 10250
  images:
  - names:
    - 10.0.0.11:5000/wuxingge/nginx-ingress-controller@sha256:a944178876522b97abc7206cb3b2231f4893dc05ffd0e95bd65ea5aeabcc7060
    - registry.cn-hangzhou.aliyuncs.com/wuxingge/nginx-ingress-controller@sha256:a944178876522b97abc7206cb3b2231f4893dc05ffd0e95bd65ea5aeabcc7060
    - 10.0.0.11:5000/wuxingge/nginx-ingress-controller:0.21.0
    - registry.cn-hangzhou.aliyuncs.com/wuxingge/nginx-ingress-controller:0.21.0
    sizeBytes: 568075227
  - names:
    - tomcat@sha256:996d406c509a4ebe2f4e96eeda331a354f1663b7ec0ff06685b75c4decef7325
    - tomcat:8.5.49-jdk8-openjdk
    sizeBytes: 506825090
  - names:
    - wordpress@sha256:3f0f305010ba73b4c186597e6a16a4fbcec6f1f653bd52132fbc7d2da7e93592
    - registry.cn-hangzhou.aliyuncs.com/wuxingge/wordpress@sha256:e164642b86898cc9b6f24a2ed805d1c133b1cc40e078805273221b0e73a3cdc2
    - wordpress:latest
    - registry.cn-hangzhou.aliyuncs.com/wuxingge/wordpress:latest
    sizeBytes: 446902681
  - names:
    - wordpress@sha256:6216f64ab88fc51d311e38c7f69ca3f9aaba621492b4f1fa93ddf63093768845
    - registry.cn-hangzhou.aliyuncs.com/wuxingge/wordpress@sha256:b40c224a95cd51d5af4d23a87bd2971805b858a35e03a24797f75b25b473d0a7
    - wordpress:4.8-apache
    - registry.cn-hangzhou.aliyuncs.com/wuxingge/wordpress:4.8-apache
    sizeBytes: 408405324
  - names:
    - 10.0.0.11:5000/mysql@sha256:78ff5aa8e65b9e99775f24044591533ae5c35a1ae0264e73747b56215f0489ef
    - mysql@sha256:196fe3e00d68b2417a8cf13482bdab1fcc2b32cf7c7575d0906c700688b352b4
    - registry.cn-hangzhou.aliyuncs.com/wuxingge/mysql@sha256:78ff5aa8e65b9e99775f24044591533ae5c35a1ae0264e73747b56215f0489ef
    - 10.0.0.11:5000/mysql:5.7
    - mysql:5.7
    sizeBytes: 372901903
  - names:
    - calico/node@sha256:0a16ddf391c06e065c5b4db75069da9e153f9fc9dd45f92ff64a55616e0bfe26
    - calico/node:v3.10.2
    sizeBytes: 192217400
  - names:
    - calico/cni@sha256:7e7a7ecdb6c14342cc7e1dd231df7f261419dee79c012f031c8c66521b801714
    - calico/cni:v3.10.2
    sizeBytes: 163310513
  - names:
    - registry.cn-beijing.aliyuncs.com/minminmsn/kubernetes-dashboard@sha256:600ca1701185dd32ca9f938448f92eb277cf418e38f1bbe6eced5eebf7d55f44
    - registry.cn-hangzhou.aliyuncs.com/wuxingge/kubernetes-dashboard@sha256:600ca1701185dd32ca9f938448f92eb277cf418e38f1bbe6eced5eebf7d55f44
    - registry.cn-beijing.aliyuncs.com/minminmsn/kubernetes-dashboard:v1.10.1
    - registry.cn-hangzhou.aliyuncs.com/wuxingge/kubernetes-dashboard:v1.10.1
    sizeBytes: 121711221
  - names:
    - 10.0.0.11:5000/wuxingge/nginx@sha256:e770165fef9e36b990882a4083d8ccf5e29e469a8609bb6b2e3b47d9510e2c8d
    - 10.0.0.11:5000/wuxingge/nginx:latest
    sizeBytes: 109331233
  - names:
    - tomcat@sha256:5c4d935028f24be36f5d9fa7ca4489aa1993ee51b08cd41d96ad99c79a25354d
    - tomcat:8.0.50-jre8-alpine
    sizeBytes: 105997684
  - names:
    - quay.io/coreos/flannel@sha256:3fa662e491a5e797c789afbd6d5694bdd186111beb7b5c9d66655448a7d3ae37
    - quay.io/coreos/flannel@sha256:7806805c93b20a168d0bbbd25c6a213f00ac58a511c47e8fa6409543528a204e
    - quay.io/coreos/flannel:v0.11.0
    - quay.io/coreos/flannel:v0.11.0-amd64
    sizeBytes: 52567296
  - names:
    - myhub.fdccloud.com/library/kubedns-amd64@sha256:1b21c69cd89b9bb47879ef94f03be2b0db194c7c04af4faa781cdd47474b88ec
    - registry.cn-hangzhou.aliyuncs.com/wuxingge/kubedns-amd64@sha256:df5392e5c76d8519301d1d2ee582453fd9185572bcd44dc0da466b9ab220c985
    - myhub.fdccloud.com/library/kubedns-amd64:1.9
    - registry.cn-hangzhou.aliyuncs.com/wuxingge/kubedns-amd64:1.9
    sizeBytes: 46998769
  - names:
    - coredns/coredns@sha256:e83beb5e43f8513fa735e77ffc5859640baea30a882a11cc75c4c3244a737d3c
    - registry.cn-hangzhou.aliyuncs.com/wuxingge/coredns@sha256:c0cd10c02849eace57ecbf5b724dc0c0cc7299cbde7630c5a7d5cc01c05ff38f
    - coredns/coredns:1.5.0
    - registry.cn-hangzhou.aliyuncs.com/wuxingge/coredns:1.5.0
    sizeBytes: 42488424
  - names:
    - infoblox/dnstools@sha256:76e9d6514bbbe64b8a679218568a50ec5aef599c587980f7e30e4f9efd80ebe6
    - registry.cn-hangzhou.aliyuncs.com/wuxingge/dnstools@sha256:76e9d6514bbbe64b8a679218568a50ec5aef599c587980f7e30e4f9efd80ebe6
    - infoblox/dnstools:latest
    - registry.cn-hangzhou.aliyuncs.com/wuxingge/dnstools:latest
    sizeBytes: 15754657
  - names:
    - ikubernetes/myapp@sha256:9c3dc30b5219788b2b8a4b065f548b922a34479577befb54b03330999d30d513
    - ikubernetes/myapp:v1
    sizeBytes: 15504557
  - names:
    - ikubernetes/myapp@sha256:85a2b81a62f09a414ea33b74fb8aa686ed9b168294b26b4c819df0be0712d358
    - registry.cn-hangzhou.aliyuncs.com/wuxingge/myapp@sha256:5f4afc8302ade316fc47c99ee1d41f8ba94dbe7e3e7747dd87215a15429b9102
    - ikubernetes/myapp:v2
    - registry.cn-hangzhou.aliyuncs.com/wuxingge/myapp:v2
    sizeBytes: 15504059
  - names:
    - myhub.fdccloud.com/library/dnsmasq-metrics-amd64@sha256:a1bfd78d01254c75cc14ecfad6568a9d76425506a5bc17c1a39c906c219c2f13
    - registry.cn-hangzhou.aliyuncs.com/wuxingge/dnsmasq-metrics-amd64@sha256:4767af0aee3355cdac7abfe0d7ac1432492f088e7fcf07128d38cd2f23268638
    - myhub.fdccloud.com/library/dnsmasq-metrics-amd64:1.0
    - registry.cn-hangzhou.aliyuncs.com/wuxingge/dnsmasq-metrics-amd64:1.0
    sizeBytes: 14002508
  - names:
    - calico/pod2daemon-flexvol@sha256:c99e3e20083902d79fb3aeb5c7cb634c9b32640ca00d38da452e858a7717a80d
    - calico/pod2daemon-flexvol:v3.10.2
    sizeBytes: 9780495
  - names:
    - myhub.fdccloud.com/library/exechealthz-amd64@sha256:b54c3595f9a8b38c8e8b84ce1721c969fdc2b1a1fb3c51ee18d4ecbf692a0cdb
    - registry.cn-hangzhou.aliyuncs.com/wuxingge/exechealthz-amd64@sha256:34722333f0cd0b891b61c9e0efa31913f22157e341a3aabb79967305d4e78260
    - myhub.fdccloud.com/library/exechealthz-amd64:1.2
    - registry.cn-hangzhou.aliyuncs.com/wuxingge/exechealthz-amd64:1.2
    sizeBytes: 8374840
  - names:
    - myhub.fdccloud.com/library/kube-dnsmasq-amd64@sha256:d68dc5377bbd81322dfdddc593bc45e3a2e042b93b5e2066bfb179747da74f1e
    - registry.cn-hangzhou.aliyuncs.com/wuxingge/kube-dnsmasq-amd64@sha256:f7590551a628ec30ab47f188040f51c61e53e969a19ef44846754ba376b4ce21
    - myhub.fdccloud.com/library/kube-dnsmasq-amd64:1.4
    - registry.cn-hangzhou.aliyuncs.com/wuxingge/kube-dnsmasq-amd64:1.4
    sizeBytes: 5129740
  - names:
    - busybox@sha256:1303dbf110c57f3edf68d9f5a16c082ec06c4cf7604831669faf2c712260b5a0
    - busybox:latest
    sizeBytes: 1219790
  - names:
    - busybox@sha256:fe301db49df08c384001ed752dff6d52b4305a73a7f608f21528048e8a08b51e
    sizeBytes: 1219782
  - names:
    - busybox@sha256:4b6ad3a68d34da29bf7c8ccb5d355ba8b4babcad1f99798204e7abb43e54ee3d
    sizeBytes: 1199418
  - names:
    - busybox@sha256:bbc3a03235220b170ba48a157dd097dd1379299370e1ed99ce976df0355d24f0
    - busybox:1.27
    sizeBytes: 1129289
  - names:
    - myhub.fdccloud.com/library/busybox@sha256:d5f1852918ae605ca9d6e93a7f4415cf045a584ea0cba42e79dce09f24e6ad0f
    - registry.cn-hangzhou.aliyuncs.com/wuxingge/busybox@sha256:3815f18c767695a15c95545b90b95cef0c7444868b25040176ef9ed13d4cdc6b
    - myhub.fdccloud.com/library/busybox:latest
    - registry.cn-hangzhou.aliyuncs.com/wuxingge/busybox:latest
    sizeBytes: 1092588
  - names:
    - 10.0.0.11:5000/wuxingge/pause-amd64@sha256:f04288efc7e65a84be74d4fc63e235ac3c6c603cf832e442e0bd3f240b10a91b
    - registry.cn-hangzhou.aliyuncs.com/wuxingge/pause-amd64@sha256:f04288efc7e65a84be74d4fc63e235ac3c6c603cf832e442e0bd3f240b10a91b
    - 10.0.0.11:5000/wuxingge/pause-amd64:3.0
    - registry.cn-hangzhou.aliyuncs.com/wuxingge/pause-amd64:3.0
    sizeBytes: 746888
  nodeInfo:
    architecture: amd64
    bootID: 19f00bda-3d82-432b-ad8f-0e2aca3aaaa2
    containerRuntimeVersion: docker://18.9.6
    kernelVersion: 3.10.0-862.el7.x86_64
    kubeProxyVersion: v1.13.1
    kubeletVersion: v1.13.1
    machineID: 460b85a3efe84a359f8da49aa3262a63
    operatingSystem: linux
    osImage: CentOS Linux 7 (Core)
    systemUUID: D2514D56-4929-7C06-E1F8-51EC98D5EFAE
Set a taint
kubectl taint node 10.0.0.12 node-type=production:NoSchedule
Remove a taint
kubectl taint node 10.0.0.12 node-type-
View node info
kubectl describe nodes 10.0.0.12
Adding a new NoExecute taint evicts Pods that do not tolerate it
kubectl taint node 10.0.0.13 node-type=dev:NoExecute
Pods tolerating taints (tolerations)
deploy-demo.yaml (NoExecute)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-deploy
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
      release: canary
  template:
    metadata:
      labels:
        app: myapp
        release: canary
    spec:
      containers:
      - name: myapp
        image: ikubernetes/myapp:v1
        ports:
        - name: http
          containerPort: 80
      tolerations:
      - key: "node-type"
        operator: "Equal"
        value: "production"
        effect: "NoExecute"
        tolerationSeconds: 3600
deploy-demo.yaml (NoSchedule)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-deploy
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
      release: canary
  template:
    metadata:
      labels:
        app: myapp
        release: canary
    spec:
      containers:
      - name: myapp
        image: ikubernetes/myapp:v1
        ports:
        - name: http
          containerPort: 80
      tolerations:
      - key: "node-type"
        operator: "Equal"
        value: "production"
        effect: "NoSchedule"
deploy-demo.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-deploy
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
      release: canary
  template:
    metadata:
      labels:
        app: myapp
        release: canary
    spec:
      containers:
      - name: myapp
        image: ikubernetes/myapp:v1
        ports:
        - name: http
          containerPort: 80
      tolerations:
      - key: "node-type"
        operator: "Exists"
        value: ""
        effect: "NoSchedule"
deploy-demo.yaml (all taints with this key are tolerated, regardless of value or effect)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-deploy
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
      release: canary
  template:
    metadata:
      labels:
        app: myapp
        release: canary
    spec:
      containers:
      - name: myapp
        image: ikubernetes/myapp:v1
        ports:
        - name: http
          containerPort: 80
      tolerations:
      - key: "node-type"
        operator: "Exists"
        value: ""
        effect: ""