Kubernetes Pod status and lifecycle management

 
First, what is a Pod?
Second, how does a Pod manage multiple containers?
Third, uses of Pods
Fourth, persistence and termination of Pods
Fifth, the pause container
Sixth, init containers
Seventh, the Pod lifecycle
(1) Pod phase
(2) Pod creation process
(3) Pod conditions
(4) Pod liveness probes
(5) Usage scenarios for livenessProbe and readinessProbe
(6) Pod restart policy
(7) Pod lifetime
(8) livenessProbe explained
(9) Resource requests and resource limits
 
 
 
First, what is a Pod?
A Pod is the smallest and simplest unit that you can create and deploy in Kubernetes. A Pod represents a process running in the cluster.
 
A Pod encapsulates an application container (or, in some cases, several containers), storage resources, a unique network IP, and the options that govern how the container(s) should run. A Pod represents a unit of deployment: a single instance of an application in Kubernetes, which may consist of one container or of several containers that are tightly coupled and share resources.
 
Pods in a Kubernetes cluster are used in two main ways:
One container per Pod. The "one container per Pod" model is the most common usage; in this case, you can think of the Pod as a wrapper around a single container, and Kubernetes manages Pods rather than managing the containers directly.
Multiple containers running together in one Pod. A Pod can encapsulate several tightly coupled containers that need to interoperate and share resources. These containers cooperate to form a single unit of service - for example, one container serving files from a shared volume while a separate "sidecar" container updates those files. The Pod wraps these containers and storage resources together as one manageable entity.
 
The containers in a Pod share an environment, including Linux namespaces, cgroups, and possibly other isolation boundaries - the same facilities used by Docker containers. Within the Pod's environment, each container may have its own further sub-isolation.
Containers in a Pod share an IP address and port space and can reach each other via localhost; traffic between them travels over the Pod's lo interface. They can also use standard inter-process communication, such as SystemV semaphores or POSIX shared memory. Containers in different Pods have distinct IP addresses and cannot communicate with each other via IPC directly.
 
Containers in a Pod also share access to volumes. A volume is defined as part of the Pod and can be mounted into each application container's filesystem.
Like individual application containers, Pods are considered relatively ephemeral entities. During its lifecycle, a Pod is created, assigned a unique ID (UID), and scheduled onto a node, where it remains in the agreed desired state until termination (according to its restart policy) or deletion. If the node dies, the Pods assigned to it are, after a timeout, scheduled for deletion. A given Pod (as identified by its UID) is never "rescheduled" to a new node; instead, it is replaced by an identical Pod, possibly even with the same name, but with a new UID (see the replication controller documentation for details).
 
Second, how does a Pod manage multiple containers?
A Pod can run multiple processes (as containers) that work together. The containers in the same Pod are automatically assigned to the same node. They share resources and depend on one another's network environment, and they are always scheduled at the same time.
 
Note that running multiple containers in a single Pod is a relatively advanced use case. You should consider this pattern only when your containers are tightly coupled. For example, you might have one container acting as a web server for files in a shared volume, and a separate "sidecar" container that fetches updates for those files from a remote source.
 
Pods share two kinds of resources: networking and storage.
Networking:
  Each Pod is assigned a unique IP address. All containers in the Pod share its network namespace, including the IP address and port space, and can communicate with each other via localhost. When the containers in a Pod communicate with the outside world, they must coordinate their use of the shared network resources (for example, host port mappings).
 
Storage:
  A Pod can specify a set of shared volumes. All containers in the Pod can access these shared volumes. Volumes can also be used to persist data in a Pod, so that files survive container restarts.
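As a minimal sketch of volume sharing (the names and images here are illustrative assumptions, not from the original), a Pod in which a sidecar writes files that a web server serves might look like:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-volume-pod
spec:
  containers:
  - name: web
    image: nginx
    volumeMounts:
    - name: html                          # serves what the sidecar writes
      mountPath: /usr/share/nginx/html
  - name: sidecar
    image: busybox
    command: ["sh", "-c", "while true; do date > /work/index.html; sleep 5; done"]
    volumeMounts:
    - name: html                          # same volume, different mount path
      mountPath: /work
  volumes:
  - name: html
    emptyDir: {}                          # lives exactly as long as the Pod does
```

Both containers see the same files even though they mount the volume at different paths.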
 
 
Third, uses of Pods
Pods usually fall into two categories:

Standalone Pods: this kind of Pod cannot heal itself. When a Pod is created (whether directly by you or by a Controller), Kubernetes schedules it onto a Node in the cluster. The Pod remains on that Node until its process terminates, the Pod object is deleted, the Pod is evicted for lack of resources, or the Node fails. Pods do not self-heal: if the Node the Pod is running on fails, or if the scheduling operation itself fails, the Pod is deleted. Likewise, if the Pod's Node lacks resources or enters maintenance, the Pod is evicted.
Controller-managed Pods: Kubernetes uses a higher-level abstraction called a Controller to manage Pod instances. A Controller can create and manage multiple Pods, providing replica management, rolling upgrades, and cluster-level self-healing. For example, if a Node fails, the Controller automatically reschedules the Pods from that node onto other healthy Nodes. Although you can use Pods directly, in Kubernetes you usually use a Controller to manage them.
Each Pod has a special container called the "root container": the pause container. The pause container's image is part of the Kubernetes platform; besides the pause container, each Pod also contains one or more closely related application containers.
 
Why does Kubernetes give the Pod this special composition and structure?
Reason one: with a group of containers treated as a unit, it is difficult to judge the state of the whole simply and effectively. For example, if one container dies, should the whole unit be considered down? By introducing a business-independent pause container as the Pod's root container, whose state represents the state of the entire group of containers, this problem is solved simply.
Reason two: the application containers in a Pod share the pause container's IP address and the volumes mounted by the pause container, which both simplifies communication between the application containers and solves file sharing between them.
 
 
Fourth, persistence and termination of Pods
(1) Pod persistence
Pods are not designed to be durable entities. A Pod can disappear on scheduling failure, node failure, eviction due to lack of resources, node maintenance, or node death.
In general, users should not create Pods directly; they should use a Controller (for example, a Deployment), even in the case of a single Pod. Controllers provide cluster-level self-healing, replication, and rollout management.
 
(2) Pod termination
Because Pods represent processes running on nodes of the cluster, it is important to let those processes terminate gracefully when they are no longer needed (rather than violently killing them with a KILL signal). Users should be able to issue a delete request, know when the processes will terminate, and be confident that deletion actually completes. When a user requests deletion of a Pod, the system records a grace period before the Pod may be forcibly killed, and a TERM signal is sent to the main process of each container. Once the grace period expires, a KILL signal is sent to any remaining processes, and the Pod is then removed from the API server. If the kubelet or the container manager restarts while waiting for the processes to terminate, the termination is retried with the full grace period after the restart.
 
An example flow:

1. The user sends a command to delete the Pod, with the default grace period of 30 seconds.
2. Once the grace period has expired, the API server updates the Pod's status to "dead".
3. The Pod is shown with status "Terminating" when listed on the client command line.
4. At the same time as step 3, when the kubelet sees that the Pod is marked "Terminating", it begins the Pod shutdown process:
   If a preStop hook is defined in the Pod, it is invoked inside the Pod before shutdown; if the preStop hook is still running when the grace period expires, step 2 is repeated with a small (2 second) extension of the grace period.
   The TERM signal is sent to the processes in the Pod.
5. At the same time as step 3, the Pod is removed from the endpoints lists of its Services and is no longer considered part of a replication controller's running set. Pods that shut down slowly cannot continue to serve traffic, since load balancers remove them from their rotations.
6. When the grace period expires, any processes still running in the Pod are killed with SIGKILL.
7. The kubelet finishes deleting the Pod on the API server by setting its grace period to 0 (immediate deletion). The Pod disappears from the API and is no longer visible from the client.
The default grace period for deletion is 30 seconds. The kubectl delete command supports the --grace-period=<seconds> option, which lets users set their own grace period. Setting it to 0 force-deletes the Pod. In kubectl version 1.5 and later, you must use --force together with --grace-period=0 to force-delete a Pod.
 
Force-deleting a Pod is defined as deleting it immediately from both the cluster state and etcd. When a force deletion is issued, the API server does not wait for confirmation from the kubelet on the node where the Pod is running; it removes the Pod from the API server immediately, so a new Pod with the same name can be created right away. On the node, the Pod is still set to the Terminating state immediately, but it is given a short grace period before being forcibly killed.
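Assuming a reachable cluster and a Pod named nginx-pod (both hypothetical), the deletion variants described above would look like:

```shell
[root@k8s-master ~]# kubectl delete pod nginx-pod                            # graceful, default 30s grace period
[root@k8s-master ~]# kubectl delete pod nginx-pod --grace-period=60          # custom grace period
[root@k8s-master ~]# kubectl delete pod nginx-pod --force --grace-period=0   # force delete (kubectl >= 1.5)
```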
 
 
Fifth, the pause container
The pause container is also known as the infra container. If we inspect a cluster node, we will find that every node runs a number of pause containers.
In Kubernetes, the pause container mainly provides the following functions for the application containers of each Pod:

Serving as the basis for sharing Linux namespaces within the Pod;
Enabling the PID namespace and acting as the init process (PID 1) for the Pod.
 
Walkthrough
Suppose we map the pause container's port 80 to host port 8880. After the pause container sets up the network namespace on the host, an nginx container is added to that namespace: when the nginx container starts, --net=container:pause is specified. A ghost container is added to the same network namespace in the same way, so all three containers share one network and can communicate with each other directly via localhost. Likewise, --ipc=container:pause and --pid=container:pause put the three containers into the same IPC and PID namespaces, with the pause process as the init process. We can then enter the ghost container and look at the processes.

Inside the ghost container we can see the processes of both the pause and nginx containers, and PID 1 belongs to the pause container. (In Kubernetes itself, PID 1 inside each container is the container's own business process.)
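The docker commands for this walkthrough were not preserved in this copy; a hedged reconstruction (the image tags and container names are assumptions) might be:

```shell
# start the pause container and publish its port 80 as host port 8880
docker run -d --name pause -p 8880:80 gcr.io/google_containers/pause-amd64:3.0
# nginx joins the pause container's network, IPC and PID namespaces
docker run -d --name nginx --net=container:pause --ipc=container:pause --pid=container:pause nginx
# ghost joins the same namespaces, so all three share localhost and one process tree
docker run -d --name ghost --net=container:pause --ipc=container:pause --pid=container:pause ghost
```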
 
 
Sixth, init containers
A Pod can contain multiple containers running applications, but it can also contain one or more init containers, which run before the application containers start.
Init containers are much like ordinary containers, with two differences:
Init containers always run to completion.
Each init container must complete successfully before the next one starts.
If a Pod's init container fails, Kubernetes restarts the Pod repeatedly until the init container succeeds. However, if the Pod's restartPolicy is Never, it is not restarted.
 
As the name suggests, init (initialization) containers start first when the Pod starts. There can be one or more of them; if there are several, they are executed serially in the order defined, and only after all init containers have finished do the main containers start. Because the containers in a Pod share storage volumes, data produced by an init container can be used by the main containers. Init containers can be used with Deployments, DaemonSets, StatefulSets (formerly PetSets), Jobs, and so on; in every case they execute before the main containers start, doing initialization work on various Kubernetes resources when the Pod starts.
Scenarios:

Scenario one: waiting for another component to become ready. For example, suppose an application consists of two container services: a web server and a database, where the web server needs to access the database. When we start this application there is no guarantee that the database service starts first, so for some time the web server may report database connection errors. To solve this, the Pod running the web server can use an init container that checks whether the database is ready; the init container exits only once the database can be connected to, and then the web server container is started and makes its formal connection requests to the database.
Scenario two: initial configuration - for example, detecting that all the cluster's member nodes already exist and preparing the cluster configuration data for the main container, so that the main container can use this information to configure itself once it starts.
Other scenarios: registering the Pod with a central database, downloading application dependencies, and so on.
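Scenario one could be sketched as follows (the Service name mysql, the port, and the images are assumptions):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: webserver
spec:
  initContainers:
  - name: wait-for-db
    image: busybox
    # poll the database Service; exit 0 only once it accepts connections
    command: ["sh", "-c", "until nc -z mysql 3306; do echo waiting for db; sleep 2; done"]
  containers:
  - name: web
    image: nginx    # starts only after wait-for-db completes successfully
```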
 
 
Seventh, the Pod lifecycle
(1) Pod phase
A Pod's status information is stored in a PodStatus object, which has a phase field.
The phase of a Pod is a simple, high-level summary of where the Pod is in its lifecycle. It is not intended as a comprehensive rollup of container or Pod state, nor as a comprehensive state machine.
The number and meanings of Pod phases are strictly specified. Other than the values documented here, nothing should assume that a Pod has any other phase value.
 
The possible values of phase are:
Pending: the Pod object has been created by the API server and stored in etcd, but it has not yet been scheduled, or it is still in the process of pulling images from the registry.
Running: the Pod has been scheduled onto a node, and all of its containers have been created by the kubelet.
Succeeded: all containers in the Pod have terminated successfully and will not be restarted.
Failed: all containers in the Pod have terminated, and at least one container terminated in failure - that is, it exited with a non-zero status or was terminated by the system.
Unknown: the state of the Pod could not be obtained for some reason, usually because of a communication failure with the Pod's host.
 
(2) Pod creation process
The Pod is Kubernetes's basic unit, and understanding its creation process also helps in understanding how the system operates.
① The user submits a Pod spec to the API server via kubectl or another API client.
② The API server writes the Pod object's information to etcd and waits for the write to complete, then returns an acknowledgment to the client.
③ The API server begins to reflect the state change from etcd.
④ All other Kubernetes components use the "watch" mechanism to track relevant changes on the API server.
⑤ The kube-scheduler (scheduler), via its "watcher", notices that the API server has a new Pod object that is not yet bound to any worker node.
⑥ The kube-scheduler picks a worker node for the Pod object and updates the result to the API server.
⑦ The scheduling result is written to etcd by the API server, which also begins to reflect the scheduling result on the Pod object.
⑧ The kubelet on the target node sees the Pod scheduled to it, calls the container engine (for example, Docker) to start the containers on the current node, and reports the container states back to the API server.
⑨ The API server stores the Pod's state information in etcd.
⑩ After the etcd write completes and is confirmed, the API server sends an acknowledgment to the relevant kubelet.
 
(3) Pod conditions
A Pod has a PodStatus object containing an array of PodCondition entries. Each element of the PodCondition array has a type field and a status field. The type field is a string whose possible values are PodScheduled, Ready, Initialized, and Unschedulable. The status field is a string whose possible values are True, False, and Unknown.
 
(4) Pod liveness probes
There are several points in the Pod lifecycle where we can hook in. Before the main containers start, init containers can run first; there can be several of them, executed serially, and the main containers launch only after they all finish. Just after the main program starts, a postStart hook can perform some operations, and before the main program ends, a preStop hook can perform some operations. Once the program is running, two kinds of probing can be performed: the liveness probe and the readiness probe.
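A hedged sketch of the postStart and preStop hooks mentioned here (the commands are illustrative assumptions):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: lifecycle-demo
spec:
  containers:
  - name: app
    image: nginx
    lifecycle:
      postStart:
        exec:
          command: ["sh", "-c", "echo started > /tmp/started"]   # runs right after the container starts
      preStop:
        exec:
          command: ["nginx", "-s", "quit"]                       # graceful shutdown before TERM is sent
```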
 
The kubelet performs probes on containers periodically as diagnostics. To perform a diagnostic, the kubelet calls a handler implemented by the container. There are three handler types for probing liveness:
ExecAction: executes a specified command inside the container. The diagnostic succeeds if the command exits with a return code of 0.
TCPSocketAction: performs a TCP check against the container's IP address on a specified port. The diagnostic succeeds if the port is open.
HTTPGetAction: performs an HTTP GET request against the container's IP address on a specified port and path. The diagnostic succeeds if the response status code is greater than or equal to 200 and less than 400.
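For the tcpSocket case, such a resource manifest might be written like this (a sketch; the original listing was not preserved, so the names are assumptions):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: liveness-tcp
spec:
  containers:
  - name: web
    image: nginx
    livenessProbe:
      tcpSocket:
        port: 80              # kubelet attempts a TCP connection to this port
      initialDelaySeconds: 5
      periodSeconds: 10
```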
 
Such a resource manifest initiates a connection request to port 80/tcp at the Pod's IP address, and the Pod's liveness is determined by whether the TCP connection can be established.
Each probe yields one of three results:
Success: the container passed the diagnostic.
Failure: the container failed the diagnostic.
Unknown: the diagnostic itself failed, so no action is taken.
The kubelet can optionally run either of two kinds of probe on a container and react to the result:

livenessProbe: indicates whether the container is running. If the liveness probe fails, the kubelet kills the container, and the container then becomes subject to its restart policy. If a container does not provide a liveness probe, the default state is Success.
readinessProbe: indicates whether the container is ready to serve requests. If the readiness probe fails, the endpoints controller removes the Pod's IP address from the endpoints of all Services that match the Pod. The default readiness state before the initial delay is Failure. If a container does not provide a readiness probe, the default state is Success.
 
(5) Usage scenarios for livenessProbe and readinessProbe
If the process in your container is able to crash on its own whenever it encounters a problem or becomes unhealthy, you do not necessarily need a liveness probe; the kubelet will automatically do the right thing according to the Pod's restartPolicy.
If you want the container to be killed and restarted when a probe fails, specify a liveness probe and set restartPolicy to Always or OnFailure.
If you want to start sending traffic to a Pod only after a probe succeeds, specify a readiness probe. In this case the readiness probe may be identical to the liveness probe, but the presence of the readiness probe in the spec means the Pod will start up without receiving any traffic and will begin receiving traffic only after the probe succeeds.
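A readiness probe of that kind could be sketched like this (the /healthz path is an assumed application endpoint, not from the original):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: readiness-demo
spec:
  containers:
  - name: web
    image: nginx
    readinessProbe:
      httpGet:
        path: /healthz        # assumed health endpoint; a 200-399 response counts as ready
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 10
```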
 
If you want your container to be able to take itself out of service for maintenance, you can specify a readiness probe that checks a readiness-specific endpoint, different from the liveness probe's endpoint.
Note that if you only want traffic drained when the Pod is deleted, you do not necessarily need a readiness probe; on deletion, the Pod automatically puts itself into an unready state regardless of whether readiness probes exist, and it remains unready while waiting for its containers to stop.
 
(6) Pod restart policy
A PodSpec has a restartPolicy field whose possible values are Always, OnFailure, and Never. The default is Always. restartPolicy applies to all containers in the Pod, and it refers only to restarts of the containers by the kubelet on the same node. Failed containers are restarted by the kubelet with an exponential backoff delay (10 seconds, 20 seconds, 40 seconds, ...) capped at five minutes, which is reset after ten minutes of successful execution. Once a Pod is bound to a node, it will never be rebound to another node.
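As a small sketch (the names are illustrative), restartPolicy is set at the Pod level, not per container:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: oneshot-pod
spec:
  restartPolicy: OnFailure    # restart containers only if they exit non-zero
  containers:
  - name: task
    image: busybox
    command: ["sh", "-c", "echo done"]
```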
 
(7) Pod lifetime
In general, Pods do not disappear until someone destroys them - either a person or a controller. The only exception to this rule is Pods whose phase has been Succeeded or Failed for longer than some duration (determined by the master); these expire and are automatically destroyed.
 
Three types of controller are available:

Use a Job for Pods that are expected to terminate, such as batch computations. Jobs are appropriate only for Pods with a restartPolicy of OnFailure or Never.
Use a ReplicationController, ReplicaSet, or Deployment for Pods that are not expected to terminate, such as web servers. ReplicationControllers are appropriate only for Pods with a restartPolicy of Always.
Use a DaemonSet for Pods that provide machine-specific system services and should run one Pod per machine.
All three types of controller contain a PodTemplate. It is recommended to create the appropriate controller and let it create Pods, rather than creating Pods directly yourself: Pods alone have no way to recover from machine failures, but controllers can.
 
If a node dies or is disconnected from the rest of the cluster, Kubernetes applies a policy that sets the phase of all Pods on the lost node to Failed.
 
(8) livenessProbe explained
[root@k8s-master ~]# kubectl explain pod.spec.containers.livenessProbe
KIND: Pod
VERSION: v1
RESOURCE: livenessProbe <Object>
exec: probe by executing a command inside the container, e.g. ps to check for a process
failureThreshold: how many consecutive failed probes count as failure; defaults to 3
periodSeconds: how often to probe; defaults to every 10s
timeoutSeconds: how many seconds before a probe times out; defaults to 1s
initialDelaySeconds: delay before the first probe, since the main program may not have started the moment the container is created
tcpSocket: probe by checking a TCP port
httpGet: probe with an HTTP GET request
 
For example, define a Pod resource named liveness whose base image is busybox. After this busybox container starts, it creates the file /tmp/test, deletes it, and then waits for 3600 seconds. The liveness probe is defined with the exec method: it runs a command that checks whether /tmp/test exists. If the file exists, the container is live; if it is absent, the container is considered to have died.
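The manifest for that example was not preserved in this copy; a hedged reconstruction (the metadata names are assumed) could be:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: liveness-exec
spec:
  containers:
  - name: liveness
    image: busybox
    # create /tmp/test, remove it after 30s, then keep the container running
    command: ["sh", "-c", "touch /tmp/test; sleep 30; rm -f /tmp/test; sleep 3600"]
    livenessProbe:
      exec:
        command: ["test", "-e", "/tmp/test"]   # success while the file exists
      initialDelaySeconds: 1
      periodSeconds: 3
```

Once /tmp/test is removed, the probe starts failing and the kubelet restarts the container according to the Pod's restartPolicy.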
 
 
(9) Resource requests and resource limits
With Docker, we know that limits can be placed on the CPU and memory a container may request or consume. Kubernetes has the same mechanism: containers, and hence Pods, can request and consume CPU and memory resources. CPU is a compressible resource, meaning the amount available can be throttled on demand; memory is an incompressible resource, since reclaiming memory from a process can cause unforeseen problems.

Resource isolation is currently container-level: CPU and memory requirements are defined on the container fields under the Pod spec. The "requests" field defines the guaranteed amount of resources requested - the container may run on less than this amount, but the system must be able to supply that much. The "limits" field is the ceiling on usable resources, a hard limit.

In Kubernetes, 1 unit of CPU corresponds to one virtual CPU (vCPU) on a virtual machine or one hyperthread on a physical CPU, and fractional measurement is supported: one core (1 core) corresponds to 1000 millicores, so 500m corresponds to 0.5 core, i.e., half a core. Memory is measured similarly, in bytes by default; the suffixes E, P, T, G, M, and K may be used, as may the binary-suffix forms Ei, Pi, Ti, Gi, Mi, and Ki.
 
Resource requirements, for example:
 
apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
spec:
  containers:
  - name: nginx
    image: nginx
    resources:
      requests:
        memory: "128Mi"
        cpu: "200m"
 
In the configuration above, nginx requests 200m of CPU - a fifth of a core, which is enough for nginx to run at full speed - and hopes for 128Mi of available memory, though it will not necessarily use that much while actually running. Given that memory is incompressible and that a container exceeding its allowance risks being killed by the OOM killer at run time, this request value should be set at the upper bound of the memory the application is expected to use.
 
Resource limits, for example:

A container's resource requests only guarantee the minimum amount of resources it needs to run; they do not cap the resources it can consume. An application with a bug could occupy system resources indefinitely, so the additional limits attribute is needed to define the maximum amount of resources a container may use. CPU is a compressible resource, so its usage can be freely throttled. Memory is a hard-limited resource: when a process tries to allocate more memory than the size defined by the limits attribute, the Pod will be killed by the OOM killer. For example:
 
[root@k8s-master ~]# vim memleak-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: memleak-pod
  labels:
    app: memleak
spec:
  containers:
  - name: simmemleak
    image: saadali/simmemleak
    resources:
      requests:
        memory: "64Mi"
        cpu: "1"
      limits:
        memory: "64Mi"
        cpu: "1"
 
[root@k8s-master ~]# kubectl apply -f memleak-pod.yaml
pod/memleak-pod created
[root@k8s-master ~]# kubectl get pods -l app=memleak
NAME          READY   STATUS      RESTARTS   AGE
memleak-pod   0/1     OOMKilled   2          12s
[root@k8s-master ~]# kubectl get pods -l app=memleak
NAME          READY   STATUS             RESTARTS   AGE
memleak-pod   0/1     CrashLoopBackOff   2          28s
 
The Pod's default restart policy is Always. Because the container is terminated for exceeding its memory limit, it is restarted immediately, and it may then be killed by the OOM killer again. After this repeats several times, the Kubernetes system begins to delay the restarts, with the delay growing longer each time, so the Pod's status is then typically shown as "CrashLoopBackOff".
 
One more thing needs to be made clear: a Kubernetes cluster runs many Pods, so when a node's resources cannot satisfy all the Pod objects using them, in what order are Pod objects terminated?
 
Kubernetes cannot judge this by itself; it needs the help of a priority assigned to the Pod objects to determine the order of termination. Based on the requests and limits attributes of a Pod object, Kubernetes classifies Pods into three quality-of-service categories:
 
Guaranteed: Pods in which every container has requests and limits set, with equal values, for both CPU and memory automatically belong to this category, which has the highest priority.
Burstable: Pods in which at least one container has requests set for CPU or memory, but which do not satisfy the Guaranteed requirements, belong to this category, with medium priority.
BestEffort: Pods in which no container has any requests or limits attribute set automatically belong to this category, with the lowest priority.
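You can check which class a Pod was assigned from its status. For the memleak-pod above, whose requests equal its limits for both resources, the expected class is Guaranteed (assuming a reachable cluster):

```shell
[root@k8s-master ~]# kubectl get pod memleak-pod -o jsonpath='{.status.qosClass}'
Guaranteed
```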
 


Origin www.cnblogs.com/muzinan110/p/11105837.html