Kubernetes (K8s) deployments often present challenges from a variety of perspectives, including pods, services, ingress, unresponsive clusters, control planes, and high-availability setup. A Kubernetes pod is the smallest deployable unit in the Kubernetes ecosystem, encapsulating one or more containers that share resources and a network. Pods are designed to run a single instance of an application or process, and are created and disposed of as needed. Pods are essential for scaling, updating, and maintaining applications in a K8s environment.
Translated from "Master Kubernetes Pods: Advanced Troubleshooting Strategies".
This article explores the challenges faced by Kubernetes pods and the troubleshooting steps to take. Some of the error messages encountered when running Kubernetes pods include:
- ImagePullBackOff
- ErrImagePull
- InvalidImageName
- CrashLoopBackOff
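To spot pods in any of these states quickly, one option is to filter the STATUS column of kubectl get pods. The sketch below assumes the default column layout (NAME, READY, STATUS, RESTARTS, AGE) and introduces a hypothetical helper, failing_pods, which is not part of kubectl:

```shell
# failing_pods: hypothetical helper, not part of kubectl.
# Reads `kubectl get pods` output on stdin and prints NAME and STATUS for
# pods whose STATUS matches one of the errors listed above.
# Assumes the default column order: NAME READY STATUS RESTARTS AGE.
failing_pods() {
  awk 'NR > 1 && $3 ~ /^(ImagePullBackOff|ErrImagePull|InvalidImageName|CrashLoopBackOff)$/ { print $1, $3 }'
}

# Usage against a live cluster:
#   kubectl get pods | failing_pods
```

With --all-namespaces the column positions shift by one, so the field numbers would need adjusting.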
Sometimes, you won't even encounter the errors listed, but still find that your pod fails. First, it's important to note that when debugging any Kubernetes resource, you should understand the API reference. It explains how the various Kubernetes APIs are defined and how the objects in a pod or deployment work together. The API reference is clearly documented on the Kubernetes website. In this case, when debugging a pod, select the Pod object in the API reference to learn more about how pods work. It defines the fields that make up a pod: version (apiVersion), type (kind), metadata, specification (spec), and status. Kubernetes also provides a cheat sheet that lists the commonly needed commands.
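As a rough illustration of those top-level fields, the sketch below writes a minimal Pod manifest to disk. The file name minimal-pod.yaml, the pod name, and the label are assumptions made for this example only:

```shell
# Write a minimal Pod manifest; file name and pod name are illustrative.
cat > minimal-pod.yaml <<'EOF'
apiVersion: v1        # version of the API that defines this object
kind: Pod             # object type
metadata:             # identifying data: name, namespace, labels
  name: demo-pod
  labels:
    app: demo
spec:                 # desired state: which containers should run
  containers:
  - name: web
    image: nginx
# status is not written by hand; the cluster fills it in after creation
EOF

# Against a live cluster, the same fields can be explored from the CLI:
#   kubectl explain pod
#   kubectl explain pod.spec.containers
```

The status field is deliberately absent from the manifest, since it is reported by the cluster rather than declared by the user.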
Prerequisites
This article assumes that readers have the following:
- Kind installed for scenario demonstration
- Intermediate understanding of Kubernetes architecture
- The kubectl command-line tool
Kubernetes Pod Error - ImagePullBackOff
This error appears for three different reasons:
- Invalid image
- Invalid tag
- Invalid permissions
These situations arise when you don't have the correct information about the image. You may also not have permission to pull the image from its repository (private repository). To demonstrate this in the example below, we create an nginx deployment:
➜ ~ kubectl create deploy nginx --image=nginx
deployment.apps/nginx created
After the Pod is running, get the pod name:
➜ ~ kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-8f458dc5b-hcrsh 1/1 Running 0 100s
Copy the name of the running pod and get more information about it:
➜ ~ kubectl describe pod nginx-8f458dc5b-hcrsh
Name: nginx-8f458dc5b-hcrsh
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m43s default-scheduler Successfully assigned default/nginx-8f458dc5b-hcrsh to k8s-troubleshooting-control-plane
Normal Pulling 2m43s kubelet Pulling image "nginx"
Normal Pulled 100s kubelet Successfully pulled image "nginx" in 1m2.220189835s
Normal Created 100s kubelet Created container nginx
Normal Started 100s kubelet Started container nginx
The image has been successfully pulled. Your Kubernetes pod is running without errors.
To demonstrate ImagePullBackoff, edit the deployment YAML file and specify a non-existent image:
➜ ~ kubectl edit deploy nginx
      containers:
      - image: nginxdoesntexist
        imagePullPolicy: Always
        name: nginx
The new pod is not deployed successfully:
➜ ~ kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-5b847fdb95-mx4pq 0/1 ErrImagePull 0 3m40s
nginx-8f458dc5b-hcrsh 1/1 Running 0 38m
Describing a failing pod shows the ErrImagePull and ImagePullBackOff errors:
➜ ~ kubectl describe pod nginx-6f46cbfbcb-c92bl
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 88s default-scheduler Successfully assigned default/nginx-6f46cbfbcb-c92bl to k8s-troubleshooting-control-plane
Normal Pulling 40s (x3 over 88s) kubelet Pulling image "nginxdoesntexist"
Warning Failed 37s (x3 over 85s) kubelet Failed to pull image "nginxdoesntexist": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/library/nginxdoesntexist:latest": failed to resolve reference "docker.io/library/nginxdoesntexist:latest": pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed
Warning Failed 37s (x3 over 85s) kubelet Error: ErrImagePull
Normal BackOff 11s (x4 over 85s) kubelet Back-off pulling image "nginxdoesntexist"
Warning Failed 11s (x4 over 85s) kubelet Error: ImagePullBackOff
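The failed reference in the events above, docker.io/library/nginxdoesntexist:latest, shows how a short image name is expanded before the pull: a missing tag defaults to latest, and a name without a registry host is looked up on Docker Hub. The sketch below approximates that expansion so an invalid image or tag is easier to spot. The helper name normalize_image is hypothetical, and the logic is a simplification that ignores digests (@sha256:...):

```shell
# normalize_image REF: hypothetical helper approximating how a short image
# reference is expanded with Docker Hub defaults. Digests are not handled.
normalize_image() {
  ref="$1"
  last="${ref##*/}"                 # final path segment, e.g. "nginx:1.25"
  case "$last" in
    *:*) ;;                         # a tag is already present
    *)   ref="${ref}:latest" ;;     # no tag: default to "latest"
  esac
  case "$ref" in
    */*)
      first="${ref%%/*}"
      case "$first" in
        *.*|*:*|localhost) ;;       # first segment looks like a registry host
        *) ref="docker.io/${ref}" ;;  # user/repo on Docker Hub
      esac
      ;;
    *) ref="docker.io/library/${ref}" ;;  # bare name: official image namespace
  esac
  printf '%s\n' "$ref"
}

# normalize_image nginx               -> docker.io/library/nginx:latest
# normalize_image localhost:5000/app  -> localhost:5000/app:latest
```

If the expanded reference does not match a repository and tag that actually exist, or the repository is private and no pull credentials are configured, the pull fails as shown above.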
Kubernetes Pod Error - The image has been pulled but the Pod is in pending status.
Whenever you run K8s in a production environment, the K8s administrator allocates resource quotas to each namespace based on the requirements of the namespaces running within the cluster. Namespaces are used for logical separation within a cluster.
The "image pulled, but the pod is still pending" problem appears when the specifications in the resource quota do not meet the minimum requirements of the application in the Pod. In the following example, create a namespace called payments:
➜ ~ kubectl create ns payments
namespace/payments created
Create a resource quota with the relevant specifications:
➜ ~ cat resourcequota.yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-resources
spec:
  hard:
    requests.cpu: "1"
    requests.memory: 1Gi
    limits.cpu: "2"
    limits.memory: 4Gi
Assign the resource quota to the payments namespace:
➜ ~ kubectl apply -f resourcequota.yaml -n payments
resourcequota/compute-resources created
Create a new deployment within a namespace with resource quota restrictions:
➜ ~ kubectl create deploy nginx --image=nginx -n payments
deployment.apps/nginx created
Although the deployment was created successfully, no Pods exist:
➜ ~ kubectl get pods -n payments
No resources found in payments namespace
The deployment is created, but there are no Pods in the ready state, no Pods to update, and no Pods available:
➜ ~ kubectl get deploy -n payments
NAME READY UP-TO-DATE AVAILABLE AGE
nginx 0/1 0 0 7m4s
To debug further, describe the nginx deployment. Pod creation failed:
➜ ~ kubectl describe deploy nginx -n payments
Name: nginx
Namespace: payments
CreationTimestamp: Wed, 24 May 2023 21:37:55 +0300
Labels: app=nginx
Annotations: deployment.kubernetes.io/revision: 1
Selector: app=nginx
Replicas: 1 desired | 0 updated | 0 total | 0 available | 1 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
  Labels: app=nginx
  Containers:
    nginx:
      Image: nginx
      Port: <none>
      Host Port: <none>
      Environment: <none>
      Mounts: <none>
  Volumes: <none>
Conditions:
Type Status Reason
---- ------ ------
Available False MinimumReplicasUnavailable
ReplicaFailure True FailedCreate
Progressing False ProgressDeadlineExceeded
OldReplicaSets: <none>
NewReplicaSet: nginx-8f458dc5b (0/1 replicas created)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 10m deployment-controller Scaled up replica set nginx-8f458dc5b to 1
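The deployment's own events only record the scale-up. Because the ReplicaSet is what actually creates Pods, the reason behind a FailedCreate condition is typically recorded one level down. A minimal sketch of the next two lookups, wrapped in a hypothetical helper named why_no_pods (the helper and its arguments are assumptions of this example; the kubectl subcommands themselves are standard):

```shell
# why_no_pods: hypothetical helper that gathers the usual next clues when a
# Deployment reports ReplicaFailure / FailedCreate.
# $1 = namespace, $2 = value of the app label (assumptions of this sketch)
why_no_pods() {
  ns="$1"
  app="$2"
  # ReplicaSets create the Pods, so creation failures tend to show up here
  kubectl describe rs -n "$ns" -l "app=$app"
  # Current usage compared with the hard limits of each quota in the namespace
  kubectl describe resourcequota -n "$ns"
}

# Usage against a live cluster:
#   why_no_pods payments nginx
```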
Further analysis of the Kubernetes events in the namespace reveals that there is not enough memory for Pod creation:
➜ ~ kubectl get events -n payments --sort-by=.metadata.creationTimestamp
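When a namespace carries a ResourceQuota on CPU and memory, as payments does here, Kubernetes generally expects each new Pod to declare requests and limits for those resources, and the totals have to fit inside the quota. The sketch below writes a Deployment manifest that does so. The file name and the specific values are illustrative assumptions, chosen only to stay under the quota shown earlier:

```shell
# Write a Deployment whose single container declares requests and limits.
# File name and resource values are illustrative, not taken from the article.
cat > nginx-with-resources.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: payments
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        resources:
          requests:          # counted against requests.cpu / requests.memory
            cpu: 250m
            memory: 256Mi
          limits:            # counted against limits.cpu / limits.memory
            cpu: 500m
            memory: 512Mi
EOF

# Against a live cluster:
#   kubectl apply -f nginx-with-resources.yaml
```

An alternative some teams prefer is a LimitRange in the namespace, which supplies default requests and limits so individual manifests do not have to.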
Kubernetes Pod Error - CrashLoopBackOff
This error occurs when your image is pulled successfully and your container is created, but the container then fails at runtime. For example, a Python application may try to write to a folder that does not exist, or that it has no permission to write to. The application starts, hits the error, and exits. Likewise, if a panic occurs in your application logic, the container stops. Kubernetes restarts the container, it fails again, and the Pod goes into CrashLoopBackOff. Eventually, you observe that the deployment has no available Pods: a Pod exists, but it is not running and reports a CrashLoopBackOff error.
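Between restarts, the kubelet waits with an exponentially growing back-off, documented as starting at 10 seconds, doubling each time, and capped at five minutes. That is why a crashing Pod appears to restart less and less often. A rough sketch of that schedule follows; the helper name crashloop_delay is hypothetical, and the exact timing in a real cluster may differ:

```shell
# crashloop_delay N: hypothetical helper that prints the approximate wait
# (in seconds) before the Nth restart, assuming 10s doubling up to 300s.
crashloop_delay() {
  n="$1"
  delay=10
  i=1
  while [ "$i" -lt "$n" ]; do
    delay=$((delay * 2))
    if [ "$delay" -ge 300 ]; then
      delay=300                # cap at five minutes
      break
    fi
    i=$((i + 1))
  done
  echo "$delay"
}

# crashloop_delay 1  -> 10
# crashloop_delay 4  -> 80
# crashloop_delay 6  -> 300

# To read the output of the container instance that crashed (live cluster):
#   kubectl logs <pod-name> --previous
```

The logs of the previous container instance are usually the fastest route to the root cause, since the current instance may not have produced any output yet.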
Liveness and readiness probes failed
A liveness probe detects whether a container has entered a broken state and can no longer serve traffic; when it fails, Kubernetes restarts the container for you. A readiness probe checks whether your application is ready to handle traffic, for example that it has fetched all required configuration from the ConfigMap and started its threads. Only after this process completes does your application receive traffic. If your application hits an error during this process and exits, or keeps failing its liveness probe, it will also enter CrashLoopBackOff.
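Both probes are declared per container in the Pod spec. The sketch below writes a manifest with an HTTP readiness probe and an HTTP liveness probe. The pod name, file name, path, port, and timing values are assumptions for illustration, not recommendations:

```shell
# Write a Pod manifest with both probe types; all names and timings are illustrative.
cat > probe-demo.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo
spec:
  containers:
  - name: web
    image: nginx
    ports:
    - containerPort: 80
    readinessProbe:            # gates traffic: failing pods are taken out of Service endpoints
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 10
    livenessProbe:             # triggers a container restart after repeated failures
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 15
      periodSeconds: 20
      failureThreshold: 3
EOF

# Against a live cluster:
#   kubectl apply -f probe-demo.yaml
#   kubectl describe pod probe-demo   # probe failures appear under Events
```

A liveness probe that starts checking before the application has finished initializing is a common cause of restart loops, so the initial delay is worth tuning against real startup times.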
Start troubleshooting!
This article provides an overview of troubleshooting techniques for Kubernetes Pods. It addresses common errors encountered when deploying Pods and provides practical solutions for resolving these errors. It also provides insight into reference pages and cheat sheets that are critical when understanding how Kubernetes works and effectively identifying and resolving issues. By following the guidance provided in this article, readers can improve their troubleshooting skills and simplify the deployment and management of their Kubernetes Pods.
This article was first published on Yunyunzhongsheng (https://yylives.cc/); everyone is welcome to visit.