Debug Pods

This guide helps users debug applications that are deployed into Kubernetes and not behaving correctly. It is not a guide for people who want to debug their cluster. For that, check out the guide on debugging clusters.

Diagnosing the problem

The first step in troubleshooting is triage. What is the problem? Is it your Pods, your Replication Controller, or your Service?

Debugging Pods

The first step in debugging a Pod is taking a look at it. Check the current state of the Pod and recent events with the following command:

kubectl describe pods ${POD_NAME}

Look at the state of the containers in the pod. Are they all Running? Have there been recent restarts?
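When the describe output is long, a compact per-container summary can be easier to scan. The following is a minimal sketch: the helper name pod_container_summary is hypothetical, and it assumes kubectl is already configured for the right cluster and namespace. It prints each container's name, restart count, and readiness, using kubectl's jsonpath output.

```shell
# Hypothetical helper: one line per container with name, restart count, and ready flag.
# $1 is the Pod name (the ${POD_NAME} placeholder used above).
pod_container_summary() {
  kubectl get pod "$1" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.restartCount}{"\t"}{.ready}{"\n"}{end}'
}
```

Call it as pod_container_summary "${POD_NAME}". A restart count that keeps growing usually points at a crashing container rather than a scheduling problem.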

Continue debugging depending on the state of the pods.

My pod stays pending

If a Pod is stuck in Pending, it means that it cannot be scheduled onto a node. Generally this is because there are insufficient resources of one type or another that prevent scheduling. Look at the output of the kubectl describe ... command above. There should be messages from the scheduler about why it cannot schedule your pod. Reasons include:

  • You don't have enough resources: You may have exhausted the supply of CPU or memory in your cluster. In this case you need to delete Pods, adjust resource requests, or add new nodes to your cluster. See the Compute Resources document for more information.

  • You are using hostPort: When you bind a Pod to a hostPort, there are a limited number of places that pod can be scheduled. In most cases, hostPort is unnecessary; try using a Service object to expose your Pod. If you do require hostPort, then you can only schedule as many Pods as there are nodes in your Kubernetes cluster.
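To read only the scheduler's messages for a single Pod, you can filter the event list instead of scanning the full describe output. This is a sketch under the assumption that the Pod lives in your current namespace; pod_events is a hypothetical helper name.

```shell
# Hypothetical helper: list events whose involved object is the given Pod,
# oldest first. $1 is the Pod name.
pod_events() {
  kubectl get events \
    --field-selector "involvedObject.kind=Pod,involvedObject.name=$1" \
    --sort-by=.lastTimestamp
}
```

Call it as pod_events "${POD_NAME}". Scheduling failures typically appear as events with the reason FailedScheduling, and the message names the resource or constraint that could not be satisfied.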

My pod stays waiting

If a Pod is stuck in the Waiting state, then it has been scheduled to a worker node, but it can't run on that machine. Again, the information from kubectl describe ... should be informative. The most common cause of Waiting pods is a failure to pull the image. There are three things to check:

  • Make sure that you have the name of the image correct.
  • Have you pushed the image to the registry?
  • Try to manually pull the image to see if the image can be pulled. For example, if you use Docker on your PC, run docker pull <image>.
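Before working through the checks above, it can help to confirm that the container really is waiting on an image pull. The sketch below prints the waiting reason for each container; pod_waiting_reasons is a hypothetical helper name, and containers that are not waiting show an empty reason.

```shell
# Hypothetical helper: print "<container>: <waiting reason>" for each container.
# $1 is the Pod name. The reason is empty if the container is not waiting.
pod_waiting_reasons() {
  kubectl get pod "$1" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{": "}{.state.waiting.reason}{"\n"}{end}'
}
```

Call it as pod_waiting_reasons "${POD_NAME}". Reasons such as ErrImagePull or ImagePullBackOff indicate an image pull failure; other reasons suggest a different cause.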

My pod stays terminating

If a Pod is stuck in the Terminating state, it means that a deletion has been issued for the Pod, but the control plane is unable to delete the Pod object.

This typically happens if the Pod has a finalizer and there is an admission webhook installed in the cluster that prevents the control plane from removing the finalizer.

To identify this scenario, check if your cluster has any ValidatingWebhookConfiguration or MutatingWebhookConfiguration that targets UPDATE operations for pods resources.
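One way to run this check is sketched below. The helper names pod_finalizers and list_admission_webhooks are hypothetical, and listing webhook configurations assumes you have permission to read these cluster-scoped resources.

```shell
# Hypothetical helper: show the finalizers still set on the Pod. $1 is the Pod name.
pod_finalizers() {
  kubectl get pod "$1" -o jsonpath='{.metadata.finalizers}'
}

# Hypothetical helper: dump every validating and mutating webhook configuration as YAML.
list_admission_webhooks() {
  kubectl get validatingwebhookconfigurations,mutatingwebhookconfigurations -o yaml
}
```

In the YAML output, look under webhooks[].rules[] for entries whose operations include UPDATE and whose resources include pods.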

If the webhook is provided by a third party:

  • Make sure you are using the latest version.
  • Disable the webhook for UPDATE operations.
  • Report an issue with the corresponding provider.

If you are the author of the webhook:

  • For a mutating webhook, make sure it never changes immutable fields on UPDATE operations. For example, changes to containers are usually not allowed.
  • For a validating webhook, make sure that your validation policies only apply to new changes. In other words, you should allow Pods with existing violations to pass validation. This allows Pods that were created before the validating webhook was installed to continue running.

My pod is crashing or otherwise unhealthy

Once your pod has been scheduled, the methods described in Debug Running Pods are available for debugging.

My pod is running but not doing what I told it to do

If your pod is not behaving as you expected, it may be that there was an error in your pod description (e.g. the mypod.yaml file on your local machine), and that the error was silently ignored when you created the pod. Often a section of the pod description is nested incorrectly, or a key name is typed incorrectly, and so the key is ignored. For example, if you misspelled command as commnd, then the pod will be created but will not use the command line you intended it to use.

The first thing to do is to delete your pod and try creating it again with the --validate option. For example, run kubectl apply --validate -f mypod.yaml. If you misspelled command as commnd, then the command will give an error like this:

I0805 10:43:25.129850   46757 schema.go:126] unknown field: commnd
I0805 10:43:25.129973   46757 schema.go:129] this may be a false alarm, see https://github.com/kubernetes/kubernetes/issues/6842
pods/mypod

The next thing to check is whether the pod on the apiserver matches the pod you meant to create (e.g. in a yaml file on your local machine). For example, run kubectl get pods/mypod -o yaml > mypod-on-apiserver.yaml and then manually compare the original pod description, mypod.yaml, with the one you got back from the apiserver, mypod-on-apiserver.yaml. There will typically be some lines on the "apiserver" version that are not on the original version. This is expected. However, if there are lines on the original that are not on the apiserver version, then this may indicate a problem with your pod spec.
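The manual comparison can be started with diff, as in the following sketch. The helper name compare_pod_spec is hypothetical; it writes mypod-on-apiserver.yaml into the current directory. A plain text diff is noisy, because the apiserver adds defaults and may order keys differently, so treat the output as a starting point rather than a verdict.

```shell
# Hypothetical helper: fetch the live Pod and diff it against the local file.
# $1 is the local manifest (e.g. mypod.yaml), $2 is the Pod name (e.g. mypod).
compare_pod_spec() {
  kubectl get "pods/$2" -o yaml > mypod-on-apiserver.yaml || return 1
  # Lines prefixed with "<" exist only in the local file; ">" only on the apiserver.
  diff "$1" mypod-on-apiserver.yaml
}
```

Call it as compare_pod_spec mypod.yaml mypod. The lines prefixed with "<" are the ones to look at first, since they appear only in your original description.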

Debugging Replication Controllers

Replication controllers are fairly straightforward. They can either create Pods or they can't. If they can't create pods, then please refer to the instructions above to debug your pods.

You can also use kubectl describe rc ${CONTROLLER_NAME} to introspect events related to the replication controller.

Debugging Services

Services provide load balancing across a set of pods. There are several common problems that can make Services not work properly. The following instructions should help debug Service problems.

First, verify that there are endpoints for the service. For every Service object, the apiserver makes one or more EndpointSlice resources available.

You can view these resources with:

kubectl get endpointslices -l kubernetes.io/service-name=${SERVICE_NAME}

Make sure that the number of endpoints in the EndpointSlices matches the number of pods that you expect to be members of your service. For example, if your Service is for an nginx container with 3 replicas, you would expect to see three different IP addresses in the Service's endpoint slices.
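To count the addresses without reading the table by eye, you can extract them with jsonpath. This is a minimal sketch: service_endpoint_ips is a hypothetical helper name, and it prints only the first address of each endpoint.

```shell
# Hypothetical helper: print one endpoint address per line for the given Service.
# $1 is the Service name (the ${SERVICE_NAME} placeholder used above).
service_endpoint_ips() {
  kubectl get endpointslices -l "kubernetes.io/service-name=$1" \
    -o jsonpath='{range .items[*].endpoints[*]}{.addresses[0]}{"\n"}{end}'
}
```

Piping the result through sort -u | wc -l gives the number of distinct addresses, which you can compare with the replica count you expect.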

My service is missing endpoints

If you are missing endpoints, try listing pods using the labels that the Service uses. Imagine that you have a Service where the labels are:

...
spec:
  selector:
    name: nginx
    type: frontend

You can use:

kubectl get pods --selector=name=nginx,type=frontend

to list pods that match this selector. Verify that the list matches the Pods that you expect to provide your Service. Verify that the pod's containerPort matches up with the Service's targetPort.
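The port check can be done by printing both sides next to each other. The helpers below are a sketch; service_target_ports and pod_container_ports are hypothetical names, and the selector string is passed in the same form as in the command above.

```shell
# Hypothetical helper: print the targetPort of every port on the Service. $1 is the Service name.
service_target_ports() {
  kubectl get service "$1" -o jsonpath='{.spec.ports[*].targetPort}'
}

# Hypothetical helper: print "<pod>: <containerPorts>" for Pods matching a label selector.
# $1 is the selector, e.g. name=nginx,type=frontend
pod_container_ports() {
  kubectl get pods --selector="$1" \
    -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.spec.containers[*].ports[*].containerPort}{"\n"}{end}'
}
```

Note that targetPort may be a port name rather than a number, in which case it should match the name of a container port instead of its numeric value.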

Network traffic is not forwarded

Please see the Debugging Service document for more information.

What's next

If none of the above solves your problem, follow the instructions in the Debugging Service document to make sure that your Service is running, has Endpoints, and your Pods are actually serving; you have DNS working, iptables rules installed, and kube-proxy does not seem to be misbehaving.

You may also visit the troubleshooting document for more information.

Last modified April 09, 2025 at 5:08 AM PST: Update docs for deprecation of Endpoints API (#49831) (649bda2cbd)