Collect and view control plane metrics
This page describes how to configure a Google Kubernetes Engine (GKE) cluster to send metrics emitted by the Kubernetes API server, Scheduler, and Controller Manager to Cloud Monitoring using Google Cloud Managed Service for Prometheus. This page also describes how these metrics are formatted when they are written to Monitoring, and how to query metrics.
Before you begin
Before you start, make sure that you have performed the following tasks:
- Enable the Google Kubernetes Engine API.
- If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running the `gcloud components update` command. Earlier gcloud CLI versions might not support running the commands in this document.

  Note: For existing gcloud CLI installations, make sure to set the `compute/region` property. If you use primarily zonal clusters, set the `compute/zone` instead. By setting a default location, you can avoid errors in the gcloud CLI like the following: `One of [--zone, --region] must be supplied: Please specify location.` You might need to specify the location in certain commands if the location of your cluster differs from the default that you set.
Requirements
Sending metrics emitted by Kubernetes control plane components to Cloud Monitoring has the following requirements:

- The cluster must have system metrics enabled.
Configure collection of control plane metrics
You can enable control plane metrics in an existing GKE cluster using the Google Cloud console, the gcloud CLI, or Terraform.
Console
You can enable control plane metrics for a cluster either from the Observability tab for the cluster or from the Details tab for the cluster. When you use the Observability tab, you can preview the available charts and metrics before you enable the metric package.

To enable control plane metrics from the Observability tab for the cluster, do the following:
1. In the Google Cloud console, go to the Kubernetes clusters page.

   If you use the search bar to find this page, then select the result whose subheading is Kubernetes Engine.

2. Click your cluster's name and then select the Observability tab.

3. Select Control Plane from the list of features.

4. Click Enable package.

   If the control plane metrics are already enabled, then you see a set of charts for control plane metrics instead.
To enable control plane metrics from the Details tab for the cluster, do the following:

1. In the Google Cloud console, go to the Kubernetes clusters page.

   If you use the search bar to find this page, then select the result whose subheading is Kubernetes Engine.

2. Click your cluster's name.

3. In the Features row labelled Cloud Monitoring, click the Edit icon.

4. In the Edit Cloud Monitoring dialog that appears, confirm that Enable Cloud Monitoring is selected.

5. In the Components drop-down menu, select the control plane components from which you would like to collect metrics: API Server, Scheduler, or Controller Manager.

6. Click OK.

7. Click Save Changes.
gcloud
Update your cluster to collect metrics emitted by the Kubernetes API server, Scheduler, and Controller Manager:

    gcloud container clusters update CLUSTER_NAME \
        --location=COMPUTE_LOCATION \
        --monitoring=SYSTEM,API_SERVER,SCHEDULER,CONTROLLER_MANAGER

Replace the following:

- CLUSTER_NAME: the name of the cluster.
- COMPUTE_LOCATION: the Compute Engine location of the cluster.
Terraform
To configure the collection of Kubernetes control plane metrics by using Terraform, see the `monitoring_config` block in the Terraform registry for `google_container_cluster`. For general information about using Google Cloud with Terraform, see Terraform with Google Cloud.
Quota
Control plane metrics consume the "Time series ingestion requests per minute" quota of the Cloud Monitoring API. Before enabling the metrics packages, check your recent peak usage of that quota. If you have many clusters in the same project or are already approaching that quota limit, you can request a quota limit increase before enabling either observability package.
Pricing
GKE control plane metrics use Google Cloud Managed Service for Prometheus to load metrics into Cloud Monitoring. Cloud Monitoring charges for the ingestion of these metrics are based on the number of samples ingested.

For more information, see Cloud Monitoring pricing.
Metric format
All Kubernetes control plane metrics written to Cloud Monitoring use the resource type `prometheus_target`. Each metric name is prefixed with `prometheus.googleapis.com/` and has a suffix indicating the Prometheus metric type, such as `/gauge`, `/histogram`, or `/counter`. Otherwise, each metric name is identical to the metric name exposed by open source Kubernetes.
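For example, the API server request counter described later on this page is addressed by its open source name in PromQL and by the prefixed, suffixed name in other Cloud Monitoring features. The comparison below is only illustrative; it assumes that API server metrics are enabled on your cluster.

```promql
# PromQL: the open source metric name is used unchanged.
rate(apiserver_request_total[5m])

# The same data in MQL or Metrics Explorer is addressed as:
#   prometheus.googleapis.com/apiserver_request_total/counter
```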
Exporting from Cloud Monitoring
The Kubernetes control plane metrics can be exported from Cloud Monitoring by using the Cloud Monitoring API. Because all Kubernetes control plane metrics are ingested by using Google Cloud Managed Service for Prometheus, Kubernetes control plane metrics can be queried by using Prometheus Query Language (PromQL). They can also be queried by using Monitoring Query Language (MQL).
Querying metrics
When you query Kubernetes control plane metrics, the name you use depends on whether you are using PromQL or Cloud Monitoring-based features like MQL or the Metrics Explorer menu-driven interface.

The following tables of Kubernetes control plane metrics show two versions of each metric name:

- PromQL metric name: When using PromQL in Cloud Monitoring pages of the Google Cloud console or in PromQL fields of the Cloud Monitoring API, use the PromQL metric name.
- Cloud Monitoring metric name: When using other Cloud Monitoring features, use the Cloud Monitoring metric name in the tables below. This name must be prefixed with `prometheus.googleapis.com/`, which has been omitted from the entries in the table.
API server metrics
This section provides a list of the API server metrics and additional information about interpreting and using the metrics.
List of API server metrics
When API server metrics are enabled, all metrics shown in the following table are exported to Cloud Monitoring in the same project as the GKE cluster.

The Cloud Monitoring metric names in this table must be prefixed with `prometheus.googleapis.com/`. That prefix has been omitted from the entries in the table.
| PromQL metric name | Launch stage | Cloud Monitoring metric name | Kind, Type, Unit | Monitored resources | Required GKE version | Description | Labels |
|---|---|---|---|---|---|---|---|
| `apiserver_current_inflight_requests` | GA | `apiserver_current_inflight_requests/gauge` | Gauge, Double, 1 | `prometheus_target` | 1.22.13+ | Maximal number of currently used inflight request limit of this apiserver per request kind in last second. | `request_kind` |
| `apiserver_flowcontrol_current_executing_seats` | BETA | `apiserver_flowcontrol_current_executing_seats/gauge` | Gauge, Double, 1 | `prometheus_target` | 1.28.3+ | Concurrency (number of seats) occupied by the currently executing (initial stage for a WATCH, any stage otherwise) requests in the API Priority and Fairness subsystem. | `flow_schema`, `priority_level` |
| `apiserver_flowcontrol_current_inqueue_requests` | BETA | `apiserver_flowcontrol_current_inqueue_requests/gauge` | Gauge, Double, 1 | `prometheus_target` | 1.28.3+ (1.25.16-gke.1360000+, 1.26.11+, 1.27.8+ for prior minor versions) | Number of requests currently pending in queues of the API Priority and Fairness subsystem. | `flow_schema`, `priority_level` |
| `apiserver_flowcontrol_nominal_limit_seats` | BETA | `apiserver_flowcontrol_nominal_limit_seats/gauge` | Gauge, Double, 1 | `prometheus_target` | 1.28.3+ (1.26.11+, 1.27.8+ for prior minor versions) | Nominal number of execution seats configured for each priority level. | `priority_level` |
| `apiserver_flowcontrol_rejected_requests_total` | BETA | `apiserver_flowcontrol_rejected_requests_total/counter` | Cumulative, Double, 1 | `prometheus_target` | 1.28.3+ (1.25.16-gke.1360000+, 1.26.11+, 1.27.8+ for prior minor versions) | Number of requests rejected by the API Priority and Fairness subsystem. | `flow_schema`, `priority_level`, `reason` |
| `apiserver_flowcontrol_request_wait_duration_seconds` | BETA | `apiserver_flowcontrol_request_wait_duration_seconds/histogram` | Cumulative, Distribution, s | `prometheus_target` | 1.28.3+ (1.25.16-gke.1360000+, 1.26.11+, 1.27.8+ for prior minor versions) | Length of time a request spent waiting in its queue. | `execute`, `flow_schema`, `priority_level` |
| `apiserver_request_duration_seconds` | GA | `apiserver_request_duration_seconds/histogram` | Cumulative, Distribution, s | `prometheus_target` | 1.23.6+ | Response latency distribution in seconds for each verb, dry run value, group, version, resource, subresource, scope and component. | `component`, `dry_run`, `group`, `resource`, `scope`, `subresource`, `verb`, `version` |
| `apiserver_request_total` | GA | `apiserver_request_total/counter` | Cumulative, Double, 1 | `prometheus_target` | 1.22.13+ | Counter of apiserver requests broken out for each verb, dry run value, group, version, resource, scope, component, and HTTP response code. | `code`, `component`, `dry_run`, `group`, `resource`, `scope`, `subresource`, `verb`, `version` |
| `apiserver_response_sizes` | GA | `apiserver_response_sizes/histogram` | Cumulative, Distribution, 1 | `prometheus_target` | 1.22.13+ | Response size distribution in bytes for each group, version, verb, resource, subresource, scope and component. | `component`, `group`, `resource`, `scope`, `subresource`, `verb`, `version` |
| `apiserver_storage_objects` | GA | `apiserver_storage_objects/gauge` | Gauge, Double, 1 | `prometheus_target` | 1.22.13+ | Number of stored objects at the time of last check split by kind. | `resource` |
| `apiserver_admission_controller_admission_duration_seconds` | GA | `apiserver_admission_controller_admission_duration_seconds/histogram` | Cumulative, Distribution, s | `prometheus_target` | 1.23.6+ | Admission controller latency histogram in seconds, identified by name and broken out for each operation and API resource and type (validate or admit). | `name`, `operation`, `rejected`, `type` |
| `apiserver_admission_step_admission_duration_seconds` | GA | `apiserver_admission_step_admission_duration_seconds/histogram` | Cumulative, Distribution, s | `prometheus_target` | 1.22.13+ | Admission sub-step latency histogram in seconds, broken out for each operation and API resource and step type (validate or admit). | `operation`, `rejected`, `type` |
| `apiserver_admission_webhook_admission_duration_seconds` | GA | `apiserver_admission_webhook_admission_duration_seconds/histogram` | Cumulative, Distribution, s | `prometheus_target` | 1.22.13+ | Admission webhook latency histogram in seconds, identified by name and broken out for each operation and API resource and type (validate or admit). | `name`, `operation`, `rejected`, `type` |
The following sections provide additional information about the API server metrics.
apiserver_request_duration_seconds
Use this metric to monitor latency in the API server. The request duration recorded by this metric includes all phases of request processing, from the time the request is received to the time the server completes its response to the client. Specifically, it includes time spent on the following:
- The authentication and authorization of the request.
- Calling the third-party and system webhooks associated with the request.
- Fetching the requested object from an in-memory cache (for requests specifying a `resourceVersion` URL parameter) or from the `etcd`- or Spanner-based cluster state database by calling the `etcd` API (for all other requests).
- Writing the response to the client and receiving the client's response.

You can use the `group`, `version`, `resource`, and `subresource` labels to uniquely identify a slow request for further investigation.
For more information about using this metric, see Latency.

This metric has very high cardinality. When using this metric, you must use filters or grouping to find specific sources of latency.
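For example, the following PromQL sketch reduces cardinality by aggregating over everything except the verb and resource, which is usually enough to localize a slow request path. It reuses the CLUSTER_NAME placeholder convention from the other queries on this page; excluding WATCH requests is an assumption you can drop if you want to include long-running requests.

```promql
# 99th percentile API server request latency, grouped by verb and resource.
# WATCH requests are excluded because their long durations skew the distribution.
histogram_quantile(0.99,
  sum by (le, verb, resource) (
    rate(apiserver_request_duration_seconds_bucket{cluster="CLUSTER_NAME", verb!="WATCH"}[5m])
  )
)
```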
apiserver_admission_controller_admission_duration_seconds
This metric measures the latency in built-in admission webhooks, not third-party webhooks. To diagnose latency issues with third-party webhooks, use the `apiserver_admission_webhook_admission_duration_seconds` metric.
apiserver_admission_webhook_admission_duration_seconds and apiserver_admission_step_admission_duration_seconds
These metrics measure the latency in external, third-party admission webhooks. The `apiserver_admission_webhook_admission_duration_seconds` metric is generally the more useful metric. For more information about using this metric, see Latency.
apiserver_request_total
Use this metric to monitor the request traffic at your API server. You can also use it to determine the success and failure rates of your requests. For more information about using this metric, see Traffic and error rate.

This metric has very high cardinality. When using this metric, you must use filters or grouping to identify sources of errors.
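As one way to narrow down error sources, the following sketch groups client and server errors by resource and response code; it reuses the CLUSTER_NAME placeholder and the `code`-label regex style shown later in the Traffic and error rate section.

```promql
# Rate of API server requests returning 4xx or 5xx, grouped by resource and code.
sum by (resource, code) (
  rate(apiserver_request_total{cluster="CLUSTER_NAME", code=~"[45].."}[5m])
)
```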
apiserver_storage_objects
Use this metric to detect saturation of your system and to identify possible resource leaks. For more information, see Saturation.
apiserver_current_inflight_requests
This metric records the maximum number of requests that were being actively served in the last one-second window. For more information, see Saturation.

The metric does not include long-running requests like "watch".
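A minimal sketch for viewing this metric, assuming the CLUSTER_NAME placeholder used elsewhere on this page, is to break it down by the `request_kind` label so that read-only and mutating traffic are visible separately:

```promql
# Current in-flight API server requests, split by request kind.
max by (request_kind) (apiserver_current_inflight_requests{cluster="CLUSTER_NAME"})
```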
Monitoring the API server
The API server metrics can give you insight into the main signals for system health:

- Latency: How long does it take to service a request?
- Traffic: How much demand is the system experiencing?
- Error rate: How often do requests fail?
- Saturation: How full is the system?
This section describes how to use the API server metrics to monitor the health of your API server.
Latency
When the API server is overloaded, request latency increases. To measure the latency of requests to the API server, use the `apiserver_request_duration_seconds` metric. To identify the source of latency more specifically, you can group metrics by the `verb` or `resource` label.

The suggested upper bound for a single-resource call such as GET, POST, or PATCH is one second. The suggested upper bound for both namespace-scoped and cluster-scoped LIST calls is 30 seconds. The upper-bound expectations are set by SLOs that are defined by the open source Kubernetes community. For more information, see API call latency SLIs/SLOs details.

If the value of the `apiserver_request_duration_seconds` metric is increasing beyond the expected duration, investigate the following possible causes:
- The Kubernetes control plane might be overloaded. To check, look at the `apiserver_request_total` and `apiserver_storage_objects` metrics.
  - Use the `code` label to determine whether requests are being processed successfully. For information about the possible values, see HTTP Status codes.
  - Use the `group`, `version`, `resource`, and `subresource` labels to uniquely identify a request.
- A third-party admission webhook is slow or non-responsive. If the value of the `apiserver_admission_webhook_admission_duration_seconds` metric is increasing, then some of your third-party or user-defined admission webhooks are slow or non-responsive. Latency in admission webhooks can cause delays in job scheduling.

  To query the 99th percentile webhook latency per instance of the Kubernetes control plane, use the following PromQL query:

  `sum by (instance) (histogram_quantile(0.99, rate(apiserver_admission_webhook_admission_duration_seconds_bucket{cluster="CLUSTER_NAME"}[1m])))`

  We recommend also looking at the 50th, 90th, 95th, and 99.9th percentiles; you can adjust this query by modifying the 0.99 value.

  External webhooks have a timeout limit of approximately 10 seconds. You can set alerting policies on the `apiserver_admission_webhook_admission_duration_seconds` metric to alert you when you are approaching the webhook timeout.

  You can also group the `apiserver_admission_webhook_admission_duration_seconds` metric on the `name` label to diagnose possible issues with specific webhooks (see the per-webhook query after this list).
- You are listing a lot of objects. It is expected that the latency of LIST calls increases as the number of objects of a given type (the response size) increases.

- Client-side problems:
  - The client might not have enough resources to receive responses in a timely manner. To check, look at CPU usage metrics for the client pod.
  - The client has a slow network connection. This might happen when the client is running on a device like a mobile phone, but it's unlikely for clients running on a Compute Engine network.
  - The client has exited unexpectedly but the TCP connection has a timeout period in tens of seconds. Before the connection times out, the server's resources are blocked, which can increase latency.
For more information, see Good practices for using API Priority and Fairness in the Kubernetes documentation.
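The per-webhook query referenced in the admission-webhook item above could look like the following sketch, which groups the webhook latency histogram by the `name` label instead of by instance. The 0.99 quantile and the CLUSTER_NAME placeholder follow the conventions of the earlier query and can be adjusted.

```promql
# 99th percentile admission webhook latency, per webhook name.
histogram_quantile(0.99,
  sum by (name, le) (
    rate(apiserver_admission_webhook_admission_duration_seconds_bucket{cluster="CLUSTER_NAME"}[5m])
  )
)
```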
Traffic and error rate
To measure the traffic and the number of successful and failed requests at the API server, use the `apiserver_request_total` metric. For example, to measure the API server traffic per instance of the Kubernetes control plane, use the following PromQL query:

`sum by (instance) (increase(apiserver_request_total{cluster="CLUSTER_NAME"}[1m]))`

- To query the unsuccessful requests, filter the `code` label for 4xx and 5xx values by using the following PromQL query:

  `sum(rate(apiserver_request_total{code=~"[45].."}[5m]))`

- To query the successful requests, filter the `code` label for 2xx values by using the following PromQL query:

  `sum(rate(apiserver_request_total{code=~"2.."}[5m]))`

- To query the requests rejected by the API server per instance of the Kubernetes control plane, filter the `code` label for the value 429 (`http.StatusTooManyRequests`) by using the following PromQL query:

  `sum by (instance) (increase(apiserver_request_total{cluster="CLUSTER_NAME", code="429"}[1m]))`
Saturation
You can measure the saturation in your system by using the `apiserver_current_inflight_requests` and `apiserver_storage_objects` metrics.
If the value of the `apiserver_storage_objects` metric is increasing, you might be experiencing a problem with a custom controller that creates objects but doesn't delete them. You can filter or group the metric by the `resource` label to identify the resource experiencing the increase.
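For example, a query along the following lines surfaces the resource kinds with the most stored objects, which is a common way to spot a leaking controller; the `topk` limit of 10 is an arbitrary choice.

```promql
# Top 10 resource kinds by number of stored objects.
topk(10, max by (resource) (apiserver_storage_objects{cluster="CLUSTER_NAME"}))
```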
Evaluate the `apiserver_current_inflight_requests` metric in accordance with your API Priority and Fairness settings; these settings affect how requests are prioritized, so you can't draw conclusions from the metric values alone. For more information, see API Priority and Fairness.
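If your cluster version also exports the beta API Priority and Fairness metrics listed in the table above, a sketch like the following can show whether the subsystem is rejecting requests, and for which priority levels and reasons:

```promql
# Requests rejected by API Priority and Fairness, by priority level and reason.
sum by (priority_level, reason) (
  rate(apiserver_flowcontrol_rejected_requests_total{cluster="CLUSTER_NAME"}[5m])
)
```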
Scheduler metrics
This section provides a list of the scheduler metrics and additional information about interpreting and using the metrics.
List of scheduler metrics
When scheduler metrics are enabled, all metrics shown in the following table are exported to Cloud Monitoring in the same project as the GKE cluster.

The Cloud Monitoring metric names in this table must be prefixed with `prometheus.googleapis.com/`. That prefix has been omitted from the entries in the table.
| PromQL metric name | Launch stage | Cloud Monitoring metric name | Kind, Type, Unit | Monitored resources | Required GKE version | Description | Labels |
|---|---|---|---|---|---|---|---|
| `kube_pod_resource_limit` | GA | `kube_pod_resource_limit/gauge` | Gauge, Double, 1 | `prometheus_target` | 1.31.1-gke.1621000+ | Resource limit for workloads on the cluster, broken down by pod. This shows the resource usage the scheduler and kubelet expect per pod for resource, along with the unit of the resource, if any. | `namespace`, `node`, `pod`, `priority`, `resource`, `scheduler`, `unit` |
| `kube_pod_resource_request` | GA | `kube_pod_resource_request/gauge` | Gauge, Double, 1 | `prometheus_target` | 1.31.1-gke.1621000+ | Resources requested by workloads on the cluster, broken down by pod. This shows the resource usage the scheduler and kubelet expect per pod for resource, along with the unit of the resource, if any. | `namespace`, `node`, `pod`, `priority`, `resource`, `scheduler`, `unit` |
| `scheduler_pending_pods` | GA | `scheduler_pending_pods/gauge` | Gauge, Double, 1 | `prometheus_target` | 1.22.13+ | Number of pending pods, by the queue type. 'active' means number of pods in activeQ; 'backoff' means number of pods in backoffQ; 'unschedulable' means number of pods in unschedulablePods. | `queue` |
| `scheduler_pod_scheduling_duration_seconds` | DEPRECATED | `scheduler_pod_scheduling_duration_seconds/histogram` | Cumulative, Distribution, 1 | `prometheus_target` | 1.25.1 to 1.29 (1.22.17-gke.3100+, 1.23.11+, and 1.24.5+ for prior minor versions) | [Deprecated in v. 1.29; removed in v. 1.30 and replaced by `scheduler_pod_scheduling_sli_duration_seconds`.] E2e latency for a pod being scheduled which may include multiple scheduling attempts. | `attempts` |
| `scheduler_pod_scheduling_sli_duration_seconds` | BETA | `scheduler_pod_scheduling_sli_duration_seconds/histogram` | Cumulative, Distribution, 1 | `prometheus_target` | 1.30+ | E2e latency for a pod being scheduled, from the time the pod enters the scheduling queue, and might involve multiple scheduling attempts. | `attempts` |
| `scheduler_preemption_attempts_total` | GA | `scheduler_preemption_attempts_total/counter` | Cumulative, Double, 1 | `prometheus_target` | 1.22.13+ | Total preemption attempts in the cluster till now | |
| `scheduler_preemption_victims` | GA | `scheduler_preemption_victims/histogram` | Cumulative, Distribution, 1 | `prometheus_target` | 1.22.13+ | Number of selected preemption victims | |
| `scheduler_scheduling_attempt_duration_seconds` | GA | `scheduler_scheduling_attempt_duration_seconds/histogram` | Cumulative, Distribution, 1 | `prometheus_target` | 1.23.6+ | Scheduling attempt latency in seconds (scheduling algorithm + binding). | `profile`, `result` |
| `scheduler_schedule_attempts_total` | GA | `scheduler_schedule_attempts_total/counter` | Cumulative, Double, 1 | `prometheus_target` | 1.22.13+ | Number of attempts to schedule pods, by the result. 'unschedulable' means a pod could not be scheduled, while 'error' means an internal scheduler problem. | `profile`, `result` |
The following sections provide additional information about the scheduler metrics.
scheduler_pending_pods
You can use the `scheduler_pending_pods` metric to monitor the load on your scheduler. Increasing values in this metric can indicate resourcing problems. The scheduler has three queues, and this metric reports the number of pending requests by queue. The following queues are supported:
- `active` queue

  The set of pods that the scheduler is attempting to schedule; the pod with the highest priority is at the head of the queue.

- `backoff` queue

  The set of pods that were unschedulable the last time the scheduler tried but which might be schedulable the next time.

  Pods on this queue must wait for a backoff period (a maximum of 10 seconds), after which they are moved back to the `active` queue for another scheduling attempt. For more information on the management of the `backoff` queue, see the implementation request, Kubernetes issue 75417.

- `unschedulable` set

  The set of pods that the scheduler attempted to schedule but which have been determined to be unschedulable. Placement on this queue might indicate readiness or compatibility issues with your nodes or the configuration of your node selectors.

  When resource constraints prevent pods from being scheduled, the pods are not subject to back-off handling. Instead, when a cluster is full, new pods fail to be scheduled and are put on the `unscheduled` queue. The presence of unscheduled pods might indicate that you have insufficient resources or that you have a node-configuration problem.

  Pods are moved to either the `backoff` or `active` queue after events that change the cluster state. Pods on this queue indicate that nothing has changed in the cluster that would make the pods schedulable.

  Affinities define rules for how pods are assigned to nodes. The use of affinity or anti-affinity rules can be a reason for an increase in unscheduled pods.

  Some events, for example, PVC/Service ADD/UPDATE, termination of a pod, or the registration of new nodes, move some or all unscheduled pods to either the `backoff` or `active` queue. For more information, see Kubernetes issue 81214.
For more information, see Scheduler latency and Resource issues.
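As a quick check, the following sketch returns the current number of pending pods in each queue; it complements the delta-based query shown later in the Scheduler latency section and uses the same CLUSTER_NAME placeholder.

```promql
# Current number of pending pods per scheduler queue.
sum by (queue) (scheduler_pending_pods{cluster="CLUSTER_NAME"})
```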
scheduler_scheduling_attempt_duration_seconds
This metric measures the duration of a single scheduling attempt within the scheduler itself and is broken down by the result: scheduled, unschedulable, or error. The duration runs from the time the scheduler picks up a pod until the time the scheduler locates a node and places the pod on the node, determines that the pod is unschedulable, or encounters an error. The scheduling duration includes the time in the scheduling process as well as the binding time. Binding is the process in which the scheduler communicates its node assignment to the API server. For more information, see Scheduler latency.

This metric doesn't capture the time the pod spends in admission control or validation.

For more information about scheduling, see Scheduling a Pod.
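To see whether slow attempts are concentrated in successful schedules, unschedulable pods, or errors, you could group the latency histogram by the `result` label, as in this sketch; the 0.95 quantile is an assumption you can change.

```promql
# 95th percentile scheduling attempt latency, by attempt result.
histogram_quantile(0.95,
  sum by (result, le) (
    rate(scheduler_scheduling_attempt_duration_seconds_bucket{cluster="CLUSTER_NAME"}[5m])
  )
)
```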
scheduler_schedule_attempts_total
This metric measures the number of scheduling attempts; each attempt to schedule a pod increases the value. You can use this metric to determine if the scheduler is available: if the value is increasing, then the scheduler is operational. You can use the `result` label to determine the success; pods are either `scheduled` or `unschedulable`.

This metric correlates strongly with the `scheduler_pending_pods` metric: when there are many pending pods, you can expect to see many attempts to schedule the pods. For more information, see Resource issues.

This metric doesn't increase if the scheduler has no pods to schedule, which can be the case if you have a custom secondary scheduler.
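A minimal sketch for watching the attempt rate by outcome, using the CLUSTER_NAME placeholder convention from the other queries on this page:

```promql
# Scheduling attempts per second, by result (scheduled, unschedulable, error).
sum by (result) (rate(scheduler_schedule_attempts_total{cluster="CLUSTER_NAME"}[5m]))
```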
scheduler_preemption_attempts_total and scheduler_preemption_victims
You can use preemption metrics to help determine if you need to add resources.
You might have higher-priority pods that can't be scheduled because there is no room for them. In this case, the scheduler frees up resources by preempting one or more running pods on a node. The `scheduler_preemption_attempts_total` metric tracks the number of times the scheduler has tried to preempt pods.

The `scheduler_preemption_victims` metric counts the pods selected for preemption.

The number of preemption attempts correlates strongly with the value of the `scheduler_schedule_attempts_total` metric when the value of the `result` label is `unschedulable`. The two values aren't equivalent: for example, if a cluster has 0 nodes, there are no preemption attempts but there might be scheduling attempts that fail.
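To compare the two signals described above, you can plot them side by side; the following two queries are a sketch that reuses the CLUSTER_NAME placeholder and can be combined on one chart.

```promql
# Preemption attempts per second.
sum(rate(scheduler_preemption_attempts_total{cluster="CLUSTER_NAME"}[5m]))

# Scheduling attempts per second that ended as unschedulable, for comparison.
sum(rate(scheduler_schedule_attempts_total{cluster="CLUSTER_NAME", result="unschedulable"}[5m]))
```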
For more information, see Resource issues.
Monitoring the scheduler
The scheduler metrics can give you insight into the performance of your scheduler:

- Scheduler latency: Is the scheduler running? How long does it take to schedule pods?
- Resource issues: Are attempts to schedule pods hitting resource constraints?

This section describes how to use the scheduler metrics to monitor your scheduler.
Scheduler latency
The scheduler's task is to ensure that your pods run, so you want to know when the scheduler is stuck or running slowly.

- To verify that the scheduler is running and scheduling pods, use the `scheduler_schedule_attempts_total` metric.
- When the scheduler is running slowly, investigate the following possible causes:

  - The number of pending pods is increasing. Use the `scheduler_pending_pods` metric to monitor the number of pending pods. The following PromQL query returns the number of pending pods per queue in a cluster:

    `sum by (queue) (delta(scheduler_pending_pods{cluster="CLUSTER_NAME"}[2m]))`

  - Individual attempts to schedule pods are slow. Use the `scheduler_scheduling_attempt_duration_seconds` metric to monitor the latency of scheduling attempts.

    We recommend observing this metric at least at the 50th and 95th percentiles. The following PromQL query retrieves 95th percentile values but can be adjusted:

    `sum by (instance) (histogram_quantile(0.95, rate(scheduler_scheduling_attempt_duration_seconds_bucket{cluster="CLUSTER_NAME"}[5m])))`
Resource issues
The scheduler metrics can also help you assess whether you have sufficient resources. If the value of the `scheduler_preemption_attempts_total` metric is increasing, then check the value of `scheduler_preemption_victims` by using the following PromQL query:

`scheduler_preemption_victims_sum{cluster="CLUSTER_NAME"}`

The number of preemption attempts and the number of preemption victims both increase when there are higher-priority pods to schedule. The preemption metrics don't tell you whether the high-priority pods that triggered the preemptions were scheduled, so when you see increases in the value of the preemption metrics, you can also monitor the value of the `scheduler_pending_pods` metric. If the number of pending pods is also increasing, then you might not have sufficient resources to handle the higher-priority pods; you might need to scale up the available resources, create new pods with reduced resource claims, or change the node selector.
- If the number of preemption victims is not increasing, then there are no remaining pods with low priority that can be removed. In this case, consider adding more nodes so the new pods can be allocated.
- If the number of preemption victims is increasing, then there are higher-priority pods waiting to be scheduled, so the scheduler is preempting some of the running pods. The preemption metrics don't tell you whether the higher-priority pods have been scheduled successfully.

  To determine if the higher-priority pods are being scheduled, look for decreasing values of the `scheduler_pending_pods` metric. If the value of this metric is increasing, then you might need to add more nodes.
You can expect to see temporary spikes in the values for the `scheduler_pending_pods` metric when workloads are going to be scheduled in your cluster, for example, during events like updates or scaling. If you have sufficient resources in your cluster, these spikes are temporary. If the number of pending pods doesn't go down, do the following:
- Check that nodes are not cordoned; cordoned nodes don't accept new pods.
- Check the following scheduling directives, which can be misconfigured and might render a pod unschedulable:
- Node affinity and selector.
- Taints and tolerations.
- Pod topology-spread constraints.
If pods can't be scheduled because of insufficient resources, then consider freeing up some of the existing nodes or increasing the number of nodes.
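If your cluster meets the version requirement listed in the scheduler metrics table (1.31.1-gke.1621000+), the `kube_pod_resource_request` metric gives a scheduler's-eye view of how much of each resource is requested per node, which can help confirm that insufficient resources are the cause. The grouping below is a sketch, not the only useful breakdown.

```promql
# Total resource requests that the scheduler accounts for, per node and resource.
sum by (node, resource) (kube_pod_resource_request{cluster="CLUSTER_NAME"})
```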
Controller Manager metrics
When controller manager metrics are enabled, all metrics shown in the following table are exported to Cloud Monitoring in the same project as the GKE cluster.

The Cloud Monitoring metric names in this table must be prefixed with `prometheus.googleapis.com/`. That prefix has been omitted from the entries in the table.
| PromQL metric name | Launch stage | Cloud Monitoring metric name | Kind, Type, Unit | Monitored resources | Required GKE version | Description | Labels |
|---|---|---|---|---|---|---|---|
| `node_collector_evictions_total` | GA | `node_collector_evictions_total/counter` | Cumulative, Double, 1 | `prometheus_target` | 1.24+ | Number of Node evictions that happened since current instance of NodeController started. | `zone` |
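As an illustrative query for this metric, assuming the CLUSTER_NAME placeholder used elsewhere on this page, the following sketch shows node evictions over the past hour broken down by zone:

```promql
# Node evictions by the node controller in the last hour, per zone.
sum by (zone) (increase(node_collector_evictions_total{cluster="CLUSTER_NAME"}[1h]))
```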