Identify underprovisioned and overprovisioned workloads

This document explains how to identify underprovisioned and overprovisioned workloads that run on Google Kubernetes Engine (GKE) clusters by using insights and recommendations. After you verify that the identified workloads would benefit from the recommendation to scale up or down, you can make the recommended change to save costs or increase the reliability of your workload. If possible, the recommendation includes projected monthly savings or cost. For more information, see Understand cost or savings estimates.

GKE provides these insights about workloads running on both Autopilot and Standard clusters. GKE also provides similar recommendations for entire clusters. For more information, see Identify underprovisioned and overprovisioned GKE clusters.

GKE monitors your clusters and delivers guidance to optimize your usage through Active Assist, a service that provides recommenders that generate insights and recommendations for using resources on Google Cloud. For more information about how to manage insights and recommendations, see Optimize your usage of GKE with insights and recommendations.

Get insights and recommendations for underprovisioned and overprovisioned workloads

GKE surfaces these insights and recommendations in the following locations in the Google Cloud console after observing the specific behavior discussed in the following section:

The recommendations have the following titles in the Workloads page:

  • Overprovisioned workloads: "Decrease resource requests to reduce costs"
  • Underprovisioned workloads: "Increase resource requests to improve reliability"

You can also receive all types of insights and recommendations through the Google Cloud CLI or the Recommender API. To find these types specifically, follow the instructions to view insights and recommendations and filter using the WORKLOAD_UNDERPROVISIONED and WORKLOAD_OVERPROVISIONED subtypes.
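As a rough illustration of the subtype filter, the following Python sketch assumes that insights were already exported as JSON and that each record carries its subtype in an `insightSubtype` field, as in the Recommender API's Insight resource. The sample records, their descriptions, and the `filter_by_subtype` helper are hypothetical and exist only for this example.

```python
import json

# The two workload subtypes named in this document.
WORKLOAD_SUBTYPES = {"WORKLOAD_UNDERPROVISIONED", "WORKLOAD_OVERPROVISIONED"}


def filter_by_subtype(insights, subtypes=WORKLOAD_SUBTYPES):
    """Return only the insight records whose subtype is in `subtypes`.

    Assumes each record is a dict with an `insightSubtype` key;
    records without that key are skipped.
    """
    return [i for i in insights if i.get("insightSubtype") in subtypes]


# Hypothetical sample data standing in for exported insights.
sample_json = """
[
  {"insightSubtype": "WORKLOAD_UNDERPROVISIONED", "description": "example workload A"},
  {"insightSubtype": "SOME_OTHER_SUBTYPE", "description": "unrelated insight"},
  {"insightSubtype": "WORKLOAD_OVERPROVISIONED", "description": "example workload B"}
]
"""

insights = json.loads(sample_json)
matches = filter_by_subtype(insights)
for insight in matches:
    print(insight["insightSubtype"], "-", insight["description"])
```

The same filter applies to recommendations if you swap in the field that holds the recommendation subtype.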

After you identify underprovisioned or overprovisioned workloads, see the considerations when rightsizing workloads.

How GKE identifies underprovisioned and overprovisioned workloads

The following table describes the signals that GKE uses for identifying underprovisioned and overprovisioned workloads that can be scaled up or down, and the threshold for each signal. Additionally, this table shows the action that we recommend that you take in this scenario.

| Subtype | Signal | Observation period | Details | Recommendation |
| --- | --- | --- | --- | --- |
| WORKLOAD_UNDERPROVISIONED | CPU or memory usage is high | Last 15 days | A workload is underprovisioned when CPU or memory utilization is greater than 150% for at least 10% of the time over the last 15 days. | Scale up your workload to increase reliability |
| WORKLOAD_OVERPROVISIONED | CPU or memory usage is low | Last 15 days | A workload is overprovisioned when CPU or memory utilization is less than 50% for at least 90% of the time over the last 15 days. | Scale down your workload to save costs |
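To make the thresholds concrete, here is a minimal Python sketch that applies them to a list of utilization samples for one resource (CPU or memory). It is an illustration only, not GKE's implementation. It assumes that utilization is expressed as usage divided by the request (so 1.5 means 150%), that the samples are evenly spaced over the 15-day window, and that the underprovisioned check takes priority if both conditions hold. The `classify` function and the sample values are hypothetical.

```python
# Thresholds taken from the table above.
UNDER_UTILIZATION = 1.5   # utilization greater than 150%
UNDER_TIME_SHARE = 0.10   # for at least 10% of the time
OVER_UTILIZATION = 0.5    # utilization less than 50%
OVER_TIME_SHARE = 0.90    # for at least 90% of the time


def classify(utilization_samples):
    """Classify one resource's utilization samples from the last 15 days.

    Returns a subtype string, or None if neither threshold is met
    (or if there are no samples).
    """
    if not utilization_samples:
        return None
    total = len(utilization_samples)
    high_share = sum(1 for u in utilization_samples if u > UNDER_UTILIZATION) / total
    low_share = sum(1 for u in utilization_samples if u < OVER_UTILIZATION) / total
    # Assumption: reliability comes first, so check underprovisioning first.
    if high_share >= UNDER_TIME_SHARE:
        return "WORKLOAD_UNDERPROVISIONED"
    if low_share >= OVER_TIME_SHARE:
        return "WORKLOAD_OVERPROVISIONED"
    return None


# Hypothetical samples: 15% of the time above 150% utilization.
busy = [1.6] * 3 + [1.0] * 17
# Hypothetical samples: 95% of the time below 50% utilization.
idle = [0.3] * 19 + [0.8]
# Hypothetical samples: steady 70% utilization.
steady = [0.7] * 20

print(classify(busy))
print(classify(idle))
print(classify(steady))
```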

GKE also uses the following guidelines to determine when to provide insights and recommendations:

  • GKE doesn't generate recommendations for the target metric of horizontal Pod autoscaling (HPA) because using this metric can cause interference.
  • If vertical Pod autoscaling (VPA) is enabled, the request values are automatically managed and GKE doesn't need to generate a recommendation.
  • GKE might wait up to three days before generating recommendations for new workloads.

Understand cost or savings estimates

If possible, GKE's recommendation includes an estimate that projects the monthly cost or savings if you rightsized the workload. This estimate is derived from the workload costs, based on the weighted average of the request values combined with the CPU and memory cost of the workload over the past 30 days.

Any estimated costs or savings are projections based on previous spending, and are not a guarantee of future cost or savings.
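The exact formula behind the estimate isn't given here. As a hedged sketch of the general idea, the following Python example assumes a simple proportional model for a single resource: it computes a time-weighted average of past request values and scales the observed cost by the relative change between that average and the recommended request. The `projected_monthly_savings` function, its parameters, and the sample numbers are all hypothetical.

```python
def projected_monthly_savings(request_samples, recommended_request, monthly_cost):
    """Estimate monthly savings for one resource under a proportional cost model.

    request_samples: list of (request_value, hours_in_effect) pairs
        covering the past 30 days.
    recommended_request: the request value suggested by the recommendation.
    monthly_cost: what the resource cost over the same 30 days.

    A positive result is a projected saving; a negative result is a
    projected extra cost (for example, when scaling up).
    """
    total_hours = sum(hours for _, hours in request_samples)
    if total_hours <= 0:
        raise ValueError("request_samples must cover a positive duration")
    # Time-weighted average of the request values.
    average_request = sum(value * hours for value, hours in request_samples) / total_hours
    # Assumption: cost scales linearly with the requested amount.
    return monthly_cost * (average_request - recommended_request) / average_request


# Hypothetical history: 2.0 vCPU requested for 360 hours, then 1.0 vCPU for 360 hours.
history = [(2.0, 360), (1.0, 360)]
savings = projected_monthly_savings(history, recommended_request=0.75, monthly_cost=60.0)
print(f"Projected monthly savings: {savings:.2f}")
```

With these sample numbers the weighted average request is 1.5 vCPU, so halving it to 0.75 vCPU projects a saving of half the observed cost.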

To see these estimates, ensure that the following is true:

  • You have the required billing.accounts.getSpendingInformation permission to get spending information. For more information, see Cloud Billing access.
  • GKE cost allocation is enabled for the cluster. For more information, see Enable GKE cost allocation.

For more information about the cost of all of your GKE clusters, including a more granular breakdown based on namespaces and workloads, see Get key spending insights for your GKE resource allocation and cluster costs.

For more information about the costs of running a GKE cluster, see GKE pricing.

Considerations when rightsizing workloads

Before you follow a recommendation to scale up or down a workload, consider the following:

  • Review the resource utilization of the workload to see how it's performing, and whether it's using more or less CPU and memory than expected. For instructions, see Analyze resource requests.
  • Batch processing workloads might intentionally maintain high utilization for cost efficiency. If the allocated resources are sufficient for the batch jobs, you don't need to scale up the highly utilized workload, even though it was identified as underprovisioned.
  • GKE has limited visibility into the actual memory usage of Java Virtual Machine (JVM)-based workloads. Use extra scrutiny before applying recommendations for these types of workloads.

Implement the recommendation to rightsize a workload

You can adjust the size of a workload to better match the workload's resource utilization by doing either of the following:

  • Enable vertical Pod autoscaling for the workload. For more information, see Set Pod resource requests automatically.
  • Change the requests and limits manually according to the recommendation:

    • Underprovisioned workload: to implement the recommendation to rightsize an underprovisioned workload, increase the resource requests and limits for the workload. When you implement this recommendation, you help to ensure that your workload remains reliable because it has the appropriate amount of resources for its applications.
    • Overprovisioned workload: to implement the recommendation to rightsize an overprovisioned workload, decrease the resource requests and limits for the workload. Adjust cluster CPU and memory allocations to match your workload needs. When you implement this recommendation, you help to ensure that you use only the resources that you need to run your workload.
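For the manual option, one possible approach is to prepare a patch that changes only the `resources` block of a single container. The following Python sketch builds such a patch for a Deployment-style Pod template and prints it as JSON, which could then be applied with a tool such as `kubectl patch`. The container name `web`, the request and limit values, and the `build_resources_patch` helper are hypothetical. Take the actual values from the recommendation for your workload.

```python
import json


def build_resources_patch(container_name, requests, limits=None):
    """Build a patch body that sets resource requests (and optionally limits)
    for one named container in a workload's Pod template.

    The nesting follows the Deployment layout:
    spec.template.spec.containers[].resources.
    """
    resources = {"requests": dict(requests)}
    if limits:
        resources["limits"] = dict(limits)
    return {
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {"name": container_name, "resources": resources}
                    ]
                }
            }
        }
    }


# Hypothetical values for an overprovisioned container named "web".
patch = build_resources_patch(
    "web",
    requests={"cpu": "250m", "memory": "256Mi"},
    limits={"memory": "256Mi"},
)
print(json.dumps(patch, indent=2))
```

Generating the patch from code keeps the change limited to the fields that the recommendation covers and makes it easy to review before you apply it.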

Last updated 2025-12-15 UTC.