Optimize Pod autoscaling based on metrics
In this tutorial, you can set up autoscaling based on one of the following metrics:
Pub/Sub
Pub/Sub backlog
Scale based on an external metric reporting the number of unacknowledged messages remaining in a Pub/Sub subscription. This can effectively reduce latency before it becomes a problem, but might use relatively more resources than autoscaling based on CPU utilization.
Custom Metric
Custom Prometheus Metric
Scale based on a custom user-defined metric, exported in the Prometheus format via Google Cloud Managed Service for Prometheus. Your Prometheus metric must be of type Gauge.
Autoscaling is fundamentally about finding an acceptable balance between cost and latency. You might want to experiment with a combination of these metrics and others to find a policy that works for you.
Objectives
This tutorial covers the following tasks:

- How to deploy the Custom Metrics Adapter.
- How to export metrics from within your application code.
- How to view your metrics on the Cloud Monitoring interface.
- How to deploy a HorizontalPodAutoscaler (HPA) resource to scale your application based on Cloud Monitoring metrics.
Costs
In this document, you use the following billable components of Google Cloud: Google Kubernetes Engine (GKE), Pub/Sub, and Cloud Monitoring.
To generate a cost estimate based on your projected usage, use the pricing calculator.
When you finish the tasks that are described in this document, you can avoid continued billing by deleting the resources that you created. For more information, see Clean up.
Before you begin
Take the following steps to enable the Kubernetes Engine API:

- Visit the Kubernetes Engine page in the Google Cloud console.
- Create or select a project.
- Wait for the API and related services to be enabled. This can take several minutes.
Verify that billing is enabled for your Google Cloud project.
You can follow this tutorial using Cloud Shell, which comes preinstalled with the gcloud and kubectl command-line tools used in this tutorial. If you use Cloud Shell, you don't need to install these command-line tools on your workstation.
To use Cloud Shell:
- Go to the Google Cloud console.
- Click the Activate Cloud Shell button at the top of the Google Cloud console window.

A Cloud Shell session opens inside a new frame at the bottom of the Google Cloud console and displays a command-line prompt.
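Cloud Shell already includes both tools. If you want to confirm that they're available in your session (an optional, generic check rather than a step from this tutorial), you can run:

gcloud version
kubectl version --client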

Setting up your environment
Set the default zone for the Google Cloud CLI:
gcloud config set compute/zone zone
Replace the following:
zone: Choose a zone that's closest to you. For more information, see Regions and Zones.
Set the PROJECT_ID and PROJECT_NUMBER environment variables to your Google Cloud project ID and project number:

export PROJECT_ID=project-id
export PROJECT_NUMBER=$(gcloud projects describe $PROJECT_ID --format 'get(projectNumber)')
Set the default project for the Google Cloud CLI:
gcloud config set project $PROJECT_ID

Create a GKE cluster
Best practice: For enhanced security when accessing Google Cloud services, enable Workload Identity Federation for GKE on your cluster. Although this page includes examples using the legacy method (with Workload Identity Federation for GKE disabled), enabling it improves protection.
Workload Identity
To create a cluster with Workload Identity Federation for GKE enabled, run the following command:
gcloud container clusters create metrics-autoscaling --workload-pool=$PROJECT_ID.svc.id.goog

Legacy authentication
To create a cluster with Workload Identity Federation for GKE disabled, run the following command:
gcloud container clusters create metrics-autoscaling
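Whichever option you chose, creating the cluster also configures kubectl to talk to it. As an optional sanity check (not part of the original steps), confirm that the nodes are ready:

kubectl get nodes

If kubectl isn't pointed at the new cluster, running gcloud container clusters get-credentials metrics-autoscaling should reattach your kubeconfig.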
Deploying the Custom Metrics Adapter
The Custom Metrics Adapter lets your cluster send and receive metrics with Cloud Monitoring.
Pub/Sub
The procedure to install the Custom Metrics Adapter differs for clusters with or without Workload Identity Federation for GKE enabled. Select the option matching the setup you chose when you created your cluster.
Workload Identity
Grant your user the ability to create required authorization roles:
kubectl create clusterrolebinding cluster-admin-binding \
    --clusterrole cluster-admin --user "$(gcloud config get-value account)"

Deploy the custom metrics adapter on your cluster:
kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/k8s-stackdriver/master/custom-metrics-stackdriver-adapter/deploy/production/adapter_new_resource_model.yaml

The adapter uses the custom-metrics-stackdriver-adapter Kubernetes service account in the custom-metrics namespace. Allow this service account to read Cloud Monitoring metrics by assigning the Monitoring Viewer role:
gcloud projects add-iam-policy-binding projects/$PROJECT_ID \
    --role roles/monitoring.viewer \
    --member=principal://iam.googleapis.com/projects/$PROJECT_NUMBER/locations/global/workloadIdentityPools/$PROJECT_ID.svc.id.goog/subject/ns/custom-metrics/sa/custom-metrics-stackdriver-adapter

Legacy authentication
Grant your user the ability to create required authorization roles:
kubectl create clusterrolebinding cluster-admin-binding \
    --clusterrole cluster-admin --user "$(gcloud config get-value account)"

Deploy the custom metrics adapter on your cluster:
kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/k8s-stackdriver/master/custom-metrics-stackdriver-adapter/deploy/production/adapter_new_resource_model.yaml

Custom Metric
The procedure to install the Custom Metrics Adapter differs for clusters with or without Workload Identity Federation for GKE enabled. Select the option matching the setup you chose when you created your cluster.
Workload Identity
Grant your user the ability to create required authorization roles:
kubectl create clusterrolebinding cluster-admin-binding \
    --clusterrole cluster-admin --user "$(gcloud config get-value account)"

Deploy the custom metrics adapter on your cluster:
kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/k8s-stackdriver/master/custom-metrics-stackdriver-adapter/deploy/production/adapter_new_resource_model.yaml

The adapter uses the custom-metrics-stackdriver-adapter Kubernetes service account in the custom-metrics namespace. Allow this service account to read Cloud Monitoring metrics by assigning the Monitoring Viewer role:
gcloud projects add-iam-policy-binding projects/$PROJECT_ID \
    --role roles/monitoring.viewer \
    --member=principal://iam.googleapis.com/projects/$PROJECT_NUMBER/locations/global/workloadIdentityPools/$PROJECT_ID.svc.id.goog/subject/ns/custom-metrics/sa/custom-metrics-stackdriver-adapter

Legacy authentication
Grant your user the ability to create required authorization roles:
kubectl create clusterrolebinding cluster-admin-binding \
    --clusterrole cluster-admin --user "$(gcloud config get-value account)"

Deploy the custom metrics adapter on your cluster:
kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/k8s-stackdriver/master/custom-metrics-stackdriver-adapter/deploy/production/adapter_new_resource_model.yaml
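Whichever installation path you followed, you can optionally confirm that the adapter is running before you continue; the custom-metrics namespace comes from the adapter manifest referenced above:

kubectl get pods -n custom-metrics

The adapter Pod should reach the Running state before the HorizontalPodAutoscaler can query metrics through it.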
Deploying an application with metrics

Download the repository containing the application code for this tutorial:
Pub/Sub
git clone https://github.com/GoogleCloudPlatform/kubernetes-engine-samples.git
cd kubernetes-engine-samples/databases/cloud-pubsub

Custom Metric
git clone https://github.com/GoogleCloudPlatform/kubernetes-engine-samples.git
cd kubernetes-engine-samples/observability/custom-metrics-autoscaling/google-managed-prometheus

The repository contains code that exports metrics to Cloud Monitoring:
Pub/Sub
This application polls a Pub/Sub subscription for new messages, acknowledging them as they arrive. Pub/Sub subscription metrics are automatically collected by Cloud Monitoring.
import datetime
import time

from google import auth
from google.cloud import pubsub_v1


def main():
    """Continuously pull messages from subscription"""
    # read default project ID
    _, project_id = auth.default()
    subscription_id = 'echo-read'

    subscriber = pubsub_v1.SubscriberClient()
    subscription_path = subscriber.subscription_path(project_id, subscription_id)

    def callback(message: pubsub_v1.subscriber.message.Message) -> None:
        """Process received message"""
        print(f"Received message: ID={message.message_id} Data={message.data}")
        print(f"[{datetime.datetime.now()}] Processing: {message.message_id}")
        time.sleep(3)
        print(f"[{datetime.datetime.now()}] Processed: {message.message_id}")
        message.ack()

    streaming_pull_future = subscriber.subscribe(subscription_path, callback=callback)
    print(f"Pulling messages from {subscription_path}...")

    with subscriber:
        try:
            streaming_pull_future.result()
        except Exception as e:
            print(e)

Custom Metric
This application responds to any web request to the /metrics path with a constant value metric using the Prometheus format.
metric := prometheus.NewGauge(
    prometheus.GaugeOpts{
        Name: *metricName,
        Help: "Custom metric",
    },
)
prometheus.MustRegister(metric)
metric.Set(float64(*metricValue))

http.Handle("/metrics", promhttp.Handler())
log.Printf("Starting to listen on :%d", *port)
err := http.ListenAndServe(fmt.Sprintf(":%d", *port), nil)

The repository also contains a Kubernetes manifest to deploy the application to your cluster. A Deployment is a Kubernetes API object that lets you run multiple replicas of Pods that are distributed among the nodes in a cluster:
Pub/Sub
The manifest differs for clusters with or without Workload Identity Federation for GKE enabled. Select the option matching the setup you chose when you created your cluster.
Workload Identity
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pubsub
spec:
  selector:
    matchLabels:
      app: pubsub
  template:
    metadata:
      labels:
        app: pubsub
    spec:
      serviceAccountName: pubsub-sa
      containers:
      - name: subscriber
        image: us-docker.pkg.dev/google-samples/containers/gke/pubsub-sample:v2

Legacy authentication
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pubsub
spec:
  selector:
    matchLabels:
      app: pubsub
  template:
    metadata:
      labels:
        app: pubsub
    spec:
      volumes:
      - name: google-cloud-key
        secret:
          secretName: pubsub-key
      containers:
      - name: subscriber
        image: us-docker.pkg.dev/google-samples/containers/gke/pubsub-sample:v2
        volumeMounts:
        - name: google-cloud-key
          mountPath: /var/secrets/google
        env:
        - name: GOOGLE_APPLICATION_CREDENTIALS
          value: /var/secrets/google/key.json

Custom Metric
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    run: custom-metrics-gmp
  name: custom-metrics-gmp
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      run: custom-metrics-gmp
  template:
    metadata:
      labels:
        run: custom-metrics-gmp
    spec:
      containers:
      # sample container generating custom metrics
      - name: prometheus-dummy-exporter
        image: us-docker.pkg.dev/google-samples/containers/gke/prometheus-dummy-exporter:v0.2.0
        command: ["./prometheus-dummy-exporter"]
        args:
        - --metric-name=custom_prometheus
        - --metric-value=40
        - --port=8080

With the PodMonitoring resource, Google Cloud Managed Service for Prometheus exports the Prometheus metrics to Cloud Monitoring:
apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  name: "custom-metrics-exporter"
spec:
  selector:
    matchLabels:
      run: custom-metrics-gmp
  endpoints:
  - port: 8080
    path: /metrics
    interval: 15s

Starting in GKE Standard version 1.27 or GKE Autopilot version 1.25, Google Cloud Managed Service for Prometheus is enabled. To enable Google Cloud Managed Service for Prometheus in clusters running earlier versions, see Enable managed collection.
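If you're unsure whether managed collection is active on your cluster, one rough check (assuming the default gmp-system namespace that Google Cloud Managed Service for Prometheus uses for its managed components) is:

kubectl get pods -n gmp-system

If nothing is running there on an older cluster version, follow Enable managed collection before continuing.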
Deploy the application to your cluster:
Pub/Sub
The procedure to deploy your application differs for clusters with or without Workload Identity Federation for GKE enabled. Select the option matching the setup you chose when you created your cluster.
Workload Identity
Enable the Pub/Sub API on your project:
gcloud services enable cloudresourcemanager.googleapis.com pubsub.googleapis.com

Create a Pub/Sub topic and subscription:
gcloud pubsub topics create echo
gcloud pubsub subscriptions create echo-read --topic=echo

Deploy the application to your cluster:
kubectl apply -f deployment/pubsub-with-workload-identity.yaml

This application defines a pubsub-sa Kubernetes service account. Assign it the Pub/Sub subscriber role so that the application can pull messages from the Pub/Sub subscription:

gcloud projects add-iam-policy-binding projects/$PROJECT_ID \
    --role=roles/pubsub.subscriber \
    --member=principal://iam.googleapis.com/projects/$PROJECT_NUMBER/locations/global/workloadIdentityPools/$PROJECT_ID.svc.id.goog/subject/ns/default/sa/pubsub-sa

The preceding command uses a Principal Identifier, which allows IAM to refer directly to a Kubernetes service account.
Best practice: Use Principal Identifiers, but consider the limitation in the description of the alternative method.
Legacy authentication
Enable the Pub/Sub API on your project:
gcloud services enable cloudresourcemanager.googleapis.com pubsub.googleapis.com

Create a Pub/Sub topic and subscription:
gcloud pubsub topics create echo
gcloud pubsub subscriptions create echo-read --topic=echo

Create a service account with access to Pub/Sub:
gcloud iam service-accounts create autoscaling-pubsub-sa
gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member "serviceAccount:autoscaling-pubsub-sa@$PROJECT_ID.iam.gserviceaccount.com" \
    --role "roles/pubsub.subscriber"

Download the service account key file:
gcloud iam service-accounts keys create key.json \
    --iam-account autoscaling-pubsub-sa@$PROJECT_ID.iam.gserviceaccount.com

Import the service account key to your cluster as a Secret:
kubectl create secret generic pubsub-key --from-file=key.json=./key.json

Deploy the application to your cluster:
kubectl apply -f deployment/pubsub-with-secret.yaml
Custom Metric
kubectl apply -f custom-metrics-gmp.yaml

After waiting a moment for the application to deploy, all Pods reach the Ready state:
Pub/Sub
kubectl get pods

Output:
NAME                     READY   STATUS    RESTARTS   AGE
pubsub-8cd995d7c-bdhqz   1/1     Running   0          58s
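Optionally, check the subscriber's logs to confirm that it connected to the subscription; the "Pulling messages from ..." line comes from the sample code shown earlier:

kubectl logs deployment/pubsub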
Custom Metric

kubectl get pods

Output:
NAME                                  READY   STATUS    RESTARTS   AGE
custom-metrics-gmp-865dffdff9-x2cg9   1/1     Running   0          49s
Viewing metrics on Cloud Monitoring

As your application runs, it writes your metrics to Cloud Monitoring.
To view the metrics for a monitored resource by using the Metrics Explorer, do the following:
In the Google Cloud console, go to the Metrics explorer page.

If you use the search bar to find this page, then select the result whose subheading is Monitoring.
- In the Metric element, expand the Select a metric menu, and then select a resource type and metric type. For example, to chart the CPU utilization of a virtual machine, do the following:
  - (Optional) To reduce the menu's options, enter part of the metric name in the Filter bar. For this example, enter utilization.
  - In the Active resources menu, select VM instance.
  - In the Active metric categories menu, select Instance.
  - In the Active metrics menu, select CPU utilization and then click Apply.
To filter which time series are displayed, use the Filter element.

To combine time series, use the menus on the Aggregation element. For example, to display the CPU utilization for your VMs, based on their zone, set the first menu to Mean and the second menu to zone.

All time series are displayed when the first menu of the Aggregation element is set to Unaggregated. The default settings for the Aggregation element are determined by the metric type you selected.
The resource type and metrics are the following:
Pub/Sub
Resource type: pubsub_subscription
Metric: pubsub.googleapis.com/subscription/num_undelivered_messages
Custom Metric
Resource type: prometheus_target
Metric: prometheus.googleapis.com/custom_prometheus/gauge
Depending on the metric, you might not see much activity on the Cloud Monitoring Metrics Explorer yet. Don't be surprised if your metric isn't updating.
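One way to check that the Custom Metrics Adapter itself is wired up, independent of Metrics Explorer, is to list the metric APIs it registers with the Kubernetes API server (a generic check; the exact response depends on the adapter version):

kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1"
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1"

The Pub/Sub backlog is served through the external metrics API, and the Prometheus gauge through the custom metrics API.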
Creating a HorizontalPodAutoscaler object
When you see your metric in Cloud Monitoring, you can deploy a HorizontalPodAutoscaler to resize your Deployment based on your metric.
Pub/Sub
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: pubsub
spec:
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - external:
      metric:
        name: pubsub.googleapis.com|subscription|num_undelivered_messages
        selector:
          matchLabels:
            resource.labels.subscription_id: echo-read
      target:
        type: AverageValue
        averageValue: 2
    type: External
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: pubsub

Custom Metric
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: custom-metrics-gmp-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: custom-metrics-gmp
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Pods
    pods:
      metric:
        name: prometheus.googleapis.com|custom_prometheus|gauge
      target:
        type: AverageValue
        averageValue: 20

Deploy the HorizontalPodAutoscaler to your cluster:
Pub/Sub
kubectl apply -f deployment/pubsub-hpa.yaml

Custom Metric
kubectl apply -f custom-metrics-gmp-hpa.yaml
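After applying either manifest, you can optionally confirm that the HorizontalPodAutoscaler was created and is reading its metric (a generic check; the TARGETS column typically shows <unknown> until the first metric sample arrives):

kubectl get hpa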
Generating load

For some metrics, you might need to generate load to watch the autoscaling:
Pub/Sub
Publish 200 messages to the Pub/Sub topic:
for i in {1..200}; do
  gcloud pubsub topics publish echo --message="Autoscaling #${i}"
done

Custom Metric
Not Applicable: The code used in this sample exports a constant value of 40 for the custom metric. The HorizontalPodAutoscaler is set with a target value of 20, so it attempts to scale up the Deployment automatically.
You might need to wait a couple of minutes for the HorizontalPodAutoscaler to respond to the metric changes.
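As a rough sketch of why this happens (this is the standard Kubernetes HorizontalPodAutoscaler calculation, not something specific to these samples), the controller computes:

desiredReplicas = ceil(currentReplicas × currentMetricValue / targetAverageValue)

In the custom-metric example, every Pod reports a constant 40 against a target of 20, so the ratio stays at 2 and the Deployment roughly doubles on each evaluation (1 → 2 → 4) until it is capped at maxReplicas (5).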
Observing HorizontalPodAutoscaler scaling up
You can check the current number of replicas of your Deployment by running:
kubectl get deployments

After giving some time for the metric to propagate, the HorizontalPodAutoscaler scales the Deployment up to five Pods to handle the backlog.
You can also inspect the state and recent activity of the HorizontalPodAutoscaler by running:
kubectl describe hpa

Clean up
To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources.
Pub/Sub
Clean up the Pub/Sub subscription and topic:
gcloud pubsub subscriptions delete echo-read
gcloud pubsub topics delete echo

Delete your GKE cluster:
gcloud container clusters delete metrics-autoscaling
Custom Metric
Delete your GKE cluster:
gcloud container clusters delete metrics-autoscaling

What's next
- Learn more about horizontal Pod autoscaling.
- Learn more about autoscaling workloads based on metrics.
- Learn how to configure horizontal Pod autoscaling.
- Learn how to use the horizontal Pod autoscaler on Pub/Sub.
- Explore other Kubernetes Engine tutorials.