Configuring horizontal Pod autoscaling
This page shows you how to scale your deployments in Google Kubernetes Engine (GKE) by automatically adjusting your resources using metrics like resource allocation, load balancer traffic, custom metrics, or multiple metrics simultaneously. This page also provides step-by-step instructions for configuring a Horizontal Pod Autoscaler (HPA) profile, including how to view, delete, clean up, and troubleshoot your HPA object. A Deployment is a Kubernetes API object that lets you run multiple replicas of Pods that are distributed among the nodes in a cluster.
This page is for Operators and Developers who manage application scaling in GKE and want to understand how to dynamically optimize performance and maintain cost efficiency through horizontal Pod autoscaling. To learn more about common roles and example tasks referenced in Google Cloud content, see Common GKE user roles and tasks.
Before you begin
Before you start, make sure that you have performed the following tasks:
- Enable the Google Kubernetes Engine API.
- If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running the gcloud components update command. Earlier gcloud CLI versions might not support running the commands in this document.

  Note: For existing gcloud CLI installations, make sure to set the compute/region property. If you primarily use zonal clusters, set the compute/zone property instead. By setting a default location, you can avoid errors in the gcloud CLI like the following: One of [--zone, --region] must be supplied: Please specify location. You might need to specify the location in certain commands if the location of your cluster differs from the default that you set.
- Ensure that you have an existing Autopilot or Standard cluster. If you need one, create an Autopilot cluster.
API versions for HorizontalPodAutoscaler objects
When you use the Google Cloud console, HorizontalPodAutoscaler objects are created using the autoscaling/v2 API.
When you use kubectl to create or view information about a Horizontal Pod Autoscaler, you can specify either the autoscaling/v1 API or the autoscaling/v2 API.
- apiVersion: autoscaling/v1 is the default, and lets you autoscale based only on CPU utilization. To autoscale based on other metrics, using apiVersion: autoscaling/v2 is recommended. The example in Create the example Deployment uses apiVersion: autoscaling/v1.
- apiVersion: autoscaling/v2 is recommended for creating new HorizontalPodAutoscaler objects. It lets you autoscale based on multiple metrics, including custom or external metrics. All other examples in this page use apiVersion: autoscaling/v2.
To check which API versions are supported, use the kubectl api-versions command.
You can specify which API to use when viewing details about a Horizontal Pod Autoscaler that uses apiVersion: autoscaling/v2.
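For comparison, the CPU-based v1 example used later on this page can be expressed in the autoscaling/v2 API. The following is a sketch of that equivalent manifest (the nginx names match the example Deployment on this page; in v2, the CPU target moves from the targetCPUUtilizationPercentage field into a resource metric entry):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 1
  maxReplicas: 10
  metrics:
  # In autoscaling/v2, CPU targets are expressed as a resource metric
  # instead of the v1 targetCPUUtilizationPercentage field.
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
```

Because v2 takes a list of metrics, this form extends naturally to the multiple-metric example later on this page.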
Create the example Deployment
Before you can create a Horizontal Pod Autoscaler, you must create the workload it monitors. The examples in this page apply different Horizontal Pod Autoscaler configurations to the following nginx Deployment. Separate examples show a Horizontal Pod Autoscaler based on resource utilization, based on a custom or external metric, and based on multiple metrics.
Save the following to a file named nginx.yaml:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
        resources:
          # You must specify requests for CPU to autoscale
          # based on CPU utilization
          requests:
            cpu: "250m"
```

This manifest specifies a value for CPU requests. If you want to autoscale based on a resource's utilization as a percentage, you must specify requests for that resource. If you don't specify requests, you can autoscale based only on the absolute value of the resource's utilization, such as milliCPUs for CPU utilization.
To create the Deployment, apply the nginx.yaml manifest:
```shell
kubectl apply -f nginx.yaml
```

The Deployment has spec.replicas set to 3, so three Pods are deployed. You can verify this using the kubectl get deployment nginx command.
Each of the examples in this page applies a different Horizontal Pod Autoscaler to an example nginx Deployment.
Autoscaling based on resource utilization
This example creates a HorizontalPodAutoscaler object to autoscale the nginx Deployment when CPU utilization surpasses 50%, and ensures that there is always a minimum of 1 replica and a maximum of 10 replicas.
You can create a Horizontal Pod Autoscaler that targets CPU using the Google Cloud console, the kubectl apply command, or for average CPU only, the kubectl autoscale command.
Note: This example uses apiVersion: autoscaling/v1. For more information about the available APIs, see API versions for HorizontalPodAutoscaler objects.

Console
Go to the Workloads page in the Google Cloud console.
Click the name of the nginx Deployment.

Click Actions > Autoscale.
Specify the following values:
- Minimum number of replicas: 1
- Maximum number of replicas: 10
- Autoscaling metric: CPU
- Target: 50
- Unit: %
Click Done.

Click Autoscale.
kubectl apply
Save the following YAML manifest as a file named nginx-hpa.yaml:
```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  # Set the minimum and maximum number of replicas the Deployment can scale to.
  minReplicas: 1
  maxReplicas: 10
  # The target average CPU utilization percentage across all Pods.
  targetCPUUtilizationPercentage: 50
```

To create the HPA, apply the manifest using the following command:
```shell
kubectl apply -f nginx-hpa.yaml
```

kubectl autoscale
To create a HorizontalPodAutoscaler object that only targets average CPU utilization, you can use the kubectl autoscale command:
```shell
kubectl autoscale deployment nginx --cpu-percent=50 --min=1 --max=10
```

Note: You can add the --dry-run and -o yaml flags to print a YAML manifest for a Horizontal Pod Autoscaler without actually creating it.

To get a list of Horizontal Pod Autoscalers in the cluster, use the following command:
```shell
kubectl get hpa
```

The output is similar to the following:
```
NAME    REFERENCE          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
nginx   Deployment/nginx   0%/50%    1         10        3          61s
```

To get details about the Horizontal Pod Autoscaler, you can use the Google Cloud console or the kubectl command.
Console
Go to the Workloads page in the Google Cloud console.
Click the name of the nginx Deployment.

View the Horizontal Pod Autoscaler configuration in the Autoscaler section.
View more details about autoscaling events in the Events tab.
kubectl get
To get details about the Horizontal Pod Autoscaler, you can use kubectl get hpa with the -o yaml flag. The status field contains information about the current number of replicas and any recent autoscaling events.
```shell
kubectl get hpa nginx -o yaml
```

The output is similar to the following:
```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  annotations:
    autoscaling.alpha.kubernetes.io/conditions: '[{"type":"AbleToScale","status":"True","lastTransitionTime":"2019-10-30T19:42:59Z","reason":"ScaleDownStabilized","message":"recent recommendations were higher than current one, applying the highest recent recommendation"},{"type":"ScalingActive","status":"True","lastTransitionTime":"2019-10-30T19:42:59Z","reason":"ValidMetricFound","message":"the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)"},{"type":"ScalingLimited","status":"False","lastTransitionTime":"2019-10-30T19:42:59Z","reason":"DesiredWithinRange","message":"the desired count is within the acceptable range"}]'
    autoscaling.alpha.kubernetes.io/current-metrics: '[{"type":"Resource","resource":{"name":"cpu","currentAverageUtilization":0,"currentAverageValue":"0"}}]'
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"autoscaling/v1","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"nginx","namespace":"default"},"spec":{"maxReplicas":10,"minReplicas":1,"scaleTargetRef":{"apiVersion":"apps/v1","kind":"Deployment","name":"nginx"},"targetCPUUtilizationPercentage":50}}
  creationTimestamp: "2019-10-30T19:42:43Z"
  name: nginx
  namespace: default
  resourceVersion: "220050"
  selfLink: /apis/autoscaling/v1/namespaces/default/horizontalpodautoscalers/nginx
  uid: 70d1067d-fb4d-11e9-8b2a-42010a8e013f
spec:
  maxReplicas: 10
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  targetCPUUtilizationPercentage: 50
status:
  currentCPUUtilizationPercentage: 0
  currentReplicas: 3
  desiredReplicas: 3
```

Before following the remaining examples in this page, delete the HPA:
```shell
kubectl delete hpa nginx
```

When you delete a Horizontal Pod Autoscaler, the number of replicas of the Deployment remains the same. A Deployment does not automatically revert to its state before the Horizontal Pod Autoscaler was applied.
You can learn more about deleting a Horizontal Pod Autoscaler.
Autoscaling based on load balancer traffic
Traffic-based autoscaling is a capability of GKE that integrates traffic utilization signals from load balancers to autoscale Pods.
Using traffic as an autoscaling signal might be helpful since traffic is a leading indicator of load that is complementary to CPU and memory. Built-in integration with GKE ensures that the setup is easy and that autoscaling reacts to traffic spikes quickly to meet demand.
Traffic-based autoscaling is enabled by the Gateway controller and its global traffic management capabilities. To learn more, see Traffic-based autoscaling.
Autoscaling based on load balancer traffic is only available for Gateway workloads.
Requirements
Traffic-based autoscaling has the following requirements:
- Supported on GKE versions 1.31 and later.
- Gateway API enabled in your GKE cluster.
- Supported for traffic that goes through load balancers deployed using the Gateway API and either the gke-l7-global-external-managed, gke-l7-regional-external-managed, gke-l7-rilb, or the gke-l7-gxlb GatewayClass.
Limitations
Traffic-based autoscaling has the following limitations:
- Not supported by the multi-cluster GatewayClasses (gke-l7-global-external-managed-mc, gke-l7-regional-external-managed-mc, gke-l7-rilb-mc, and gke-l7-gxlb-mc).
- Not supported for traffic using Services of type LoadBalancer.
- There must be a clear and isolated relationship between the components involved in traffic-based autoscaling. One Horizontal Pod Autoscaler must be dedicated to scaling a single Deployment (or any scalable resource) exposed by a single Service.
- After configuring the capacity of your Service using the maxRatePerEndpoint field, allow sufficient time (usually one minute, but potentially up to 15 minutes in large clusters) for the load balancer to be updated with this change, before configuring the Horizontal Pod Autoscaler with traffic-based metrics. This ensures your Service won't temporarily experience a situation where your cluster tries to autoscale based on metrics emitted by a load balancer still undergoing configuration.
- If traffic-based autoscaling is used on a Service served by multiple load balancers (for example, by both an Ingress and a Gateway, or by two Gateways), the Horizontal Pod Autoscaler might consider the highest traffic value from individual load balancers to make scaling decisions, rather than the sum of traffic values from all load balancers.
Deploy traffic-based autoscaling
The following exercise uses the HorizontalPodAutoscaler to autoscale the store-autoscale Deployment based on the traffic it receives. A Gateway accepts ingress traffic from the internet for the Pods. The autoscaler compares traffic signals from the Gateway with the per-Pod traffic capacity that is configured on the store-autoscale Service resource. By generating traffic to the Gateway, you influence the number of Pods deployed.
The following diagram demonstrates how traffic-based autoscaling works:
To deploy traffic-based autoscaling, perform the following steps:
For Standard clusters, confirm that the GatewayClasses are installed in your cluster. For Autopilot clusters, the GatewayClasses are installed by default.
```shell
kubectl get gatewayclass
```

The output confirms that the GKE GatewayClass resources are ready to use in your cluster:
```
NAME                               CONTROLLER                  ACCEPTED   AGE
gke-l7-global-external-managed     networking.gke.io/gateway   True       16h
gke-l7-regional-external-managed   networking.gke.io/gateway   True       16h
gke-l7-gxlb                        networking.gke.io/gateway   True       16h
gke-l7-rilb                        networking.gke.io/gateway   True       16h
```

If you don't see this output, enable the Gateway API in your GKE cluster.
Deploy the sample application and Gateway load balancer to your cluster:
```shell
kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/gke-networking-recipes/master/gateway/docs/store-autoscale.yaml
```

The sample application creates:
- A Deployment with 2 replicas.
- A Service with an associated GCPBackendPolicy setting maxRatePerEndpoint set to 10. To learn more about Gateway capabilities, see GatewayClass capabilities.
- An external Gateway for accessing the application on the internet. To learn more about how to use Gateway load balancers, see Deploying Gateways.
- An HTTPRoute that matches all traffic and sends it to the store-autoscale Service.
The Service capacity is a critical element when using traffic-based autoscaling because it determines the amount of per-Pod traffic that triggers an autoscaling event. It is configured using a maxRatePerEndpoint field on a GCPBackendPolicy associated with the Service, which defines the maximum traffic a Service should receive in requests per second, per Pod. Service capacity is specific to your application. For more information, see Determining your Service's capacity.
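As a sketch of how that capacity setting might look for the sample's store-autoscale Service in the default namespace (the field layout follows the GKE Gateway API's GCPBackendPolicy resource; treat the exact values here as illustrative assumptions, not part of the sample manifest):

```yaml
apiVersion: networking.gke.io/v1
kind: GCPBackendPolicy
metadata:
  name: store-autoscale
  namespace: default
spec:
  default:
    # Maximum requests per second each Pod (endpoint) should receive.
    # Traffic-based autoscaling compares observed load balancer traffic
    # against this per-Pod capacity.
    maxRatePerEndpoint: 10
  targetRef:
    group: ""
    kind: Service
    name: store-autoscale
```

Remember from the limitations above that after changing this value, the load balancer can take up to 15 minutes in large clusters to pick up the new capacity.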
Save the following manifest as hpa.yaml:

Note: If you previously used the autoscaling.googleapis.com|gclb-capacity-utilization metric name, we recommend that you switch to the autoscaling.googleapis.com|gclb-capacity-fullness metric name instead.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: store-autoscale
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: store-autoscale
  # Set the minimum and maximum number of replicas the Deployment can scale to.
  minReplicas: 1
  maxReplicas: 10
  # This section defines that scaling should be based on the fullness of load balancer
  # capacity, using the following configuration.
  metrics:
  - type: Object
    object:
      describedObject:
        kind: Service
        name: store-autoscale
      metric:
        # The name of the custom metric which measures how "full" a backend is
        # relative to its configured capacity.
        name: "autoscaling.googleapis.com|gclb-capacity-fullness"
      target:
        # The target average value for the metric. The autoscaler adjusts the number
        # of replicas to maintain an average capacity fullness of 70% across all Pods.
        averageValue: 70
        type: AverageValue
```

This manifest describes a HorizontalPodAutoscaler with the following properties:

- minReplicas and maxReplicas: sets the minimum and maximum number of replicas for this Deployment. In this configuration, the number of Pods can scale from 1 to 10 replicas.
- describedObject.name: store-autoscale: the reference to the store-autoscale Service that defines the traffic capacity.
- scaleTargetRef.name: store-autoscale: the reference to the store-autoscale Deployment that defines the resource that is scaled by the Horizontal Pod Autoscaler.
- averageValue: 70: target average value of 70% capacity utilization. This gives the Horizontal Pod Autoscaler a growth margin so that the running Pods can process excess traffic while new Pods are being created.
The Horizontal Pod Autoscaler results in the following traffic behavior:
- The number of Pods is adjusted between 1 and 10 replicas to achieve 70% of the max rate per endpoint. This results in 7 RPS per Pod when maxRatePerEndpoint=10.
- At more than 7 RPS per Pod, Pods are scaled up until they've reached their maximum of 10 replicas or until the average traffic is 7 RPS per Pod.
- If traffic is reduced, Pods scale down to a reasonable rate using the Horizontal Pod Autoscaler algorithm.
You can also deploy a traffic generator to validate traffic-based autoscaling behavior.
At 30 RPS, the Deployment is scaled to 5 replicas so that each replica ideally receives 6 RPS of traffic, which would be 60% utilization per Pod. This is under the 70% target utilization and so the Pods are scaled appropriately. Depending on traffic fluctuations, the number of autoscaled replicas might also fluctuate. For a more detailed description of how the number of replicas is computed, see Autoscaling behavior.
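If scale-down after a traffic spike is too abrupt for your workload, the autoscaling/v2 API also exposes a behavior field that tunes how the algorithm applies its recommendations. The following fragment is a sketch, not part of the exercise above, and the window and rate values are illustrative assumptions:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: store-autoscale
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: store-autoscale
  minReplicas: 1
  maxReplicas: 10
  behavior:
    scaleDown:
      # Only scale down after recommendations have been consistently
      # lower for 5 minutes, smoothing out short traffic dips.
      stabilizationWindowSeconds: 300
      policies:
      # Remove at most one Pod per minute when scaling down.
      - type: Pods
        value: 1
        periodSeconds: 60
```

A longer stabilization window trades slower cost reduction for fewer replica oscillations when traffic fluctuates around the target.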
Autoscaling based on a custom or external metric
To create horizontal Pod autoscalers for custom metrics and external metrics, see Optimize Pod autoscaling based on metrics.
Autoscaling based on multiple metrics
This example creates a Horizontal Pod Autoscaler that autoscales based on CPU utilization and a custom metric named packets_per_second.
If you followed the previous example and still have a Horizontal Pod Autoscaler named nginx, delete it before following this example.
This example requires apiVersion: autoscaling/v2. For more information about the available APIs, see API versions for HorizontalPodAutoscaler objects.
Before you can autoscale based on a custom metric, you must create the custom metric and configure your workload to export the metric to Cloud Monitoring. For this reason, the packets_per_second metric in the following manifest is included for illustration, but commented out. See custom metrics and the Monitoring documentation for creating custom metrics.
Save this YAML manifest as a file named nginx-multiple.yaml:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 1
  maxReplicas: 10
  # The metrics to base the autoscaling on.
  metrics:
  - type: Resource
    resource:
      name: cpu  # Scale based on CPU utilization.
      target:
        type: Utilization
        # The HPA will scale the replicas to try and maintain an average
        # CPU utilization of 50% across all Pods.
        averageUtilization: 50
  - type: Resource
    resource:
      name: memory  # Scale based on memory usage.
      target:
        type: AverageValue
        # The HPA will scale the replicas to try and maintain an average
        # memory usage of 100 Mebibytes (MiB) across all Pods.
        averageValue: 100Mi
  # Uncomment these lines if you create the custom packets_per_second metric and
  # configure your app to export the metric.
  # - type: Pods
  #   pods:
  #     metric:
  #       name: packets_per_second
  #     target:
  #       type: AverageValue
  #       averageValue: 100
```

Apply the YAML manifest:
```shell
kubectl apply -f nginx-multiple.yaml
```

When created, the Horizontal Pod Autoscaler monitors the nginx Deployment for average CPU utilization, average memory utilization, and (if you uncommented it) the custom packets_per_second metric. The Horizontal Pod Autoscaler autoscales the Deployment based on the metric whose value would create the largest autoscale event.
Configure the Performance HPA profile
The Performance HPA profile improves the reaction time of the Horizontal Pod Autoscaler, enabling it to quickly recalculate a large number of HorizontalPodAutoscaler objects (up to 1,000 objects in minor versions 1.31-1.32 and 5,000 objects in version 1.33 or later).
This profile is automatically enabled on qualifying Autopilot clusters with a control plane running GKE version 1.32 or later. For Standard clusters, the profile is automatically enabled on qualifying clusters with a control plane running GKE version 1.33 or later.
A Standard cluster is exempt from auto-enablement of the Performance HPA profile if it meets all of the following conditions:
- The cluster is upgrading from an earlier version to version 1.33 or later.
- The cluster has at least one node pool with any of the following machine types: e2-micro, e2-custom-micro, g1-small, f1-micro.
- Node auto-provisioning is not enabled.
You can also enable the Performance HPA profile on existing clusters if they meet the requirements.
Requirements
To enable the Performance HPA profile, verify that your Autopilot and Standard clusters meet the following requirements:
- Your control plane is running GKE version 1.31 or later.
- If your control plane is running GKE version 1.31, enable system metric collection.
- The Autoscaling API is enabled in your cluster.
- All node Service Accounts have the roles/autoscaling.metricsWriter role assigned.
- If you use VPC Service Controls, verify that the Autoscaling API is included in your service perimeter.
Enable the Performance HPA profile
To enable the Performance HPA profile in your cluster, use the following command:
```shell
gcloud container clusters update CLUSTER_NAME \
    --location=LOCATION \
    --project=PROJECT_ID \
    --hpa-profile=performance
```

Replace the following:

- CLUSTER_NAME: the name of the cluster.
- LOCATION: the Compute Engine zone or region (for example, us-central1-a or us-central1) for the cluster.
- PROJECT_ID: your Google Cloud project ID.
Note: Enabling the Performance HPA profile updates the gke-metrics-agent resource requests, and triggers a simultaneous restart of its Pods. This may cause temporary disruption on resource-constrained nodes due to Pod rescheduling.

Disable the Performance HPA profile
To disable the Performance HPA profile in a cluster, use the following command:
```shell
gcloud container clusters update CLUSTER_NAME \
    --location=LOCATION \
    --project=PROJECT_ID \
    --hpa-profile=none
```

Replace the following:

- CLUSTER_NAME: the name of the cluster.
- LOCATION: the Compute Engine zone or region (for example, us-central1-a or us-central1) for the cluster.
- PROJECT_ID: your Google Cloud project ID.
Viewing details about a Horizontal Pod Autoscaler
To view a Horizontal Pod Autoscaler's configuration and statistics, use the following command:
```shell
kubectl describe hpa HPA_NAME
```

Replace HPA_NAME with the name of your HorizontalPodAutoscaler object.
If the Horizontal Pod Autoscaler uses apiVersion: autoscaling/v2 and is based on multiple metrics, the kubectl describe hpa command only shows the CPU metric. To see all metrics, use the following command instead:
```shell
kubectl describe hpa.v2.autoscaling HPA_NAME
```

Replace HPA_NAME with the name of your HorizontalPodAutoscaler object.
Each Horizontal Pod Autoscaler's current status is shown in the Conditions field, and autoscaling events are listed in the Events field.
If the Performance HPA profile is enabled, the Reason for autoscaling events is listed as HpaProfilePerformance. The output is similar to the following:
```
Name:                nginx
Namespace:           default
Labels:              <none>
Annotations:         kubectl.kubernetes.io/last-applied-configuration:
                       {"apiVersion":"autoscaling/v2","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"nginx","namespace":"default"},"s...
CreationTimestamp:   Tue, 05 May 2020 20:07:11 +0000
Reference:           Deployment/nginx
Metrics:             ( current / target )
  resource memory on pods:                             2220032 / 100Mi
  resource cpu on pods (as a percentage of request):   0% (0) / 50%
Min replicas:        1
Max replicas:        10
Deployment pods:     1 current / 1 desired
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    ReadyForNewScale    recommended size matches current size
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from memory resource
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range
Events:              <none>
```

Deleting a Horizontal Pod Autoscaler
You can delete a Horizontal Pod Autoscaler using the Google Cloud console or the kubectl delete command.
Console
To delete the nginx Horizontal Pod Autoscaler:
Go to the Workloads page in the Google Cloud console.
Click the name of the nginx Deployment.

Click Actions > Autoscale.
Click Delete.
kubectl delete
To delete the nginx Horizontal Pod Autoscaler, use the following command:
```shell
kubectl delete hpa nginx
```

When you delete a Horizontal Pod Autoscaler, the Deployment (or other scalable object) remains at its existing scale, and does not revert to the number of replicas in the Deployment's original manifest. To manually scale the Deployment back to three Pods, you can use the kubectl scale command:
```shell
kubectl scale deployment nginx --replicas=3
```

Cleaning up
Delete the Horizontal Pod Autoscaler, if you have not done so:
```shell
kubectl delete hpa nginx
```

Delete the nginx Deployment:

```shell
kubectl delete deployment nginx
```

Optionally, delete the cluster.
Troubleshooting
For advice on troubleshooting, see Troubleshoot horizontal Pod autoscaling.
What's next
- Learn more about Horizontal Pod Autoscaling.
- Learn more about Vertical Pod Autoscaling.
- Learn how to optimize Pod autoscaling based on metrics.
- Learn more about autoscaling Deployments with Custom Metrics.
- Learn how to Assign CPU Resources to Containers and Pods.
- Learn how to Assign Memory Resources to Containers and Pods.
Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2026-02-18 UTC.