Manually upgrading a cluster or node pool

This document explains how you can manually request an upgrade or downgrade for the control plane or nodes of a Google Kubernetes Engine (GKE) cluster. GKE automatically upgrades the version of the control plane and nodes to ensure that the cluster receives new features, bug fixes, and security patches. But, as explained in this document, you can also perform these upgrades manually.

For more information about how automatic and manual cluster upgrades work, see About GKE cluster upgrades. You can also control when auto-upgrades can and cannot occur by configuring maintenance windows and exclusions.

You can manually upgrade the version of the control plane and of your node pools.

To upgrade a cluster, GKE updates the version that the control plane and nodes run, in separate operations. Clusters are upgraded to either a newer minor version (for example, 1.33 to 1.34) or a newer patch version (for example, 1.33.4-gke.1350000 to 1.33.5-gke.1080000). A cluster's control plane and nodes don't necessarily run the same version at all times. For more information about versions, see GKE versioning and support.
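The version strings above follow a fixed structure, which a small helper can make concrete. This is a hypothetical shell sketch (the gke_minor and gke_patch names are mine, not GKE tooling) that splits a version string such as 1.33.4-gke.1350000 into its minor version and its patch version:

```shell
# Hypothetical helpers, not part of GKE tooling.

# Minor version, for example "1.33" from "1.33.4-gke.1350000"
gke_minor() {
  echo "$1" | cut -d. -f1-2
}

# Patch version without the -gke suffix, for example "1.33.4"
gke_patch() {
  echo "$1" | cut -d- -f1
}

gke_minor "1.33.4-gke.1350000"   # prints 1.33
gke_patch "1.33.5-gke.1080000"   # prints 1.33.5
```

A minor-version upgrade changes the value returned by gke_minor; a patch upgrade changes only the gke_patch part (and the -gke suffix).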


New versions of GKE are announced regularly, and you can receive notice about the new versions available for each specific cluster with cluster notifications. To find specific auto-upgrade targets for clusters, get information about a cluster's upgrades.

For more information about available versions, see Versioning. For more information about clusters, see Cluster architecture. For guidance on upgrading clusters, see Best practices for upgrading clusters.

Before you begin

Before you start, make sure that you have performed the following tasks:

  • Enable the Google Kubernetes Engine API.
  • If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running the gcloud components update command. Earlier gcloud CLI versions might not support running the commands in this document.

Note: For existing gcloud CLI installations, make sure to set the compute/region property. If you primarily use zonal clusters, set compute/zone instead. By setting a default location, you can avoid gcloud CLI errors like the following: One of [--zone, --region] must be supplied: Please specify location. You might need to specify the location in certain commands if the location of your cluster differs from the default that you set.

About upgrading

A cluster's control plane and nodes are upgraded separately. The cluster's control plane and nodes don't necessarily run the same version at all times.

Cluster control planes and nodes are upgraded on a regular basis, regardless of whether your cluster is enrolled in a release channel.

Limitations

Alpha clusters can't be upgraded.

Supported versions

The release notes announce when new versions become available and when earlier versions are no longer available. At any time, you can list all supported cluster and node versions by using this command:

gcloud container get-server-config \
    --location=CONTROL_PLANE_LOCATION

Replace CONTROL_PLANE_LOCATION with the location (region or zone) for the control plane, such as us-central1 or us-central1-a.

If your cluster is enrolled in a release channel, you can upgrade to a patch version in a different release channel with the same minor version as your control plane. For example, you can upgrade your cluster from version 1.33.4-gke.1350000 in the Regular channel to 1.33.5-gke.1162000 in the Rapid channel. For more information, refer to Running patch versions from a newer channel. All Autopilot clusters are enrolled in release channels.

About downgrading

You can downgrade the version of your cluster to an earlier version in certain scenarios, such as downgrading the control plane to an earlier patch version or downgrading node pools, as described later in this document.

Other than those scenarios, you can't downgrade a cluster. You can't downgrade a cluster control plane to a previous minor version, including after a one-step control plane minor upgrade. For example, if your control plane runs GKE version 1.34, you can't downgrade to 1.33. If you attempt to do this, the following error message appears:

ERROR: (gcloud.container.clusters.upgrade) ResponseError: code=400, message=Master cannot be upgraded to "1.33.4-gke.1350000": specified version is not newer than the current version.
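The "not newer than the current version" check can be sketched locally. The following hedged shell example (the version_newer helper is hypothetical, and it compares only the open-source part of the version, ignoring the -gke suffix) predicts whether a requested control plane version would be rejected:

```shell
# Hypothetical helper: succeed if TARGET is strictly newer than
# CURRENT, comparing only the open-source version component
# (the -gke.NNN suffix is ignored in this sketch).
version_newer() {
  current=${1%%-*}
  target=${2%%-*}
  [ "$current" != "$target" ] &&
    [ "$(printf '%s\n%s\n' "$current" "$target" | sort -V | tail -n 1)" = "$target" ]
}

# A patch upgrade within the same minor version is accepted:
version_newer 1.33.4-gke.1350000 1.33.5-gke.1080000 && echo "upgrade allowed"

# A minor-version downgrade (1.34 -> 1.33) would be rejected:
version_newer 1.34.0-gke.1 1.33.4-gke.1350000 || echo "not newer than current"
```

This relies on sort -V for version-aware ordering; GKE performs its own server-side validation, so treat this only as a way to reason about the error above.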

We recommend that you test and qualify minor version upgrades with clusters in a testing environment when a new minor version becomes available, but before the version becomes the auto-upgrade target for your cluster. This is especially recommended if your cluster might be affected by significant changes in the next minor version, such as deprecated APIs or features being removed. For more information about version availability, see What versions are available in a channel.

Upgrade the cluster's control plane

Note: You can't upgrade your cluster's control plane more than one minor version at a time. For example, you can upgrade a control plane from version 1.33 to 1.34, but not directly from 1.32 to 1.34. For more information, refer to Can I skip versions during a cluster upgrade?.

GKE upgrades cluster control planes and nodes automatically. To manage how GKE upgrades your clusters, see Control cluster upgrades.

With Autopilot clusters and regional Standard clusters, the control plane remains available during control plane upgrades. However, when you initiate a control plane upgrade for zonal clusters, you can't modify the cluster's configuration until the control plane becomes accessible again, typically within a few minutes. Control plane upgrades don't affect the availability of the worker nodes that your workloads run on.

As part of managing the versions of your cluster, you can initiate a manual upgrade any time after a new version becomes available, using one of the following methods:

  • One-step upgrade: upgrade your control plane directly to a later minor version or patch version as quickly as possible. You can use this approach if you've already validated your cluster and workload performance on the new minor version.
  • Two-step control plane minor upgrade with rollback safety (Preview): upgrade your control plane to a later minor version using a two-step process where you can validate the new minor version for a period of soak time, and roll back if needed. This upgrade method is only available for upgrading to 1.33 or later, for manual minor control plane upgrades.

Manually upgrade the control plane with a one-step upgrade

You can manually upgrade your Autopilot or Standard control plane using the Google Cloud console or the Google Cloud CLI.

Console

To manually upgrade your cluster's control plane, perform the following steps:

  1. Go to the Google Kubernetes Engine page in the Google Cloud console.

    Go to Google Kubernetes Engine

  2. Click the name of the cluster.

  3. Under Cluster basics, click Upgrade Available next to Version.

  4. Select the new version, then click Save Changes.

gcloud

To see the available versions for your cluster's control plane, run the following command:

gcloud container get-server-config \
    --location=CONTROL_PLANE_LOCATION

To upgrade to the default cluster version, run the following command:

gcloud container clusters upgrade CLUSTER_NAME \
    --master \
    --location=CONTROL_PLANE_LOCATION

To upgrade to a specific version that isn't the default, specify the --cluster-version flag, as in the following command:

gcloud container clusters upgrade CLUSTER_NAME \
    --master \
    --location=CONTROL_PLANE_LOCATION \
    --cluster-version=VERSION

Replace VERSION with the version that you want to upgrade your cluster to. You can use a specific version, such as 1.32.9-gke.1072000, or a version alias, like latest. For more information, see Specifying cluster version.

Note: If your cluster is enrolled in a release channel, VERSION must be a valid minor version for the release channel or a valid patch version in a newer release channel. All Autopilot clusters are enrolled in a release channel.

After upgrading a Standard control plane, you can upgrade its nodes. By default, Standard nodes created using the Google Cloud console have auto-upgrade enabled, so this happens automatically. Autopilot always upgrades nodes automatically.

Two-step control plane minor upgrade with rollback safety

Preview

This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA products and features are available "as is" and might have limited support. For more information, see the launch stage descriptions.

You can manually upgrade the control plane of your GKE Autopilot or Standard cluster to the next minor version with a two-step upgrade. In this two-step process, you can test how your cluster performs with the new minor version, known as the binary version, while using the features and APIs from the previous minor version, known as the emulated version. During this soak time, when the control plane runs in what's known as emulated mode, you can roll back to the previous minor version if necessary. For more information about how Kubernetes allows for this type of upgrade, see Compatibility Version For Kubernetes Control Plane Components.

Two-step upgrades work in the following way:

  1. Binary upgrade: GKE upgrades the control plane binary to the new minor version, but emulates the previous minor version:

    • Emulates previous version: the cluster runs the new binary, but continues to emulate the behavior of the previous minor version API. For example, you can call APIs that are removed in the new minor version but are still available in the previous minor version.
    • Test new binary: you can test the new binaries for regressions, fixes, and performance changes before you make the Kubernetes features of the new minor version accessible. Monitor application metrics, logs, Pod statuses, error rates, and latency.
    • Soak the changes: wait for six hours to seven days to give yourself time to test and monitor. After this time, GKE performs the emulated version upgrade.
    • Roll back or complete the upgrade: you can roll back if needed. Or, you can advance to the next stage if you're confident in the new minor version, don't want to wait for the soak time to complete, and are ready to start using the new features and API changes.
  2. Emulated version upgrade: GKE updates the emulated version to match the new binary version.

    • Enables new features: all new features and API changes of the new minor version are enabled.
    • No rollback: after this step occurs, you can't roll back to the original minor version. The upgrade is complete.

During this operation, the following limitations apply:

  • You can't initiate a one-step control plane minor upgrade.
  • You can't create or upgrade nodes to a version that is later than the emulated version.
  • GKE doesn't perform any type of automatic upgrades to the control plane or nodes.

Start a two-step upgrade

Start a two-step upgrade by running the following command:

gcloud beta container clusters upgrade CLUSTER_NAME \
    --location=CONTROL_PLANE_LOCATION \
    --cluster-version VERSION \
    --control-plane-soak-duration SOAK_DURATION \
    --master

Replace the following:

  • CLUSTER_NAME: the name of the cluster.
  • CONTROL_PLANE_LOCATION: the location (region or zone) for the control plane, such as us-central1 or us-central1-a.
  • VERSION: a specific patch of the next minor version. For example, if your cluster runs 1.33, 1.34.1-gke.1829001.
  • SOAK_DURATION: the time to wait in the rollback-safe stage. You can set this value from a minimum of 6 hours to a maximum of 7 days using the absolute duration formats explained in the reference for gcloud topic datetimes. For example, use 2d1h for a soak time of two days and one hour.
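The duration bounds above can be sanity-checked before you run the command. This is an illustrative shell sketch (the soak_hours helper and its simple day/hour parsing are mine; gcloud performs its own validation of the full absolute-duration format):

```shell
# Hypothetical helper: convert a duration such as "2d1h", "6h",
# or "7d" into total hours, then check the 6-hour-to-7-day window.
soak_hours() {
  dur=$1
  d=0
  h=0
  case "$dur" in
    *d*h) d=${dur%%d*}; rest=${dur#*d}; h=${rest%h} ;;
    *d)   d=${dur%d} ;;
    *h)   h=${dur%h} ;;
  esac
  echo $(( d * 24 + h ))
}

hours=$(soak_hours 2d1h)   # 2 days + 1 hour = 49 hours
if [ "$hours" -ge 6 ] && [ "$hours" -le 168 ]; then
  echo "valid soak duration"
fi
```

Here 168 hours is the 7-day maximum; a value like 5h would fall below the 6-hour minimum and be rejected.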

Test the new binary during a two-step upgrade

During the soak time, validate that your cluster, with the control plane running the new binary, and your workloads perform as expected. Do one of the following steps, depending on whether you can verify that the workloads are compatible with the new binary:

  • Roll back: if you observe an issue with your workloads running on the new binary, you can roll back to the previous minor version.
  • Complete the upgrade: if you have verified that your workloads run without issues on the new binary, you can complete the upgrade to start using the features and APIs of the new version.
  • Wait: you can also wait for the soak time to elapse. Afterward, GKE performs the emulated version upgrade, where it transitions to using the features and APIs of the new minor version.

Observe the in-progress upgrade

To get information about an in-progress upgrade, get information about the cluster's upgrades.

Roll back a two-step upgrade after the binary version upgrade

During a two-step upgrade, the soaking period follows the binary version upgrade. During this period, you can roll back to the previous minor version if necessary. You can't roll back after GKE performs the emulated version upgrade.

After the rollback operation completes, your control plane runs the previous minor version as it did before you initiated the two-step upgrade.

Do the following steps to roll back, if possible:

  1. Check that you can still roll the control plane back to the previous minor version by running the gcloud CLI command in Get upgrade information at the cluster level. Determine whether you can roll back from the output of the command:

    • You can roll back if there is a rollbackSafeUpgradeStatus section in the output. In that section, save the previousVersion value to use for the VERSION variable in the next step. Proceed to the next step.
    • You can't roll back if there is no rollbackSafeUpgradeStatus section. This indicates that GKE already performed the emulated version upgrade. You can't perform the next step.
  2. If the previous step determined that rollback is possible, roll back to the previous version:

    gcloud container clusters upgrade CLUSTER_NAME \
        --location=CONTROL_PLANE_LOCATION \
        --cluster-version VERSION \
        --master

    VERSION must be the exact patch version previously used. You saved this version in the previous step.

After you run this command and downgrade to the previous version, you can determine why your workload didn't run correctly on the new binary. If needed, you can reach out to Cloud Customer Care, providing relevant logs, error messages, and details about the validation failure that you encountered. For more information, see Get support.

After you've resolved the issue, you can manually upgrade again to the new minor version.

Complete the two-step upgrade

During the soaking period, if you've verified that the workloads run successfully with the new binary, you can skip the rest of the soak time:

gcloud beta container clusters complete-control-plane-upgrade CLUSTER_NAME \
    --location=CONTROL_PLANE_LOCATION

After you run this command, you can no longer downgrade to the previous minor version.

Downgrade the control plane to an earlier patch version

Note: Before attempting to downgrade a cluster, ensure that you're already familiar with the limitations.

  1. Set a maintenance exclusion before downgrading to prevent GKE from automatically upgrading the control plane after you downgrade it.
  2. Downgrade the cluster control plane to an earlier patch version:

    gcloud container clusters upgrade CLUSTER_NAME \
        --master \
        --location=CONTROL_PLANE_LOCATION \
        --cluster-version=VERSION

Note: If your cluster is subscribed to a release channel, VERSION must be an available patch version for the release channel. All Autopilot clusters are enrolled in a release channel.

Disabling cluster auto-upgrades

Infrastructure security is a high priority for GKE, and as such, control planes are upgraded on a regular basis; these upgrades can't be disabled. However, you can apply maintenance windows and exclusions to temporarily suspend upgrades for control planes and nodes.

Although it is not recommended, you can disable node auto-upgrade for Standard node pools.

Check recent control plane upgrade history

For a snapshot of a cluster's recent auto-upgrade history, get information about a cluster's upgrades.

Alternatively, you can list recent operations to see when the control plane was upgraded:

gcloud container operations list \
    --filter="TYPE:UPGRADE_MASTER AND TARGET:CLUSTER_NAME" \
    --location=CONTROL_PLANE_LOCATION

Upgrade node pools

By default, Standard node pools have auto-upgrade enabled, and all Autopilot-managed node pools in Standard clusters always have auto-upgrade enabled. Node auto-upgrades ensure that your cluster's control plane and node versions remain in sync and in compliance with the Kubernetes version skew policy, which ensures that control planes are compatible with nodes up to two minor versions earlier than the control plane. For example, Kubernetes 1.34 control planes are compatible with Kubernetes 1.32 nodes.
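The two-minor-version skew rule above can be expressed as a small check. This is a hedged sketch (the skew_ok helper is hypothetical and assumes both versions share major version 1, as all current GKE versions do):

```shell
# Hypothetical check: a node minor version is within the supported
# skew if it is no more than two minor versions behind the control
# plane, and never ahead of it. Assumes "1.NN" version strings.
skew_ok() {
  cp_minor=${1#*.}
  node_minor=${2#*.}
  diff=$(( cp_minor - node_minor ))
  [ "$diff" -ge 0 ] && [ "$diff" -le 2 ]
}

skew_ok 1.34 1.32 && echo "1.32 nodes are within skew of a 1.34 control plane"
```

By this rule, 1.31 nodes would fall outside the skew of a 1.34 control plane, which is why auto-upgrades keep node versions from lagging too far behind.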

Best practice:

Avoid disabling node auto-upgrades for Standard node pools so that your cluster benefits from the upgrades described in the preceding paragraph.

With GKE Standard node pool upgrades, you can choose between three configurable upgrade strategies: surge upgrades, blue-green upgrades, and autoscaled blue-green upgrades (Preview). Autopilot-managed node pools in Standard clusters always use surge upgrades.

For Standard node pools, choose a strategy and use the parameters to tune the strategy to best fit your cluster environment's needs.

How node upgrades work

While a node is being upgraded, GKE stops scheduling new Pods onto it, and attempts to schedule its running Pods onto other nodes. This is similar to other events that re-create the node, such as enabling or disabling a feature on the node pool.

During automatic or manual node upgrades, PodDisruptionBudgets (PDBs) and the Pod termination grace period are respected for a maximum of 1 hour. If Pods running on the node can't be scheduled onto new nodes after one hour, GKE initiates the upgrade anyway. This behavior applies even if you configure your PDBs to always have all of your replicas available by setting the maxUnavailable field to 0 or 0%, or by setting the minAvailable field to 100% or to the number of replicas. In all of these scenarios, GKE deletes the Pods after one hour so that the node deletion can happen.

Best practice:

If a workload running in a Standard node pool requires more flexibility with graceful termination, use blue-green upgrades, which provide settings for additional soak time to extend PDB checks beyond the one-hour default.

To learn more about what to expect during node termination in general, see the topic about Pods.

The upgrade is only complete when all nodes have been re-created and the cluster is in the new state. When a newly upgraded node registers with the control plane, GKE marks the node as schedulable.

New node instances run the new Kubernetes version.

For a node pool upgrade to be considered complete, all nodes in the node pool must be re-created. If an upgrade started but didn't complete and is in a partially upgraded state, the node pool version might not reflect the version of all of the nodes. To learn more, see Some node versions don't match the node pool version after an incomplete node pool upgrade. To determine that the node pool upgrade finished, check the node pool upgrade status. If the upgrade operation is beyond the retention period, then check that each individual node version matches the node pool version.

Save your data to persistent disks before upgrading

Before upgrading a node pool, you must ensure that any data you need to keep is stored in a Pod by using persistent volumes, which use persistent disks. Persistent disks are unmounted, rather than erased, during upgrades, and their data is transferred between Pods.

The following restrictions pertain to persistent disks:

  • The nodes on which Pods are running must be Compute Engine VMs.
  • Those VMs need to be in the same Compute Engine project and zone as the persistent disk.

To learn how to add a persistent disk to an existing node instance, see Adding or resizing zonal persistent disks in the Compute Engine documentation.

Manually upgrade a node pool

You can manually upgrade the version of a Standard node pool or an Autopilot-managed node pool in a Standard cluster. You can match the version of the control plane, or use a previous version that is still available and compatible with the control plane. You can manually upgrade multiple node pools in parallel, whereas GKE automatically upgrades only one node pool at a time.

When you manually upgrade a node pool, GKE removes any labels you added to individual nodes using kubectl. To avoid this, apply labels to node pools instead.

Before you manually upgrade your node pool, consider the following conditions:

  • Upgrading a node pool might disrupt workloads running in that node pool. To avoid this, you can create a new node pool with the required version and migrate the workload. After migration, you can delete the old node pool.
  • If you upgrade a node pool with an Ingress in an errored state, the instance group doesn't sync. To work around this issue, first check the status using the kubectl get ing command. If the instance group is not synced, re-apply the manifest used to create the Ingress.

You can manually upgrade your node pools to a version compatible with the control plane:

  • For Standard node pools, you can use the Google Cloud console or the Google Cloud CLI.
  • For Autopilot-managed node pools, you can only use the Google Cloud CLI.

Console

To upgrade a Standard node pool using the Google Cloud console, perform the following steps:

  1. Go to the Google Kubernetes Engine page in the Google Cloud console.

    Go to Google Kubernetes Engine

  2. Click the name of the cluster.

  3. On the Cluster details page, click the Nodes tab.

  4. In the Node Pools section, click the name of the node pool that you want to upgrade.

  5. Click Edit.

  6. Click Change under Node version.

  7. Select the required version from the Node version drop-down list, then click Change.

It may take several minutes for the node version to change.

gcloud

The following variables are used in the commands in this section:

  • CLUSTER_NAME: the name of the cluster that contains the node pool to be upgraded.
  • NODE_POOL_NAME: the name of the node pool to be upgraded.
  • CONTROL_PLANE_LOCATION: the location (region or zone) for the control plane, such as us-central1 or us-central1-a.
  • VERSION: the Kubernetes version to which the nodes are upgraded. For example, --cluster-version=1.34.1-gke.1293000 or --cluster-version=latest.

Upgrade a node pool:

gcloud container clusters upgrade CLUSTER_NAME \
    --node-pool=NODE_POOL_NAME \
    --location=CONTROL_PLANE_LOCATION

To specify a different version of GKE on nodes, use the optional --cluster-version flag:

gcloud container clusters upgrade CLUSTER_NAME \
    --node-pool=NODE_POOL_NAME \
    --location=CONTROL_PLANE_LOCATION \
    --cluster-version VERSION

For more information about specifying versions, see Versioning.

For more information, refer to the gcloud container clusters upgrade documentation.

Downgrade node pools

You can downgrade a node pool, for example, to mitigate an unsuccessful node pool upgrade. Review the limitations before downgrading a node pool.

Best practice:

Use the blue-green node upgrade strategy if you need to optimize for risk mitigation for node pool upgrades that impact your workloads. With this strategy, you can roll back an in-progress upgrade to the original nodes if the upgrade is unsuccessful.

  1. Set a maintenance exclusion for the cluster to prevent the node pool from being automatically upgraded by GKE after being downgraded.
  2. To downgrade a node pool, specify an earlier version while following the instructions to Manually upgrade a node pool.

Change surge upgrade parameters

For more information about changing surge upgrade parameters, see Configure surge upgrades.

Check node pool upgrade status

You can check the status of an upgrade using gcloud container operations.

View a list of every running and completed operation in the cluster from the last 12 days if there are fewer than 5,000 operations, or the last 5,000 operations:

gcloud container operations list \
    --location=CONTROL_PLANE_LOCATION

Each operation is assigned an operation ID and an operation type, as well as start and end times, a target cluster, and a status. The list appears similar to the following example:

NAME                              TYPE             ZONE        TARGET      STATUS_MESSAGE  STATUS  START_TIME                      END_TIME
operation-1505407677851-8039e369  CREATE_CLUSTER   us-west1-a  my-cluster                  DONE    20xx-xx-xxT16:47:57.851933021Z  20xx-xx-xxT16:50:52.898305883Z
operation-1505500805136-e7c64af4  UPGRADE_CLUSTER  us-west1-a  my-cluster                  DONE    20xx-xx-xxT18:40:05.136739989Z  20xx-xx-xxT18:41:09.321483832Z
operation-1505500913918-5802c989  DELETE_CLUSTER   us-west1-a  my-cluster                  DONE    20xx-xx-xxT18:41:53.918825764Z  20xx-xx-xxT18:43:48.639506814Z

To get more information about a specific operation, specify the operation ID as shown in the following command:

gcloud container operations describe OPERATION_ID \
    --location=CONTROL_PLANE_LOCATION

For example:

gcloud container operations describe operation-1507325726639-981f0ed6
endTime: '20xx-xx-xxT21:40:05.324124385Z'
name: operation-1507325726639-981f0ed6
operationType: UPGRADE_CLUSTER
selfLink: https://container.googleapis.com/v1/projects/.../zones/us-central1-a/operations/operation-1507325726639-981f0ed6
startTime: '20xx-xx-xxT21:35:26.639453776Z'
status: DONE
targetLink: https://container.googleapis.com/v1/projects/.../zones/us-central1-a/clusters/...
zone: us-central1-a

If the upgrade was canceled or failed and is partially completed, you can resume or roll back the upgrade.

Check node pool upgrade settings

You can see details about the node upgrade strategy being used for your node pools using the gcloud container node-pools describe command. For blue-green upgrades, the command also returns the current phase of the upgrade.

Run the following command:

gcloud container node-pools describe NODE_POOL_NAME \
    --cluster=CLUSTER_NAME \
    --location=CONTROL_PLANE_LOCATION

Replace the following:

  • NODE_POOL_NAME: the name of the node pool to describe.
  • CLUSTER_NAME: the name of the cluster of the node pool to describe.
  • CONTROL_PLANE_LOCATION: the location (region or zone) for the control plane, such as us-central1 or us-central1-a.

This command outputs the current upgrade settings. The following example shows the output if you are using the blue-green upgrade strategy:

upgradeSettings:
  blueGreenSettings:
    nodePoolSoakDuration: 1800s
    standardRolloutPolicy:
      batchNodeCount: 1
      batchSoakDuration: 10s
  strategy: BLUE_GREEN

If you are using the blue-green upgrade strategy, the output also includes details about the blue-green upgrade settings and the current intermediate phase. The following example shows what this might look like:

updateInfo:
  blueGreenInfo:
    blueInstanceGroupUrls:
    - https://www.googleapis.com/compute/v1/projects/{PROJECT_ID}/zones/{LOCATION}/instanceGroupManagers/{BLUE_INSTANCE_GROUP_NAME}
    bluePoolDeletionStartTime: {BLUE_POOL_DELETION_TIME}
    greenInstanceGroupUrls:
    - https://www.googleapis.com/compute/v1/projects/{PROJECT_ID}/zones/{LOCATION}/instanceGroupManagers/{GREEN_INSTANCE_GROUP_NAME}
    greenPoolVersion: {GREEN_POOL_VERSION}
    phase: DRAINING_BLUE_POOL

Cancel a node pool upgrade

You can cancel an upgrade at any time. To learn more about what happens when you cancel a surge upgrade, see Cancel a surge upgrade. To learn more about what happens when you cancel a blue-green upgrade, see Cancel a blue-green upgrade.

  1. Get the upgrade's operation ID:

    gcloud container operations list \
        --location=CONTROL_PLANE_LOCATION
  2. Cancel the upgrade:

    gcloud container operations cancel OPERATION_ID \
        --location=CONTROL_PLANE_LOCATION

Refer to the gcloud container operations cancel documentation.

Resume a node pool upgrade

You can resume an upgrade by manually initiating the upgrade again, specifying the target version from the original upgrade.

If, for example, an upgrade failed, or if you paused an ongoing upgrade, you can resume the canceled upgrade by starting the same upgrade again on the node pool, specifying the target version from the initial upgrade operation.

To learn more about what happens when you resume an upgrade, see Resume a surge upgrade and Resume a blue-green upgrade.

To resume an upgrade, use the following command:

gcloud container clusters upgrade CLUSTER_NAME \
    --node-pool=NODE_POOL_NAME \
    --location=CONTROL_PLANE_LOCATION \
    --cluster-version VERSION

Replace the following:

  • NODE_POOL_NAME: the name of the node pool for which you want to resume the upgrade.
  • CLUSTER_NAME: the name of the cluster that contains the node pool.
  • CONTROL_PLANE_LOCATION: the location (region or zone) for the control plane, such as us-central1 or us-central1-a.
  • VERSION: the target version of the canceled node pool upgrade.

For more information, refer to the gcloud container clusters upgrade documentation.

Roll back a node pool upgrade

You can roll back a node pool to downgrade the upgraded nodes to their original state from before the node pool upgrade started.

Use the rollback command if an in-progress upgrade was canceled, the upgrade failed, or the upgrade is incomplete due to a maintenance window timing out. Alternatively, if you want to specify the version, follow the instructions to downgrade the node pool.

Note: You can't roll back node pools after they have been successfully upgraded. You must downgrade the node pool if you need the nodes to be on the previous version.

To learn more about what happens when you roll back a node pool upgrade, see Roll back a surge upgrade or Roll back a blue-green upgrade.

To roll back an upgrade, run the following command:

gcloud container node-pools rollback NODE_POOL_NAME \
    --cluster CLUSTER_NAME \
    --location=CONTROL_PLANE_LOCATION

Replace the following:

  • NODE_POOL_NAME: the name of the node pool for which to roll back the upgrade.
  • CLUSTER_NAME: the name of the cluster that contains the node pool.
  • CONTROL_PLANE_LOCATION: the location (region or zone) for the control plane, such as us-central1 or us-central1-a.

Refer to the gcloud container node-pools rollback documentation.

Complete a node pool upgrade

Warning: Using the complete-upgrade command is only possible with blue-green upgrades.

If you are using the blue-green upgrade strategy, you can complete a node pool upgrade during the Soak phase, skipping the rest of the soak time.

To learn how completing a node pool upgrade works, see Complete a node pool upgrade.

To complete an upgrade when using the blue-green upgrade strategy, run the following command:

gcloud container node-pools complete-upgrade NODE_POOL_NAME \
    --cluster CLUSTER_NAME \
    --location=CONTROL_PLANE_LOCATION

Replace the following:

  • NODE_POOL_NAME: the name of the node pool for which you want to complete the upgrade.
  • CLUSTER_NAME: the name of the cluster that contains the node pool.
  • CONTROL_PLANE_LOCATION: the location (region or zone) for the control plane, such as us-central1 or us-central1-a.

Refer to the gcloud container node-pools complete-upgrade documentation.

Known issues

If you have PodDisruptionBudget objects configured that are unable to allow any additional disruptions, node upgrades might fail to upgrade to the control plane version after repeated attempts. To prevent this failure, we recommend that you scale up the Deployment or HorizontalPodAutoscaler to allow the node to drain while still respecting the PodDisruptionBudget configuration.

To see all PodDisruptionBudget objects that don't allow any disruptions, run the following command:

kubectl get poddisruptionbudget --all-namespaces -o jsonpath='{range .items[?(@.status.disruptionsAllowed==0)]}{.metadata.name}/{.metadata.namespace}{"\n"}{end}'

Although automatic upgrades might encounter the issue, the automatic upgrade process forces the nodes to upgrade. However, the upgrade takes an extra hour for every node in the istio-system namespace that violates the PodDisruptionBudget.

Troubleshooting

For information about troubleshooting, see Troubleshoot cluster upgrades.

What's next


Last updated 2025-12-17 UTC.