Run full-stack workloads at scale on GKE

This tutorial provides instructions for working with Helm version 3.9.3, PostgreSQL version 10.0.1, and the Locust load testing tool. The instructions might not represent newer versions of the app. For more information, refer to the app-specific documentation.

This tutorial shows you how to run a web application that is backed by a highly available relational database at scale in Google Kubernetes Engine (GKE).

The sample application used in this tutorial is Bank of Anthos, an HTTP-based web application that simulates a bank's payment processing network. Bank of Anthos uses multiple services to function. This tutorial focuses on the website frontend and the relational PostgreSQL database that backs the Bank of Anthos services. To learn more about Bank of Anthos, including its architecture and the services it deploys, refer to Bank of Anthos on GitHub.

Objectives

  • Create and configure a GKE cluster.
  • Deploy a sample web application and a highly-available PostgreSQL database.
  • Configure autoscaling of the web application and the database.
  • Simulate spikes in traffic using a load generator.
  • Observe how the services scale up and down.

Costs

In this document, you use the following billable components of Google Cloud:

  • GKE

To generate a cost estimate based on your projected usage, use the pricing calculator.

New Google Cloud users might be eligible for a free trial.

When you finish the tasks that are described in this document, you can avoid continued billing by deleting the resources that you created. For more information, see Clean up.

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. Install the Google Cloud CLI.

  3. If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

  4. To initialize the gcloud CLI, run the following command:

    gcloud init
  5. Create or select a Google Cloud project.

    Roles required to select or create a project

    • Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
    • Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
    Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.
    • Create a Google Cloud project:

      gcloud projects create PROJECT_ID

      Replace PROJECT_ID with a name for the Google Cloud project you are creating.

    • Select the Google Cloud project that you created:

      gcloud config set project PROJECT_ID

      Replace PROJECT_ID with your Google Cloud project name.

  6. Verify that billing is enabled for your Google Cloud project.

  7. Enable the GKE API:

    Roles required to enable APIs

    To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

    gcloud services enable container.googleapis.com
  8. Install the Helm CLI.

Prepare the environment

  1. Clone the sample repository used in this tutorial:

    git clone https://github.com/GoogleCloudPlatform/bank-of-anthos.git
    cd bank-of-anthos/
  2. Set environment variables:

    PROJECT_ID=PROJECT_ID
    GSA_NAME=bank-of-anthos
    GSA_EMAIL=bank-of-anthos@${PROJECT_ID}.iam.gserviceaccount.com
    KSA_NAME=default

    Replace PROJECT_ID with your Google Cloud project ID.
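
    If you already set a default project with gcloud config set project, you can populate PROJECT_ID from your gcloud configuration instead of typing it. This is a minimal sketch; adjust it to your setup:

    PROJECT_ID=$(gcloud config get-value project)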

Set up the cluster and service accounts

  1. Create a cluster:

    gcloud container clusters create-auto bank-of-anthos --location=us-central1

    The cluster might take up to five minutes to start.
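
    To confirm that the cluster is ready before you continue, you can check its status with the same name and location used above:

    gcloud container clusters describe bank-of-anthos \
        --location=us-central1 \
        --format='value(status)'

    The command prints RUNNING when the cluster is ready.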

  2. Create an IAM service account:

    gcloud iam service-accounts create bank-of-anthos
  3. Grant access to the IAM service account:

    gcloud projects add-iam-policy-binding PROJECT_ID \
        --role roles/cloudtrace.agent \
        --member "serviceAccount:bank-of-anthos@PROJECT_ID.iam.gserviceaccount.com"
    gcloud projects add-iam-policy-binding PROJECT_ID \
        --role roles/monitoring.metricWriter \
        --member "serviceAccount:bank-of-anthos@PROJECT_ID.iam.gserviceaccount.com"
    gcloud iam service-accounts add-iam-policy-binding "bank-of-anthos@PROJECT_ID.iam.gserviceaccount.com" \
        --role roles/iam.workloadIdentityUser \
        --member "serviceAccount:PROJECT_ID.svc.id.goog[default/default]"

    This step grants the following access:

    • roles/cloudtrace.agent: Write trace data such as latency information to Trace.
    • roles/monitoring.metricWriter: Write metrics to Cloud Monitoring.
    • roles/iam.workloadIdentityUser: Allow a Kubernetes service account to use Workload Identity Federation for GKE to act as the IAM service account.
  4. Configure the default Kubernetes service account in the default namespace to act as the IAM service account that you created:

    kubectl annotate serviceaccount default \
        iam.gke.io/gcp-service-account=bank-of-anthos@PROJECT_ID.iam.gserviceaccount.com

    This allows Pods that use the default Kubernetes service account in the default namespace to access the same Google Cloud resources as the IAM service account.
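
    To verify that the annotation was applied, you can read it back. This is an optional check, not part of the original steps:

    kubectl get serviceaccount default \
        -o=jsonpath='{.metadata.annotations.iam\.gke\.io/gcp-service-account}'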

Deploy Bank of Anthos and PostgreSQL

In this section, you install Bank of Anthos and a PostgreSQL database in highly-available (HA) mode, which lets you autoscale replicas of the database server. If you want to view the scripts, Helm chart, and Kubernetes manifests used in this section, check the Bank of Anthos repository on GitHub.

  1. Deploy the database schema and a data definition language (DDL) script:

    kubectl create configmap initdb \
        --from-file=src/accounts/accounts-db/initdb/0-accounts-schema.sql \
        --from-file=src/accounts/accounts-db/initdb/1-load-testdata.sql \
        --from-file=src/ledger/ledger-db/initdb/0_init_tables.sql \
        --from-file=src/ledger/ledger-db/initdb/1_create_transactions.sh
  2. Install PostgreSQL using the sample Helm chart:

    helm repo add bitnami https://charts.bitnami.com/bitnami
    helm install accounts-db bitnami/postgresql-ha \
        --version 10.0.1 \
        --values extras/postgres-hpa/helm-postgres-ha/values.yaml \
        --set="postgresql.initdbScriptsCM=initdb" \
        --set="postgresql.replicaCount=1" \
        --wait

    This command creates a PostgreSQL cluster with a starting replica count of 1. Later in this tutorial, you'll scale the cluster based on incoming connections. This operation might take ten minutes or more to complete.

  3. Deploy Bank of Anthos:

    kubectl apply -f extras/jwt/jwt-secret.yaml
    kubectl apply -f extras/postgres-hpa/kubernetes-manifests

    This operation might take a few minutes to complete.
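
    Instead of polling manually, you can block until all Pods report ready. This is a minimal sketch; the timeout value is an arbitrary choice:

    kubectl wait --for=condition=Ready pods --all --timeout=600s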

Checkpoint: Validate your setup

  1. Check that all Bank of Anthos Pods are running:

    kubectl get pods

    The output is similar to the following:

    NAME                                  READY   STATUS
    accounts-db-pgpool-57ffc9d685-c7xs8   3/3     Running
    accounts-db-postgresql-0              1/1     Running
    balancereader-57b59769f8-xvp5k        1/1     Running
    contacts-54f59bb669-mgsqc             1/1     Running
    frontend-6f7fdc5b65-h48rs             1/1     Running
    ledgerwriter-cd74db4cd-jdqql          1/1     Running
    pgpool-operator-5f678457cd-cwbhs      1/1     Running
    transactionhistory-5b9b56b5c6-sz9qz   1/1     Running
    userservice-f45b46b49-fj7vm           1/1     Running
  2. Check that you can access the website frontend:

    1. Get the external IP address of the frontend service:

      kubectl get ingress frontend

      The output is similar to the following:

      NAME       CLASS    HOSTS   ADDRESS         PORTS   AGE
      frontend   <none>   *       203.0.113.9     80      12m
    2. In a browser, go to the external IP address. The Bank of Anthos sign-in page displays. If you're curious, explore the application.

      If you get a 404 error, wait a few minutes for the microservices to provision and try again.
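
      Alternatively, you can check the frontend from the command line. This sketch assumes the ingress has finished provisioning; a 200 status code indicates that the frontend is serving traffic:

      curl -s -o /dev/null -w '%{http_code}\n' "http://$(kubectl get ingress frontend -o=jsonpath='{.status.loadBalancer.ingress[0].ip}')"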

Autoscale the web app and PostgreSQL database

GKE Autopilot autoscales the cluster compute resources based on the number of workloads in the cluster. To automatically scale the number of Pods in the cluster based on resource metrics, you must implement Kubernetes horizontal Pod autoscaling. You can use the built-in Kubernetes CPU and memory metrics, or you can use custom metrics such as HTTP requests per second or the quantity of SELECT statements, taken from Cloud Monitoring.

In this section, you do the following:

  1. Configure horizontal Pod autoscaling for the Bank of Anthos microservices using both built-in metrics and custom metrics.
  2. Simulate load to the Bank of Anthos application to trigger autoscaling events.
  3. Observe how the number of Pods and the nodes in your cluster automatically scale up and down in response to your load.

Set up custom metrics collection

To read custom metrics from Monitoring, you must deploy the Custom Metrics - Stackdriver Adapter in your cluster.

  1. Deploy the adapter:

    kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/k8s-stackdriver/master/custom-metrics-stackdriver-adapter/deploy/production/adapter.yaml
  2. Configure the adapter to use Workload Identity Federation for GKE to get metrics:

    1. Configure the IAM service account:

      gcloud projects add-iam-policy-binding PROJECT_ID \
          --member "serviceAccount:bank-of-anthos@PROJECT_ID.iam.gserviceaccount.com" \
          --role roles/monitoring.viewer
      gcloud iam service-accounts add-iam-policy-binding bank-of-anthos@PROJECT_ID.iam.gserviceaccount.com \
          --role roles/iam.workloadIdentityUser \
          --member "serviceAccount:PROJECT_ID.svc.id.goog[custom-metrics/custom-metrics-stackdriver-adapter]"
    2. Annotate the Kubernetes service account that the adapter uses:

      kubectl annotate serviceaccount custom-metrics-stackdriver-adapter \
          --namespace=custom-metrics \
          iam.gke.io/gcp-service-account=bank-of-anthos@PROJECT_ID.iam.gserviceaccount.com
    3. Restart the adapter Deployment to propagate the changes:

      kubectl rollout restart deployment custom-metrics-stackdriver-adapter \
          --namespace=custom-metrics
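
      To confirm that the adapter can serve metrics, you can check that its Pod is running and that the external metrics API responds. This is a verification sketch, not part of the original steps:

      kubectl get pods --namespace=custom-metrics
      kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1" > /dev/null && echo "external metrics API available"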

Configure autoscaling for the database

When you deployed Bank of Anthos and PostgreSQL earlier in this tutorial, you deployed the database as a StatefulSet with one primary read/write replica to handle all incoming SQL statements. In this section, you configure horizontal Pod autoscaling to add new standby read-only replicas to handle incoming SELECT statements. A good way to reduce the load on each replica is to distribute SELECT statements, which are read operations. The PostgreSQL deployment includes a tool named Pgpool-II that achieves this load balancing and improves the system's throughput.

PostgreSQL exports the SELECT statement metric as a Prometheus metric. You'll use a lightweight metrics exporter named prometheus-to-sd to send these metrics to Cloud Monitoring in a supported format.

  1. Review the HorizontalPodAutoscaler object:

    # Copyright 2022 Google LLC
    #
    # Licensed under the Apache License, Version 2.0 (the "License");
    # you may not use this file except in compliance with the License.
    # You may obtain a copy of the License at
    #
    #      http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    ---
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: accounts-db-postgresql
    spec:
      behavior:
        scaleUp:
          stabilizationWindowSeconds: 0
          policies:
          - type: Percent
            value: 100
            periodSeconds: 5
          selectPolicy: Max
      scaleTargetRef:
        apiVersion: apps/v1
        kind: StatefulSet
        name: accounts-db-postgresql
      minReplicas: 1
      maxReplicas: 5
      metrics:
      - type: External
        external:
          metric:
            name: custom.googleapis.com|mypgpool|pgpool2_pool_backend_stats_select_cnt
          target:
            type: AverageValue
            averageValue: "15"

    This manifest does the following:

    • Sets the maximum number of replicas during a scale-up to 5.
    • Sets the minimum number of replicas during a scale-down to 1.
    • Uses an external metric to make scaling decisions. In this sample, the metric is the number of SELECT statements. A scale-up event occurs if the incoming SELECT statement count surpasses 15.
  2. Apply the manifest to the cluster:

    kubectl apply -f extras/postgres-hpa/hpa/postgresql-hpa.yaml
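
    To confirm that the autoscaler picks up the external metric, you can watch its status. The target might show <unknown> for a minute or two while the first samples arrive:

    kubectl get hpa accounts-db-postgresql --watch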

Configure autoscaling for the web interface

In Deploy Bank of Anthos and PostgreSQL, you deployed the Bank of Anthos web interface. When the number of users increases, the userservice Service consumes more CPU resources. In this section, you configure horizontal Pod autoscaling for the userservice Deployment when the existing Pods use more than 60% of their requested CPU, and for the frontend Deployment when the number of incoming HTTP requests to the load balancer is more than 5 per second.

Configure autoscaling for the userservice Deployment

  1. Review the HorizontalPodAutoscaler manifest for the userservice Deployment:

    # Copyright 2022 Google LLC
    #
    # Licensed under the Apache License, Version 2.0 (the "License");
    # you may not use this file except in compliance with the License.
    # You may obtain a copy of the License at
    #
    #      http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    ---
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: userservice
    spec:
      behavior:
        scaleUp:
          stabilizationWindowSeconds: 0
          policies:
          - type: Percent
            value: 100
            periodSeconds: 5
          selectPolicy: Max
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: userservice
      minReplicas: 5
      maxReplicas: 50
      metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            averageUtilization: 60

    This manifest does the following:

    • Sets the maximum number of replicas during a scale-up to 50.
    • Sets the minimum number of replicas during a scale-down to 5.
    • Uses a built-in Kubernetes metric to make scaling decisions. In this sample, the metric is CPU utilization, and the target utilization is 60%, which avoids both over- and under-utilization.
  2. Apply the manifest to the cluster:

    kubectl apply -f extras/postgres-hpa/hpa/userservice.yaml
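
    To check that the autoscaler reads CPU utilization for the Deployment, you can describe the resource. This is an optional verification, not part of the original steps:

    kubectl describe hpa userservice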

Configure autoscaling for the frontend deployment

  1. Review the HorizontalPodAutoscaler manifest for the frontend Deployment:

    # Copyright 2022 Google LLC
    #
    # Licensed under the Apache License, Version 2.0 (the "License");
    # you may not use this file except in compliance with the License.
    # You may obtain a copy of the License at
    #
    #      http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    ---
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: frontend
    spec:
      behavior:
        scaleUp:
          stabilizationWindowSeconds: 0
          policies:
          - type: Percent
            value: 100
            periodSeconds: 5
          selectPolicy: Max
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: frontend
      minReplicas: 5
      maxReplicas: 25
      metrics:
      - type: External
        external:
          metric:
            name: loadbalancing.googleapis.com|https|request_count
            selector:
              matchLabels:
                resource.labels.forwarding_rule_name: FORWARDING_RULE_NAME
          target:
            type: AverageValue
            averageValue: "5"

    This manifest uses the following fields:

    • spec.scaleTargetRef: The Kubernetes resource to scale.
    • spec.minReplicas: The minimum number of replicas, which is 5 in this sample.
    • spec.maxReplicas: The maximum number of replicas, which is 25 in this sample.
    • spec.metrics.*: The metric to use. In this sample, this is the number of HTTP requests per second, which is a custom metric from Cloud Monitoring provided by the adapter that you deployed.
    • spec.metrics.external.metric.selector.matchLabels: The specific resource label to filter when autoscaling.
  2. Find the name of the forwarding rule from the load balancer to the frontend Deployment:

    export FW_RULE=$(kubectl get ingress frontend -o=jsonpath='{.metadata.annotations.ingress\.kubernetes\.io/forwarding-rule}')
    echo $FW_RULE

    The output is similar to the following:

    k8s2-fr-j76hrtv4-default-frontend-wvvf7381
  3. Add your forwarding rule to the manifest:

    sed -i "s/FORWARDING_RULE_NAME/$FW_RULE/g" "extras/postgres-hpa/hpa/frontend.yaml"

    This command replaces FORWARDING_RULE_NAME with your saved forwarding rule (a quick verification sketch follows this list).

  4. Apply the manifest to the cluster:

    kubectl apply -f extras/postgres-hpa/hpa/frontend.yaml
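
To verify that the sed substitution in step 3 took effect, search the manifest for your saved forwarding rule name. This quick check assumes the FW_RULE variable is still set in your shell:

grep "$FW_RULE" extras/postgres-hpa/hpa/frontend.yaml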

Checkpoint: Validate autoscaling setup

Get the state of your HorizontalPodAutoscaler resources:

kubectl get hpa

The output is similar to the following:

NAME                     REFERENCE                            TARGETS             MINPODS   MAXPODS   REPLICAS   AGE
accounts-db-postgresql   StatefulSet/accounts-db-postgresql   10905m/15 (avg)     1         5         2          5m2s
contacts                 Deployment/contacts                  1%/70%              1         5         1          11m
frontend                 Deployment/frontend                  <unknown>/5 (avg)   5         25        1          34s
userservice              Deployment/userservice               0%/60%              5         50        5          4m56s

At this point, you've set up your application and configured autoscaling. Your frontend and database can now scale based on the metrics that you provided.

Simulate load and observe GKE scaling

Bank of Anthos includes a loadgenerator Service that lets you simulate traffic to test your application scaling under load. In this section, you'll deploy the loadgenerator Service, generate a load, and observe the resulting scaling.

Deploy the load testing generator

  1. Create an environment variable with the IP address of the Bank of Anthos load balancer:

    export LB_IP=$(kubectl get ingress frontend -o=jsonpath='{.status.loadBalancer.ingress[0].ip}')
    echo $LB_IP

    The output is similar to the following:

    203.0.113.9
  2. Add the IP address of the load balancer to the manifest:

    sed -i "s/FRONTEND_IP_ADDRESS/$LB_IP/g" "extras/postgres-hpa/loadgenerator.yaml"
  3. Apply the manifest to the cluster:

    kubectl apply -f extras/postgres-hpa/loadgenerator.yaml

The load generator begins adding one user every second, up to 250 users.
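
To confirm that the load generator is running and sending traffic, you can tail its logs. This sketch assumes the manifest creates a Deployment named loadgenerator:

kubectl logs deployment/loadgenerator --tail=20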

Simulate load

In this section, you use a load generator to simulate spikes in traffic and observe your replica count and node count scale up to accommodate the increased load over time. You then end the test and observe the replica and node count scale down in response.

  1. Expose the load generator web interface locally:

    kubectl port-forward svc/loadgenerator 8080

    If you see an error message, try again when the Pod is running.

  2. In a browser, open the load generator web interface.

    • If you're using a local shell, open a browser and go to http://127.0.0.1:8080.
    • If you're using Cloud Shell, click Web preview, and then click Preview on port 8080.
  3. Click theCharts tab to observe performance over time.

  4. Open a new terminal window and watch the replica count of your horizontal Pod autoscalers:

    kubectl get hpa -w

    The number of replicas increases as the load increases. The scale-up might take approximately ten minutes.

    NAME                     REFERENCE                            TARGETS          MINPODS   MAXPODS   REPLICAS
    accounts-db-postgresql   StatefulSet/accounts-db-postgresql   8326m/15 (avg)   1         5         5
    contacts                 Deployment/contacts                  51%/70%          1         5         2
    frontend                 Deployment/frontend                  5200m/5 (avg)    5         25        13
    userservice              Deployment/userservice               71%/60%          5         50        17
  5. Open another terminal window and check the number of nodes in the cluster:

    gcloud container clusters list \
        --filter='name=bank-of-anthos' \
        --format='table(name, currentMasterVersion, currentNodeVersion, currentNodeCount)' \
        --location="us-central1"
  6. The number of nodes increased from the starting quantity of three nodes to accommodate the new replicas (a kubectl-based alternative check appears after this list).

  7. Open the load generator interface and click Stop to end the test.

  8. Check the replica count and node count again and observe as the numbers reduce with the reduced load. The scale down might take some time, because the default stabilization window for replicas in the Kubernetes HorizontalPodAutoscaler resource is five minutes. For more information, refer to Stabilization window.
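
As an alternative to the gcloud command in step 5, you can watch the node count change directly with kubectl:

kubectl get nodes --watch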

Clean up

To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources.

Delete individual resources

Google Cloud creates resources, such as load balancers, based on the Kubernetes objects that you create. To delete all the resources in this tutorial, do the following:

  1. Delete the sample Kubernetes resources:

    kubectl delete \
        -f extras/postgres-hpa/loadgenerator.yaml \
        -f extras/postgres-hpa/hpa \
        -f extras/postgres-hpa/kubernetes-manifests \
        -f extras/jwt/jwt-secret.yaml \
        -f https://raw.githubusercontent.com/GoogleCloudPlatform/k8s-stackdriver/master/custom-metrics-stackdriver-adapter/deploy/production/adapter.yaml
  2. Delete the PostgreSQL database:

    helm uninstall accounts-db
    kubectl delete pvc -l "app.kubernetes.io/instance=accounts-db"
    kubectl delete configmaps initdb
  3. Delete the GKE cluster and the IAM service account:

    gcloud iam service-accounts delete "bank-of-anthos@PROJECT_ID.iam.gserviceaccount.com" --quiet
    gcloud container clusters delete "bank-of-anthos" --location="us-central1" --quiet
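
To confirm that the cluster deletion finished, you can list clusters with a name filter. An empty result means the cluster is gone:

gcloud container clusters list --filter='name=bank-of-anthos'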

Delete the project

    Caution: Deleting a project has the following effects:
    • Everything in the project is deleted. If you used an existing project for the tasks in this document, when you delete it, you also delete any other work you've done in the project.
    • Custom project IDs are lost. When you created this project, you might have created a custom project ID that you want to use in the future. To preserve the URLs that use the project ID, such as an appspot.com URL, delete selected resources inside the project instead of deleting the whole project.

    If you plan to explore multiple architectures, tutorials, or quickstarts, reusing projects can help you avoid exceeding project quota limits.

    Delete a Google Cloud project:

    gcloud projects delete PROJECT_ID
