Deploy an Elasticsearch vector database on GKE

This tutorial shows you how to deploy an Elasticsearch vector database cluster on Google Kubernetes Engine (GKE).

Vector databases are data stores specifically designed to manage and search through large collections of high-dimensional vectors. These vectors represent data like text, images, audio, video, or any data that can be numerically encoded. Unlike relational databases that rely on exact matches, vector databases specialize in finding similar items or identifying patterns within massive datasets.

Elasticsearch is a vector database that combines search and analytics functionalities. It comes with an open REST API for managing your cluster, and supports structured queries, full-text queries, and complex queries. Elasticsearch lets you perform phrase, similarity, and prefix searches, with autocomplete suggestions.
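
For example, after you index documents that contain a dense_vector field, you can run an approximate k-nearest-neighbor (kNN) search through the REST API. The following sketch is illustrative only: the books index, the embedding field, the query vector, and the ES_HOST and ES_PASSWORD variables are placeholder assumptions, not values from this tutorial.

# Hypothetical kNN search; replace the index, field, and vector with your own.
curl -k -u "elastic:${ES_PASSWORD}" "https://${ES_HOST}:9200/books/_search" \
  -H "Content-Type: application/json" \
  -d '{
    "knn": {
      "field": "embedding",
      "query_vector": [0.12, -0.45, 0.33],
      "k": 3,
      "num_candidates": 50
    },
    "fields": ["title"]
  }'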

This tutorial is intended for cloud platform administrators and architects, ML engineers, and MLOps (DevOps) professionals interested in deploying Elasticsearch database clusters on GKE.

Benefits

Elasticsearch offers the following benefits:

  • Wide range of libraries for various programming languages and an open API to integrate with other services.
  • Horizontal scaling, and support for sharding and replication that simplifies scaling and high availability.
  • Multi-node cluster balancing for optimal resource utilization.
  • Container and Kubernetes support for seamless integration into modern cloud-native environments.

Objectives

In this tutorial, you learn how to:

  • Plan and deploy GKE infrastructure for Elasticsearch.
  • Deploy and configure Elasticsearch in a GKE cluster.
  • Deploy the StatefulHA operator to ensure Elasticsearch high availability.
  • Run a notebook to generate and store example vector embeddings within your database, and perform vector-based search queries.
  • Collect and visualize metrics on a dashboard.

Deployment architecture

In this tutorial, you deploy a highly available regional GKE cluster for Elasticsearch, with multiple Kubernetes nodes spread across several availability zones. This setup helps ensure fault tolerance, scalability, and geographic redundancy. It allows for rolling updates and maintenance while providing SLAs for uptime and availability. For more information, see Regional clusters.

When a node becomes unreachable, a Pod on that node is not rescheduled immediately. With Pods using a StatefulSet, it can take more than eight minutes for application Pods to be deleted and rescheduled to new nodes.

To address this issue, the StatefulHA operator does the following:

  • Solves rescheduling lag, handles failover settings, and shortens recovery time by using the forceDeleteStrategy: AfterNodeUnreachable setting.
  • Ensures that the StatefulSet application uses a regional persistent disk (RePD).
  • Extends GKE with a custom HighAvailabilityApplication resource that's deployed in the same namespace as Elasticsearch. This enables the StatefulHA operator to monitor and respond to failover events.

The following diagram shows an Elasticsearch cluster running on multiple nodes and zones in a GKE cluster:

Elasticsearch deployment architecture

Costs

In this document, you use the following billable components of Google Cloud:

To generate a cost estimate based on your projected usage, use the pricing calculator.

New Google Cloud users might be eligible for a free trial.

When you finish the tasks that are described in this document, you can avoid continued billing by deleting the resources that you created. For more information, see Clean up.

Elasticsearch usage is free under the Server Side Public License (SSPL).

Before you begin

In this tutorial, you use Cloud Shell to run commands. Cloud Shell is a shell environment for managing resources hosted on Google Cloud. It comes preinstalled with the Google Cloud CLI, kubectl, Helm, and Terraform command-line tools. If you don't use Cloud Shell, you must install the Google Cloud CLI.

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. Install the Google Cloud CLI.

    Note: If you installed the gcloud CLI previously, make sure you have the latest version by running gcloud components update.
  3. If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

  4. To initialize the gcloud CLI, run the following command:

    gcloud init
  5. Create or select a Google Cloud project.

    Roles required to select or create a project

    • Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
    • Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
    Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.
    • Create a Google Cloud project:

      gcloud projects create PROJECT_ID

      Replace PROJECT_ID with a name for the Google Cloud project you are creating.

    • Select the Google Cloud project that you created:

      gcloud config set project PROJECT_ID

      Replace PROJECT_ID with your Google Cloud project name.

  6. Verify that billing is enabled for your Google Cloud project.

  7. Enable the Cloud Resource Manager, Compute Engine, GKE, IAM Service Account Credentials, and Backup for GKE APIs:

    Roles required to enable APIs

    To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

    gcloud services enable cloudresourcemanager.googleapis.com compute.googleapis.com container.googleapis.com iamcredentials.googleapis.com gkebackup.googleapis.com
  8. Grant roles to your user account. Run the following command once for each of the following IAM roles: roles/storage.objectViewer, roles/container.admin, roles/iam.serviceAccountAdmin, roles/compute.admin, roles/gkebackup.admin, roles/monitoring.viewer

    gcloud projects add-iam-policy-binding PROJECT_ID --member="user:USER_IDENTIFIER" --role=ROLE

    Replace the following:

    • PROJECT_ID: Your project ID.
    • USER_IDENTIFIER: The identifier for your user account. For example, myemail@example.com.
    • ROLE: The IAM role that you grant to your user account.

Set up your environment

To set up your environment with Cloud Shell, follow these steps:

  1. Set environment variables for your project, region, and a Kubernetes cluster resource prefix:

    export PROJECT_ID=PROJECT_ID
    export KUBERNETES_CLUSTER_PREFIX=elasticsearch
    export REGION=us-central1
    • Replace PROJECT_ID with your Google Cloud project ID.

    This tutorial uses the us-central1 region to create your deployment resources.

  2. Check the version of Helm:

    helm version

    Update the version if it's older than 3.13:

    curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
  3. Clone the sample code repository from GitHub:

    git clone https://github.com/GoogleCloudPlatform/kubernetes-engine-samples
  4. Navigate to the elasticsearch directory to start creating deployment resources:

    cd kubernetes-engine-samples/databases/elasticsearch

Create your cluster infrastructure

In this section, you run a Terraform script to create a private, highly available, regional GKE cluster to deploy your Elasticsearch database.

You can choose to deploy Elasticsearch using a Standard or Autopilot cluster. Each has its own advantages and different pricing models.

Autopilot

The following diagram shows an Autopilot GKE cluster deployed in the project.

GKE Autopilot cluster

To deploy the cluster infrastructure, run the following commands in the Cloud Shell:

export GOOGLE_OAUTH_ACCESS_TOKEN=$(gcloud auth print-access-token)
terraform -chdir=terraform/gke-autopilot init
terraform -chdir=terraform/gke-autopilot apply \
  -var project_id=${PROJECT_ID} \
  -var region=${REGION} \
  -var cluster_prefix=${KUBERNETES_CLUSTER_PREFIX}

GKE replaces the following variables at runtime:

  • GOOGLE_OAUTH_ACCESS_TOKEN uses the gcloud auth print-access-token command to retrieve an access token that authenticates interactions with various Google Cloud APIs.
  • PROJECT_ID, REGION, and KUBERNETES_CLUSTER_PREFIX are the environment variables defined in the Set up your environment section and assigned to the new relevant variables for the Autopilot cluster you are creating.

When prompted, type yes.

The output is similar to the following:

...
Apply complete! Resources: 9 added, 0 changed, 0 destroyed.

Outputs:

kubectl_connection_command = "gcloud container clusters get-credentials elasticsearch-cluster --region us-central1"

Terraform creates the following resources:

  • A custom VPC network and private subnet for the Kubernetes nodes.
  • A Cloud Router to access the internet through Network Address Translation (NAT).
  • A private GKE cluster in the us-central1 region.
  • A ServiceAccount with logging and monitoring permissions for the cluster.
  • Google Cloud Managed Service for Prometheus configuration for cluster monitoring and alerting.

Standard

The following diagram shows a Standard private regional GKE cluster deployed across three different zones.

GKE Standard cluster

To deploy the cluster infrastructure, run the following commands in the Cloud Shell:

export GOOGLE_OAUTH_ACCESS_TOKEN=$(gcloud auth print-access-token)
terraform -chdir=terraform/gke-standard init
terraform -chdir=terraform/gke-standard apply \
  -var project_id=${PROJECT_ID} \
  -var region=${REGION} \
  -var cluster_prefix=${KUBERNETES_CLUSTER_PREFIX}

GKE replaces the following variables at runtime:

  • GOOGLE_OAUTH_ACCESS_TOKEN uses the gcloud auth print-access-token command to retrieve an access token that authenticates interactions with various Google Cloud APIs.
  • PROJECT_ID, REGION, and KUBERNETES_CLUSTER_PREFIX are the environment variables defined in the Set up your environment section and assigned to the new relevant variables for the Standard cluster that you are creating.

When prompted, type yes. It might take several minutes for these commands to complete and for the cluster to show a ready status.

The output is similar to the following:

...
Apply complete! Resources: 10 added, 0 changed, 0 destroyed.

Outputs:

kubectl_connection_command = "gcloud container clusters get-credentials elasticsearch-cluster --region us-central1"

Terraform creates the following resources:

  • A custom VPC network and private subnet for the Kubernetes nodes.
  • A Cloud Router to access the internet through Network Address Translation (NAT).
  • A private GKE cluster in the us-central1 region with autoscaling enabled (one to two nodes per zone).
  • A ServiceAccount with logging and monitoring permissions for the cluster.
  • Google Cloud Managed Service for Prometheus configuration for cluster monitoring and alerting.

Connect to the cluster

Configure kubectl to fetch credentials and communicate with your new GKE cluster:

gcloud container clusters get-credentials ${KUBERNETES_CLUSTER_PREFIX}-cluster --location ${REGION}
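
To confirm that kubectl is configured correctly, you can list the cluster nodes; this is a quick check, and the node count varies with the cluster type and autoscaling.

# Verify that the credentials work and the cluster is reachable.
kubectl get nodes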

Deploy the Elasticsearch database and StatefulHA operator

In this section, you deploy the Elasticsearch database (in cluster mode) and the StatefulHA operator to your GKE cluster using the ECK Operator Helm Chart.

The Deployment creates a GKE cluster with the following configuration:

  • Three replicas of the Elasticsearch nodes.
  • DaemonSet to change virtual memory settings, for optimal Elasticsearch performance. A DaemonSet is a Kubernetes controller that ensures that a copy of a Pod runs on each node in a cluster.
  • Configuration of NodeAffinity and PodAntiAffinity to ensure proper distribution across Kubernetes nodes, optimizing the use of node pools and maximizing availability across different zones.
  • A Stateful HA operator that manages failover processes and ensures high availability. A StatefulSet is a Kubernetes controller that maintains a persistent unique identity for each of its Pods.
  • For authentication, the database creates Kubernetes Secrets with authentication credentials, passwords, and certificates.

To use the Helm chart to deploy the Elasticsearch database, follow these steps:

  1. Enable the StatefulHA add-on:

    Autopilot

    GKE automatically enables the StatefulHA add-on at cluster creation.

    Standard

    Run the following command:

    gcloud container clusters update ${KUBERNETES_CLUSTER_PREFIX}-cluster \
        --project=${PROJECT_ID} \
        --location=${REGION} \
        --update-addons=StatefulHA=ENABLED

    It might take 15 minutes for this command to complete and for the cluster to show a ready status.

  2. Create an Elastic Cloud on Kubernetes (ECK) Custom Resource Definition (CRD):

    kubectl apply -f https://download.elastic.co/downloads/eck/2.11.1/crds.yaml
  3. Deploy the ECK operator:

    kubectl apply -f https://download.elastic.co/downloads/eck/2.11.1/operator.yaml
  4. Create the namespace elastic for the database:

    kubectl create ns elastic
  5. Install the HighAvailabilityApplication (HAA) resource, which defines failover rules for Elasticsearch:

    kubectl apply -n elastic -f manifests/01-regional-pd/ha-app.yaml

    The ha-app.yaml manifest describes the HighAvailabilityApplication resource:

    kind: HighAvailabilityApplication
    apiVersion: ha.gke.io/v1
    metadata:
      name: elasticsearch-ha-es-main
      namespace: elastic
    spec:
      resourceSelection:
        resourceKind: StatefulSet
      policy:
        storageSettings:
          requireRegionalStorage: false
        failoverSettings:
          forceDeleteStrategy: AfterNodeUnreachable
          afterNodeUnreachable:
            afterNodeUnreachableSeconds: 20 # 60 seconds total
  6. Apply the manifest to create a regional persistent SSD disk StorageClass:

    kubectl apply -n elastic -f manifests/01-regional-pd/regional-pd.yaml

    The regional-pd.yaml manifest describes the persistent SSD disk StorageClass:

    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    allowVolumeExpansion: true
    metadata:
      name: ha-regional
    parameters:
      replication-type: regional-pd
      type: pd-ssd
      availability-class: regional-hard-failover
    provisioner: pd.csi.storage.gke.io
    reclaimPolicy: Retain
    volumeBindingMode: WaitForFirstConsumer
  7. Deploy the DaemonSet resource to set virtual memory in each node:

    kubectl apply -n elastic -f manifests/02-elasticsearch/mmap-count.yaml

    The mmap-count.yaml manifest describes the DaemonSet:

    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: max-map-count-setter
      labels:
        k8s-app: max-map-count-setter
    spec:
      selector:
        matchLabels:
          name: max-map-count-setter
      template:
        metadata:
          labels:
            name: max-map-count-setter
        spec:
          initContainers:
          - name: max-map-count-setter
            image: docker.io/bash:5.2.21
            resources:
              limits:
                cpu: 100m
                memory: 32Mi
            securityContext:
              privileged: true
              runAsUser: 0
            command: ['/usr/local/bin/bash', '-e', '-c', 'echo 262144 > /proc/sys/vm/max_map_count']
          containers:
          - name: sleep
            image: docker.io/bash:5.2.21
            command: ['sleep', 'infinity']
  8. Apply the manifest to deploy the Elasticsearch cluster:

    kubectl apply -n elastic -f manifests/02-elasticsearch/elasticsearch.yaml

    The elasticsearch.yaml manifest describes the Elasticsearch deployment:

    apiVersion: elasticsearch.k8s.elastic.co/v1
    kind: Elasticsearch
    metadata:
      name: elasticsearch-ha
    spec:
      version: 8.11.4
      nodeSets:
      - name: main
        count: 3
        volumeClaimTemplates:
        - metadata:
            name: elasticsearch-data
          spec:
            accessModes:
            - ReadWriteOnce
            resources:
              requests:
                storage: 10Gi
            storageClassName: ha-regional
        config:
        podTemplate:
          metadata:
            labels:
              app.stateful/component: elasticsearch
          spec:
            initContainers:
            - name: max-map-count-check
              command: ['sh', '-c', "while true; do mmc=$(cat /proc/sys/vm/max_map_count); if [ ${mmc} -eq 262144 ]; then exit 0; fi; sleep 1; done"]
            containers:
            - name: metrics
              image: quay.io/prometheuscommunity/elasticsearch-exporter:v1.7.0
              command:
              - /bin/elasticsearch_exporter
              - --es.ssl-skip-verify
              - --es.uri=https://$(ES_USER):$(ES_PASSWORD)@localhost:9200
              securityContext:
                runAsNonRoot: true
                runAsGroup: 10000
                runAsUser: 10000
              resources:
                requests:
                  memory: "128Mi"
                  cpu: "25m"
                limits:
                  memory: "128Mi"
                  cpu: "100m"
              ports:
              - containerPort: 9114
              env:
              - name: ES_USER
                value: "elastic"
              - name: ES_PASSWORD
                valueFrom:
                  secretKeyRef:
                    name: elasticsearch-ha-es-elastic-user
                    key: elastic
            - name: elasticsearch
              resources:
                limits:
                  memory: 4Gi
                  cpu: 1
            affinity:
              nodeAffinity:
                preferredDuringSchedulingIgnoredDuringExecution:
                - weight: 1
                  preference:
                    matchExpressions:
                    - key: app.stateful/component
                      operator: In
                      values:
                      - elasticsearch
              podAntiAffinity:
                preferredDuringSchedulingIgnoredDuringExecution:
                - weight: 1
                  podAffinityTerm:
                    labelSelector:
                      matchLabels:
                        app.stateful/component: elasticsearch
                    topologyKey: topology.kubernetes.io/zone

    Wait for a few minutes for the Elasticsearch cluster to fully start.

  9. Check the deployment status:

    kubectl get elasticsearch -n elastic --watch

    If the Elasticsearch database is successfully deployed, the output is similar to the following:

    NAME               HEALTH   NODES   VERSION   PHASE   AGE
    elasticsearch-ha   green    3       8.11.4    Ready   2m30s

    Wait for HEALTH to show as green. Press Ctrl+C to exit the command if needed.

  10. Deploy an internal load balancer to access your Elasticsearch database that's running in the same VPC as your GKE cluster:

    kubectl apply -n elastic -f manifests/02-elasticsearch/ilb.yaml

    The ilb.yaml manifest describes the LoadBalancer Service:

    apiVersion: v1
    kind: Service
    metadata:
      annotations:
        #cloud.google.com/neg: '{"ingress": true}'
        networking.gke.io/load-balancer-type: "Internal"
      labels:
        app.kubernetes.io/name: elasticsearch
      name: elastic-ilb
    spec:
      ports:
      - name: https
        port: 9200
        protocol: TCP
        targetPort: 9200
      selector:
        common.k8s.elastic.co/type: elasticsearch
        elasticsearch.k8s.elastic.co/cluster-name: elasticsearch-ha
      type: LoadBalancer
  11. To check whether the failover rules are applied, describe the resource and confirm that the status message is Application is protected:

    kubectl describe highavailabilityapplication elasticsearch-ha-es-main -n elastic

    The output is similar to the following:

    Status:
      Conditions:
        Last Transition Time:  2024-02-01T13:27:50Z
        Message:               Application is protected
        Observed Generation:   1
        Reason:                ApplicationProtected
        Status:                True
        Type:                  Protected
    Events:                    <none>
  12. After GKE starts the workloads, verify that GKE has created the Elasticsearch workloads:

    kubectl get pod,svc,statefulset,pdb,secret,daemonset -n elastic

    The output is similar to the following:

    NAME                             READY   STATUS    RESTARTS   AGE
    pod/elasticsearch-ha-es-main-0   2/2     Running   0          7m16s
    pod/elasticsearch-ha-es-main-1   2/2     Running   0          7m16s
    pod/elasticsearch-ha-es-main-2   2/2     Running   0          7m16s
    pod/max-map-count-setter-28wt9   1/1     Running   0          7m27s
    pod/max-map-count-setter-cflsw   1/1     Running   0          7m27s
    pod/max-map-count-setter-gzq9k   1/1     Running   0          7m27s

    NAME                                        TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
    service/elasticsearch-ha-es-http            ClusterIP   10.52.8.28   <none>        9200/TCP   7m18s
    service/elasticsearch-ha-es-internal-http   ClusterIP   10.52.3.48   <none>        9200/TCP   7m18s
    service/elasticsearch-ha-es-main            ClusterIP   None         <none>        9200/TCP   7m16s
    service/elasticsearch-ha-es-transport       ClusterIP   None         <none>        9300/TCP   7m18s

    NAME                                        READY   AGE
    statefulset.apps/elasticsearch-ha-es-main   3/3     7m16s

    NAME                                                     MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
    poddisruptionbudget.policy/elasticsearch-ha-es-default   2               N/A               1                     7m16s

    NAME                                                 TYPE     DATA   AGE
    secret/elasticsearch-ha-es-elastic-user              Opaque   1      7m18s
    secret/elasticsearch-ha-es-file-settings             Opaque   1      7m16s
    secret/elasticsearch-ha-es-http-ca-internal          Opaque   2      7m17s
    secret/elasticsearch-ha-es-http-certs-internal       Opaque   3      7m17s
    secret/elasticsearch-ha-es-http-certs-public         Opaque   2      7m17s
    secret/elasticsearch-ha-es-internal-users            Opaque   4      7m18s
    secret/elasticsearch-ha-es-main-es-config            Opaque   1      7m16s
    secret/elasticsearch-ha-es-main-es-transport-certs   Opaque   7      7m16s
    secret/elasticsearch-ha-es-remote-ca                 Opaque   1      7m16s
    secret/elasticsearch-ha-es-transport-ca-internal     Opaque   2      7m16s
    secret/elasticsearch-ha-es-transport-certs-public    Opaque   1      7m16s
    secret/elasticsearch-ha-es-xpack-file-realm          Opaque   4      7m18s

    NAME                                  DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
    daemonset.apps/max-map-count-setter   6         6         6       6            6           <none>          13m

The following GKE resources are created for the Elasticsearch cluster:

  • The Elasticsearch StatefulSet that controls three Pod replicas.
  • A DaemonSet to configure virtual memory settings.
  • Services to connect to Elasticsearch.
  • Secrets with superuser credentials and service-related certificates.
  • The StatefulHA operator Pod and HighAvailabilityApplication resource, actively monitoring the Elasticsearch application.
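
To verify connectivity end to end, you can query the health endpoint through the internal load balancer. The following sketch reuses the Secret and Service names from the manifests above; because the load balancer is internal, run it from a host inside the same VPC.

# Read the superuser password from the Secret that ECK generated.
ES_PASSWORD=$(kubectl get secret elasticsearch-ha-es-elastic-user -n elastic \
  -o jsonpath='{.data.elastic}' | base64 -d)

# Read the internal IP address assigned to the elastic-ilb Service.
ES_HOST=$(kubectl get svc elastic-ilb -n elastic \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

# Query cluster health; -k skips verification of the self-signed certificate.
curl -k -u "elastic:${ES_PASSWORD}" "https://${ES_HOST}:9200/_cluster/health?pretty"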

Run queries with Vertex AI Colab Enterprise notebook

This section explains how to generate embeddings into Elasticsearch documents and perform semantic search queries using the official Elasticsearch Python client in a Colab Enterprise notebook. A document in Elasticsearch is composed of various fields, each paired with its corresponding value.

For more information about Vertex AI Colab Enterprise, see the Colab Enterprise documentation.

Best practice:

To effectively utilize Elasticsearch, we recommend that you structure your data into these documents, which are then indexed for search purposes.

In this example, you use a dataset from a CSV file that contains a list of books in different genres. Elasticsearch serves as a search engine, and the Pod you create serves as a client querying the Elasticsearch database.
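
As a minimal sketch of this document model (the notebook performs the equivalent steps with the Python client), the following request bulk-indexes two illustrative book documents, reusing the ES_HOST and ES_PASSWORD variables from the earlier health check. The books index and its fields are hypothetical, not taken from the notebook.

# Bulk-index two illustrative documents; the _bulk API expects
# newline-delimited JSON that ends with a trailing newline.
curl -k -u "elastic:${ES_PASSWORD}" -X POST "https://${ES_HOST}:9200/_bulk" \
  -H "Content-Type: application/x-ndjson" \
  --data-binary $'{"index":{"_index":"books"}}\n{"title":"Dune","genre":"science fiction"}\n{"index":{"_index":"books"}}\n{"title":"The Hobbit","genre":"fantasy"}\n'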

You can use a dedicated runtime template to deploy to the elasticsearch-vpc VPC (Virtual Private Cloud), so the notebook can communicate with resources in your GKE cluster.

Create a runtime template

To create a Colab Enterprise runtime template:

  1. In the Google Cloud console, go to the Colab Enterprise Runtime Templates page and make sure your project is selected:

    Go to Runtime Templates

  2. Click New Template. The Create new runtime template page appears.

  3. In the Runtime basics section:

    • In the Display name field, enter elastic-connect.
    • In the Region drop-down list, select us-central1. It's the same region as your GKE cluster.
  4. In the Configure compute section:

    • In the Machine type drop-down list, select e2-standard-2.
    • In the Disk size field, enter 30.
  5. In the Networking and security section:

    • In the Network drop-down list, select the network where your GKE cluster resides.
    • In the Subnetwork drop-down list, select a corresponding subnetwork.
    • Clear the Enable public internet access checkbox.
  6. To finish creating the runtime template, click Create. Your runtime template appears in the list on the Runtime templates tab.

Create a runtime

To create a Colab Enterprise runtime:

  1. In the runtime templates list, for the template you just created, click the menu in the Actions column, and then click Create runtime. The Create Vertex AI Runtime pane appears.

  2. To create a runtime based on your template, click Create.

  3. On the Runtimes tab that opens, wait for the status to transition to Healthy.

Import the notebook

To import the notebook in Colab Enterprise:

  1. Go to the My Notebooks tab and click Import. The Import notebooks pane appears.

  2. In Import source, select URL.

  3. Under Notebook URLs, enter the following link:

    https://raw.githubusercontent.com/GoogleCloudPlatform/kubernetes-engine-samples/main/databases/elasticsearch/manifests/03-notebook/vector-database.ipynb
  4. Click Import.

Connect to the runtime and run queries

To connect to the runtime and run queries:

  1. In the notebook, next to the Connect button, click Additional connection options. The Connect to Vertex AI Runtime pane appears.

  2. Select Connect to a runtime, and then select Connect to an existing Runtime.

  3. Select the runtime that you launched, and click Connect.

  4. To run the notebook cells, click the Run cell button next to each code cell.

The notebook contains both code cells and text that describes each code block. Running a code cell executes its commands and displays an output. You can run the cells in order, or run individual cells as needed.

View Prometheus metrics for your cluster

The GKE cluster is configured with Google Cloud Managed Service for Prometheus, which enables collection of metrics in the Prometheus format. This service provides a fully managed solution for monitoring and alerting, allowing for collection, storage, and analysis of metrics from the cluster and its applications.

The following diagram shows how Prometheus collects metrics for your cluster:

Prometheus metrics collection

The GKE private cluster in the diagram contains the following components:

  • Elasticsearch Pods that expose metrics on the path / and port 9114. These metrics are provided by the sidecar container named metrics, which contains the elasticsearch_exporter.
  • Prometheus-based collectors that process the metrics from the Elasticsearch Pod.
  • A PodMonitoring resource that sends the metrics to Cloud Monitoring.

The cluster configuration defines a sidecar container with a metrics exporter in the Prometheus format:

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch-ha
spec:
  ...
  nodeSets:
  - name: main
    ...
    podTemplate:
      spec:
        containers:
        ...
        - name: metrics
          image: quay.io/prometheuscommunity/elasticsearch-exporter:v1.7.0
          command:
          - /bin/elasticsearch_exporter
          - --es.ssl-skip-verify
          - --es.uri=https://$(ES_USER):$(ES_PASSWORD)@localhost:9200
          ...
          env:
          - name: ES_USER
            value: "elastic"
          - name: ES_PASSWORD
            valueFrom:
              secretKeyRef:
                name: elasticsearch-ha-es-elastic-user
                key: elastic
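
To spot-check the exporter before configuring collection, you can port-forward one Elasticsearch Pod and read the metrics endpoint directly. This is a sketch; the elasticsearch_cluster_health_status metric name comes from the elasticsearch_exporter project and is assumed here.

# Forward the exporter port from one Elasticsearch Pod to localhost.
kubectl port-forward pod/elasticsearch-ha-es-main-0 -n elastic 9114:9114 &
sleep 2

# Read one health metric from the exporter, then stop the port-forward.
curl -s http://localhost:9114/metrics | grep elasticsearch_cluster_health_status
kill %1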

To export and view the metrics, follow these steps:

  1. Create the PodMonitoring resource to scrape metrics by labelSelector:

    kubectl apply -n elastic -f manifests/04-prometheus-metrics/pod-monitoring.yaml

    The pod-monitoring.yaml manifest describes the PodMonitoring resource:

    apiVersion: monitoring.googleapis.com/v1
    kind: PodMonitoring
    metadata:
      name: elasticsearch
    spec:
      selector:
        matchLabels:
          app.stateful/component: elasticsearch
          elasticsearch.k8s.elastic.co/cluster-name: elasticsearch-ha
      endpoints:
      - port: 9114
        interval: 30s
        path: /metrics

    After a few minutes, the built-in dashboard "Elasticsearch Prometheus Overview" displays.

  2. To view more data-related graphs, import a custom Cloud Monitoring dashboard with the configurations defined in dashboard.json:

    gcloud --project "${PROJECT_ID}" monitoring dashboards create --config-from-file monitoring/dashboard.json
  3. After the command runs successfully, go to the Cloud Monitoring Dashboards page:

    Go to Dashboards overview

  4. From the list of dashboards, open the ElasticSearch Overview dashboard. It might take 1-2 minutes to collect and display metrics.

    The dashboard shows a count of key metrics:

    • Indexes
    • Documents and Shards
    • Pending operations
    • Running nodes with their health statuses

Back up your cluster configuration

The Backup for GKE feature lets you schedule regular backups of your entire GKE cluster configuration, including the deployed workloads and their data.

In this tutorial, you configure a backup plan for your GKE cluster to perform backups of all workloads, including Secrets and Volumes, every day at 3 AM. To ensure efficient storage management, backups older than three days are automatically deleted.

  1. Enable the Backup for GKE feature for your cluster:

    gcloud container clusters update ${KUBERNETES_CLUSTER_PREFIX}-cluster \
        --project=${PROJECT_ID} \
        --location=${REGION} \
        --update-addons=BackupRestore=ENABLED
  2. Create a backup plan with a daily schedule for all namespaces within the cluster:

    gcloud beta container backup-restore backup-plans create ${KUBERNETES_CLUSTER_PREFIX}-cluster-backup \
        --project=${PROJECT_ID} \
        --location=${REGION} \
        --cluster="projects/${PROJECT_ID}/locations/${REGION}/clusters/${KUBERNETES_CLUSTER_PREFIX}-cluster" \
        --all-namespaces \
        --include-secrets \
        --include-volume-data \
        --cron-schedule="0 3 * * *" \
        --backup-retain-days=3

    The command uses the relevant environment variables at runtime.

    The cluster name's format is relative to your project and region as follows:

    projects/PROJECT_ID/locations/REGION/clusters/CLUSTER_NAME

    When prompted, type y. The output is similar to the following:

    Create request issued for: [elasticsearch-cluster-backup]
    Waiting for operation [projects/PROJECT_ID/locations/us-central1/operations/operation-1706528750815-610142ffdc9ac-71be4a05-f61c99fc] to complete...

    This operation might take a few minutes to complete successfully. After the execution is complete, the output is similar to the following:

    Created backup plan [elasticsearch-cluster-backup].
  3. You can see your newly created backup plan elasticsearch-cluster-backup listed on the Backup for GKE console.

    Go to Backup for GKE
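
To confirm that the plan produces backups, you can also list them from the command line. This sketch assumes the backups list command in the gcloud beta container backup-restore group:

# List backups created by the daily backup plan.
gcloud beta container backup-restore backups list \
    --project=${PROJECT_ID} \
    --location=${REGION} \
    --backup-plan=${KUBERNETES_CLUSTER_PREFIX}-cluster-backup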

If you want to restore the saved backup configurations, see Restore a backup.

Clean up

To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources.

Delete the project

The easiest way to avoid billing is to delete the project you created for this tutorial.

Caution: Deleting a project has the following effects:
  • Everything in the project is deleted. If you used an existing project for the tasks in this document, when you delete it, you also delete any other work you've done in the project.
  • Custom project IDs are lost. When you created this project, you might have created a custom project ID that you want to use in the future. To preserve the URLs that use the project ID, such as an appspot.com URL, delete selected resources inside the project instead of deleting the whole project.

If you plan to explore multiple architectures, tutorials, or quickstarts, reusing projects can help you avoid exceeding project quota limits.

Delete a Google Cloud project:

gcloud projects delete PROJECT_ID

If you deleted the project, your clean up is complete. If you didn't delete the project, proceed to delete the individual resources.

Delete individual resources

  1. Set environment variables.

    export PROJECT_ID=${PROJECT_ID}
    export KUBERNETES_CLUSTER_PREFIX=elasticsearch
    export REGION=us-central1
  2. Run the terraform destroy command:

    export GOOGLE_OAUTH_ACCESS_TOKEN=$(gcloud auth print-access-token)
    terraform -chdir=terraform/FOLDER destroy \
      -var project_id=${PROJECT_ID} \
      -var region=${REGION} \
      -var cluster_prefix=${KUBERNETES_CLUSTER_PREFIX}

    Replace FOLDER with either gke-autopilot or gke-standard, depending on the type of GKE cluster you created.

    When prompted, type yes.

  3. Find all unattached disks:

    export disk_list=$(gcloud compute disks list --filter="-users:* AND labels.name=${KUBERNETES_CLUSTER_PREFIX}-cluster" --format "value[separator=|](name,region)")
  4. Delete the disks:

    for i in $disk_list; do
      disk_name=$(echo $i | cut -d'|' -f1)
      disk_region=$(echo $i | cut -d'|' -f2 | sed 's|.*/||')
      echo "Deleting $disk_name"
      gcloud compute disks delete $disk_name --region $disk_region --quiet
    done
  5. Delete the cloned repository:

    rm -r ~/kubernetes-engine-samples/
