Deploy a Qdrant vector database on GKE

This guide shows you how to deploy a Qdrant vector database cluster on Google Kubernetes Engine (GKE).

Vector databases are data stores specifically designed to manage and search through large collections of high-dimensional vectors. These vectors represent data like text, images, audio, video, or any data that can be numerically encoded. Unlike traditional databases that rely on exact matches, vector databases specialize in finding similar items or identifying patterns within massive datasets. These characteristics make Qdrant a suitable choice for a variety of applications, including neural network or semantic-based matching, faceted search, and more. Qdrant not only functions as a vector database but also as a vector similarity search engine.
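To make the similarity idea concrete, the following sketch (plain Python with toy 3-dimensional "embeddings"; real embeddings have hundreds of dimensions, and the vectors here are invented for illustration) ranks items by cosine similarity instead of exact matching:

```python
import math

def cosine(a, b):
    # Cosine similarity: close to 1.0 means similar direction (similar meaning).
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy embeddings for three items.
vectors = {
    "sci-fi novel": [0.9, 0.1, 0.0],
    "space opera":  [0.8, 0.2, 0.1],
    "cookbook":     [0.0, 0.1, 0.9],
}
query = [0.85, 0.15, 0.05]  # pretend embedding of "futuristic space adventure"

# Nearest neighbor by similarity, not by exact match.
best = max(vectors, key=lambda name: cosine(query, vectors[name]))
print(best)
```

Even though the query matches none of the stored vectors exactly, the nearest neighbor is still found, which is the behavior a vector database scales up to millions of points.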

This tutorial is intended for cloud platform administrators and architects, ML engineers, and MLOps (DevOps) professionals interested in deploying Qdrant database clusters on GKE.

Benefits

Qdrant offers the following benefits:

  • Wide range of libraries for various programming languages and an open API to integrate with other services.
  • Horizontal scaling, and support for sharding and replication that simplifies scaling and high availability.
  • Container and Kubernetes support that enables deployment and management in modern cloud-native environments.
  • Flexible payloads with advanced filtering to tailor search criteria precisely.
  • Different quantization options and other optimizations to reduce infrastructure costs and improve performance.
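As a rough illustration of the quantization benefit above, the following sketch (a generic scalar-quantization example, not Qdrant's actual implementation) compresses float components into the int8 range, cutting vector memory roughly 4x at a small cost in precision:

```python
# Generic scalar quantization sketch: map floats in [lo, hi] to int8.
def quantize(vec, lo=-1.0, hi=1.0):
    scale = 255 / (hi - lo)
    return [round((x - lo) * scale) - 128 for x in vec]  # values in [-128, 127]

def dequantize(qvec, lo=-1.0, hi=1.0):
    scale = (hi - lo) / 255
    return [(q + 128) * scale + lo for q in qvec]

original = [0.12, -0.73, 0.55]
restored = dequantize(quantize(original))
# The round trip loses a little precision, which is the trade-off
# quantization makes for smaller memory footprints and faster scans.
print([round(x, 2) for x in restored])
```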

Objectives

In this tutorial, you learn how to:

  • Plan and deploy GKE infrastructure for Qdrant.
  • Deploy the StatefulHA operator to ensure Qdrant high availability.
  • Deploy and configure the Qdrant cluster.
  • Upload a demo dataset and run a simple search query.
  • Collect metrics and run a dashboard.

Deployment architecture

This architecture sets up a fault-tolerant, scalable GKE cluster for Qdrant across multiple availability zones, ensuring uptime and availability with rolling updates and minimal disruption. It includes using the StatefulHA operator for efficient failover management. For more information, see Regional clusters.

Architecture diagram

The following diagram shows a Qdrant cluster running on multiple nodes and zones in a GKE cluster:

Qdrant deployment architecture

In this architecture, the Qdrant StatefulSet is deployed across three nodes in three different zones.

  • You can control how GKE distributes Pods across nodes by configuring the required Pod affinity rules and topology spread constraints in the Helm chart values file.
  • If one zone fails, GKE reschedules Pods on new nodes based on the recommended configuration.

For data persistence, the architecture in this tutorial has the following characteristics:

  • It uses regional SSD disks (a custom regional-pd StorageClass) for persisting data. We recommend regional SSD disks for databases due to their low latency and high IOPS.
  • All disk data is replicated between primary and secondary zones in the region, increasing tolerance to potential zone failures.

Costs

In this document, you use the following billable components of Google Cloud:

To generate a cost estimate based on your projected usage, use the pricing calculator.

New Google Cloud users might be eligible for a free trial.

When you finish the tasks that are described in this document, you can avoid continued billing by deleting the resources that you created. For more information, see Clean up.

Before you begin

In this tutorial, you use Cloud Shell to run commands. Cloud Shell is a shell environment for managing resources hosted on Google Cloud. It comes preinstalled with the Google Cloud CLI, kubectl, Helm, and Terraform command-line tools. If you don't use Cloud Shell, you must install the Google Cloud CLI.

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. Install the Google Cloud CLI.

    Note: If you installed the gcloud CLI previously, make sure you have the latest version by running gcloud components update.
  3. If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

  4. To initialize the gcloud CLI, run the following command:

    gcloud init
  5. Create or select a Google Cloud project.

    Roles required to select or create a project

    • Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
    • Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
    Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.
    • Create a Google Cloud project:

      gcloud projects create PROJECT_ID

      Replace PROJECT_ID with a name for the Google Cloud project you are creating.

    • Select the Google Cloud project that you created:

      gcloud config set project PROJECT_ID

      Replace PROJECT_ID with your Google Cloud project name.

  6. Verify that billing is enabled for your Google Cloud project.

  7. Enable the Resource Manager, Compute Engine, GKE, IAM Service Account Credentials, and Backup for GKE APIs:

    Roles required to enable APIs

    To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

    gcloud services enable cloudresourcemanager.googleapis.com compute.googleapis.com container.googleapis.com iamcredentials.googleapis.com gkebackup.googleapis.com
  8. Grant roles to your user account. Run the following command once for each of the following IAM roles: roles/storage.objectViewer, roles/container.admin, roles/iam.serviceAccountAdmin, roles/compute.admin, roles/gkebackup.admin, roles/monitoring.viewer

    gcloud projects add-iam-policy-binding PROJECT_ID --member="user:USER_IDENTIFIER" --role=ROLE

    Replace the following:

    • PROJECT_ID: Your project ID.
    • USER_IDENTIFIER: The identifier for your user account. For example, myemail@example.com.
    • ROLE: The IAM role that you grant to your user account.
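Because the command must run once per role, a loop saves repetition. The sketch below (not from the tutorial; the default variable values are placeholders) prints one gcloud command per role so you can review them before running; remove the leading echo to execute them for real:

```shell
# Placeholders: replace with your real project ID and user account.
PROJECT_ID="${PROJECT_ID:-my-project}"
USER_IDENTIFIER="${USER_IDENTIFIER:-myemail@example.com}"

# Print (dry-run) one binding command per role from the tutorial's list.
for ROLE in roles/storage.objectViewer roles/container.admin \
    roles/iam.serviceAccountAdmin roles/compute.admin \
    roles/gkebackup.admin roles/monitoring.viewer; do
  echo gcloud projects add-iam-policy-binding "$PROJECT_ID" \
    --member="user:$USER_IDENTIFIER" --role="$ROLE"
done
```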

Set up your environment

To set up your environment with Cloud Shell, follow these steps:

  1. Set environment variables for your project, region, and a Kubernetes cluster resource prefix:

    For the purpose of this tutorial, use the us-central1 region to create your deployment resources.

    export PROJECT_ID=PROJECT_ID
    export KUBERNETES_CLUSTER_PREFIX=qdrant
    export REGION=us-central1
    • Replace PROJECT_ID with your Google Cloud project ID.
  2. Check the version of Helm:

    helm version

    Update the version if it's older than 3.13:

    curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
  3. Clone the sample code repository from GitHub:

    git clone https://github.com/GoogleCloudPlatform/kubernetes-engine-samples
  4. Navigate to the qdrant directory to start creating deployment resources:

    cd kubernetes-engine-samples/databases/qdrant

Create your cluster infrastructure

This section involves running a Terraform script to create a private, highly available, regional GKE cluster to deploy your Qdrant database.

You can choose to deploy Qdrant using a Standard or Autopilot cluster. Each has its own advantages and different pricing models.

Autopilot

The following diagram shows an Autopilot regional GKE cluster deployed across three different zones.

GKE Autopilot cluster

To deploy the cluster infrastructure, run the following commands in the Cloud Shell:

export GOOGLE_OAUTH_ACCESS_TOKEN=$(gcloud auth print-access-token)
terraform -chdir=terraform/gke-autopilot init
terraform -chdir=terraform/gke-autopilot apply \
-var project_id=${PROJECT_ID} \
-var region=${REGION} \
-var cluster_prefix=${KUBERNETES_CLUSTER_PREFIX}

The following variables are replaced at runtime:

  • GOOGLE_OAUTH_ACCESS_TOKEN: Replaced by an access token retrieved by the gcloud auth print-access-token command to authenticate interactions with various Google Cloud APIs.
  • PROJECT_ID, REGION, and KUBERNETES_CLUSTER_PREFIX are the environment variables defined in the Set up your environment section and assigned to the new relevant variables for the Autopilot cluster you are creating.

When prompted, type yes.

The output is similar to the following:

...
Apply complete! Resources: 9 added, 0 changed, 0 destroyed.

Outputs:

kubectl_connection_command = "gcloud container clusters get-credentials qdrant-cluster --region us-central1"

Terraform creates the following resources:

  • A custom VPC network and private subnet for the Kubernetes nodes.
  • A Cloud Router to access the internet through Network Address Translation (NAT).
  • A private GKE cluster in the us-central1 region.
  • A ServiceAccount with logging and monitoring permissions for the cluster.
  • Google Cloud Managed Service for Prometheus configuration for cluster monitoring and alerting.

Standard

The following diagram shows a Standard private regional GKE cluster deployedacross three different zones.

GKE Standard cluster

To deploy the cluster infrastructure, run the following commands in the Cloud Shell:

export GOOGLE_OAUTH_ACCESS_TOKEN=$(gcloud auth print-access-token)
terraform -chdir=terraform/gke-standard init
terraform -chdir=terraform/gke-standard apply \
-var project_id=${PROJECT_ID} \
-var region=${REGION} \
-var cluster_prefix=${KUBERNETES_CLUSTER_PREFIX}

The following variables are replaced at runtime:

  • GOOGLE_OAUTH_ACCESS_TOKEN is replaced by an access token retrieved by the gcloud auth print-access-token command to authenticate interactions with various Google Cloud APIs.
  • PROJECT_ID, REGION, and KUBERNETES_CLUSTER_PREFIX are the environment variables defined in the Set up your environment section and assigned to the new relevant variables for the Standard cluster that you are creating.

When prompted, type yes. It might take several minutes for these commands to complete and for the cluster to show a ready status.

The output is similar to the following:

...
Apply complete! Resources: 10 added, 0 changed, 0 destroyed.

Outputs:

kubectl_connection_command = "gcloud container clusters get-credentials qdrant-cluster --region us-central1"

Terraform creates the following resources:

  • A custom VPC network and private subnet for the Kubernetes nodes.
  • A Cloud Router to access the internet through Network Address Translation (NAT).
  • A private GKE cluster in the us-central1 region with autoscaling enabled (one to two nodes per zone).
  • A ServiceAccount with logging and monitoring permissions for the cluster.
  • Google Cloud Managed Service for Prometheus configuration for cluster monitoring and alerting.

Connect to the cluster

Configure kubectl to fetch credentials and communicate with your new GKE cluster:

gcloud container clusters get-credentials \
${KUBERNETES_CLUSTER_PREFIX}-cluster --location ${REGION}

Deploy the Qdrant database to your cluster

In this tutorial, you deploy the Qdrant database (in distributed mode) and the Stateful HA operator to your GKE cluster using the Helm chart.

The deployment creates a GKE cluster with the following configuration:

  • Three replicas of the Qdrant nodes.
  • Tolerations, node affinities, and topology spread constraints are configured to ensure proper distribution across Kubernetes nodes. This leverages the node pools and different availability zones.
  • A RePD volume with the SSD disk type is provisioned for data storage.
  • A Stateful HA operator is used to manage failover processes and ensure high availability. A StatefulSet is a Kubernetes controller that maintains a persistent unique identity for each of its Pods.
  • For authentication, the database creates a Kubernetes secret containing the API key.

To deploy the Qdrant database by using the Helm chart, follow these steps:

  1. Enable the StatefulHA add-on:

    Autopilot

    GKE automatically enables the StatefulHA add-on at cluster creation.

    Standard

    Run the following command:

    gcloud container clusters update ${KUBERNETES_CLUSTER_PREFIX}-cluster \
    --project=${PROJECT_ID} \
    --location=${REGION} \
    --update-addons=StatefulHA=ENABLED

    It might take 15 minutes for this command to complete and for the cluster to show a ready status.

  2. Add the Qdrant database Helm chart repository before you can deploy it on your GKE cluster:

    helm repo add qdrant https://qdrant.github.io/qdrant-helm
  3. Create the qdrant namespace for the database:

    kubectl create ns qdrant
  4. Apply the manifest to create a regional persistent SSD disk StorageClass:

    kubectl apply -n qdrant -f manifests/01-regional-pd/regional-pd.yaml

    The regional-pd.yaml manifest describes the persistent SSD disk StorageClass:

    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    allowVolumeExpansion: true
    metadata:
      name: ha-regional
    parameters:
      replication-type: regional-pd
      type: pd-ssd
      availability-class: regional-hard-failover
    provisioner: pd.csi.storage.gke.io
    reclaimPolicy: Retain
    volumeBindingMode: WaitForFirstConsumer
  5. Deploy a Kubernetes ConfigMap with a metrics sidecar configuration and a Qdrant cluster by using Helm:

    kubectl apply -n qdrant -f manifests/03-prometheus-metrics/metrics-cm.yaml
    helm install qdrant-database qdrant/qdrant -n qdrant \
    -f manifests/02-values-file/values.yaml

    The metrics-cm.yaml manifest describes the metrics sidecar ConfigMap:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: nginx-conf
    data:
      default.conf.template: |
        server {
          listen 80;
          location / {
            proxy_pass http://localhost:6333/metrics;
            proxy_http_version 1.1;
            proxy_set_header Host $http_host;
            proxy_set_header api-key ${QDRANT_APIKEY};
            proxy_set_header X-Forwarded-For $remote_addr;
          }
        }

    The values.yaml manifest describes the Qdrant cluster configuration:

    replicaCount: 3
    config:
      service:
        enable_tls: false
      cluster:
        enabled: true
      storage:
        optimizers:
          deleted_threshold: 0.5
          vacuum_min_vector_number: 1500
          default_segment_number: 2
          max_segment_size_kb: null
          memmap_threshold_kb: null
          indexing_threshold_kb: 25000
          flush_interval_sec: 5
          max_optimization_threads: 1
    livenessProbe:
      enabled: true
      initialDelaySeconds: 60
    resources:
      limits:
        cpu: "2"
        memory: 4Gi
      requests:
        cpu: "1"
        memory: 4Gi
    tolerations:
      - key: "app.stateful/component"
        operator: "Equal"
        value: "qdrant"
        effect: NoSchedule
    affinity:
      nodeAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1
            preference:
              matchExpressions:
                - key: "app.stateful/component"
                  operator: In
                  values:
                    - "qdrant"
    topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: "topology.kubernetes.io/zone"
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app.kubernetes.io/name: qdrant
            app.kubernetes.io/instance: qdrant
    podDisruptionBudget:
      enabled: true
      maxUnavailable: 1
    persistence:
      accessModes: ["ReadWriteOnce"]
      size: 10Gi
      storageClassName: ha-regional
    apiKey: true
    sidecarContainers:
      - name: metrics
        image: nginx:1.29
        resources:
          requests:
            memory: "128Mi"
            cpu: "250m"
          limits:
            memory: "128Mi"
            cpu: "500m"
        ports:
          - containerPort: 80
        env:
          - name: QDRANT_APIKEY
            valueFrom:
              secretKeyRef:
                name: qdrant-database-apikey
                key: api-key
        volumeMounts:
          - name: nginx-conf
            mountPath: /etc/nginx/templates/default.conf.template
            subPath: default.conf.template
            readOnly: true
    additionalVolumes:
      - name: nginx-conf
        configMap:
          name: nginx-conf
          items:
            - key: default.conf.template
              path: default.conf.template

    This configuration enables the cluster mode, allowing you to set up a highly available and distributed Qdrant cluster.

  6. Add a label to the Qdrant StatefulSet:

    kubectl label statefulset qdrant-database examples.ai.gke.io/source=qdrant-guide -n qdrant
  7. Deploy an internal load balancer to access your Qdrant database that's running in the same VPC as your GKE cluster:

    kubectl apply -n qdrant -f manifests/02-values-file/ilb.yaml

    The ilb.yaml manifest describes the LoadBalancer Service:

    apiVersion: v1
    kind: Service
    metadata:
      annotations:
        #cloud.google.com/neg: '{"ingress": true}'
        networking.gke.io/load-balancer-type: "Internal"
      labels:
        app.kubernetes.io/name: qdrant
      name: qdrant-ilb
    spec:
      ports:
        - name: http
          port: 6333
          protocol: TCP
          targetPort: 6333
        - name: grpc
          port: 6334
          protocol: TCP
          targetPort: 6334
      selector:
        app: qdrant
        app.kubernetes.io/instance: qdrant-database
      type: LoadBalancer
  8. Check the deployment status:

    helm ls -n qdrant

    The output is similar to the following, if the qdrant database is successfully deployed:

    NAME             NAMESPACE  REVISION  UPDATED                                  STATUS    CHART         APP VERSION
    qdrant-database  qdrant     1         2024-02-06 20:21:15.737307567 +0000 UTC  deployed  qdrant-0.7.6  v1.7.4
  9. Wait for GKE to start the required workloads:

    kubectl wait pods -l app.kubernetes.io/instance=qdrant-database --for condition=Ready --timeout=300s -n qdrant

    This command might take a few minutes to complete successfully.

  10. Once GKE starts the workloads, verify that GKE has created the Qdrant workloads:

    kubectl get pod,svc,statefulset,pdb,secret -n qdrant
  11. Start the HighAvailabilityApplication (HAA) resource for Qdrant:

    kubectl apply -n qdrant -f manifests/01-regional-pd/ha-app.yaml

    The ha-app.yaml manifest describes the HighAvailabilityApplication resource:

    kind: HighAvailabilityApplication
    apiVersion: ha.gke.io/v1
    metadata:
      name: qdrant-database
      namespace: qdrant
    spec:
      resourceSelection:
        resourceKind: StatefulSet
      policy:
        storageSettings:
          requireRegionalStorage: true
        failoverSettings:
          forceDeleteStrategy: AfterNodeUnreachable
          afterNodeUnreachable:
            afterNodeUnreachableSeconds: 20 # 60 seconds total

    The following GKE resources are created for the Qdrant cluster:

    • The Qdrant StatefulSet that controls three Pod replicas.
    • A PodDisruptionBudget, ensuring a maximum of one unavailable replica.
    • The qdrant-database Service, exposing the Qdrant port for inbound connections and replication between nodes.
    • The qdrant-database-headless Service, providing the list of running Qdrant Pods.
    • The qdrant-database-apikey Secret, facilitating secure database connection.
    • The Stateful HA operator Pod and HighAvailabilityApplication resource, actively monitoring the Qdrant application. The HighAvailabilityApplication resource defines failover rules to apply against Qdrant.
  12. To check if the failover rules are applied, describe the resource and confirm Status: Message: Application is protected.

    kubectl describe highavailabilityapplication qdrant-database -n qdrant

    The output is similar to the following:

    Status:
      Conditions:
        Last Transition Time:  2023-11-30T09:54:52Z
        Message:               Application is protected
        Observed Generation:   1
        Reason:                ApplicationProtected
        Status:                True
        Type:                  Protected

Run queries with Vertex AI Colab Enterprise notebook

Qdrant organizes vectors and payloads in collections. Vector embedding is a techniquethat represents words or entities as numerical vectors while maintaining theirsemantic relationships. This is important for similarity searches as it enablesfinding similarities based on meaning rather than exact matches, making tasks likesearch and recommendation systems more effective and nuanced.

This section shows you how to upload vectors into a new Qdrant collection and run a search query.

In this example, you use a dataset from a CSV file that contains a list of books in different genres. You create a Colab Enterprise notebook to perform a search query on the Qdrant database.

For more information about Vertex AI Colab Enterprise, see the Colab Enterprise documentation.
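The notebook drives the real Qdrant client; as a self-contained illustration of the collection, payload, and filtered-search concepts it relies on, the following sketch uses plain Python (this is not the qdrant-client API, and the book data is invented):

```python
import math

# Toy stand-in for a Qdrant collection: each point has an id, a vector,
# and a payload with metadata (here, book title and genre).
collection = [
    {"id": 1, "vector": [0.9, 0.1], "payload": {"title": "Dune", "genre": "sci-fi"}},
    {"id": 2, "vector": [0.8, 0.3], "payload": {"title": "Foundation", "genre": "sci-fi"}},
    {"id": 3, "vector": [0.1, 0.9], "payload": {"title": "Julia's Kitchen", "genre": "cooking"}},
]

def search(query, genre, top_k=1):
    def score(point):
        dot = sum(a * b for a, b in zip(query, point["vector"]))
        return dot / (math.hypot(*query) * math.hypot(*point["vector"]))
    # Payload filter first, then rank the survivors by cosine similarity,
    # mirroring how Qdrant combines filters with vector search.
    candidates = [p for p in collection if p["payload"]["genre"] == genre]
    return sorted(candidates, key=score, reverse=True)[:top_k]

hits = search([0.85, 0.2], genre="sci-fi")
print(hits[0]["payload"]["title"])
```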

Create a runtime template

To create a Colab Enterprise runtime template:

  1. In the Google Cloud console, go to the Colab Enterprise Runtime Templates page and make sure your project is selected:

    Go to Runtime Templates

  2. Click New Template. The Create new runtime template page appears.

  3. In the Runtime basics section:

    • In the Display name field, enter qdrant-connect.
    • In the Region drop-down list, select us-central1. It's the same region as your GKE cluster.
  4. In the Configure compute section:

    • In the Machine type drop-down list, select e2-standard-2.
    • In the Disk size field, enter 30.
  5. In the Networking and security section:

    • In the Network drop-down list, select the network where your GKE cluster resides.
    • In the Subnetwork drop-down list, select a corresponding subnetwork.
    • Clear the Enable public internet access checkbox.
  6. To finish creating the runtime template, click Create. Your runtime template appears in the list on the Runtime templates tab.

Create a runtime

To create a Colab Enterprise runtime:

  1. In the runtime templates list, for the template you just created, in the Actions column, click the actions menu and then click Create runtime. The Create Vertex AI Runtime pane appears.

  2. To create a runtime based on your template, click Create.

  3. On the Runtimes tab that opens, wait for the status to transition to Healthy.

Import the notebook

To import the notebook in Colab Enterprise:

  1. Go to the My Notebooks tab and click Import. The Import notebooks pane appears.

  2. In Import source, select URL.

  3. Under Notebook URLs, enter the following link:

    https://raw.githubusercontent.com/GoogleCloudPlatform/kubernetes-engine-samples/refs/heads/main/databases/qdrant/manifests/04-notebook/vector-database.ipynb
  4. ClickImport.

Connect to the runtime and run queries

To connect to the runtime and run queries:

  1. In the notebook, next to the Connect button, click Additional connection options. The Connect to Vertex AI Runtime pane appears.

  2. Select Connect to a runtime and then select Connect to an existing Runtime.

  3. Select the runtime that you launched and click Connect.

  4. To run the notebook cells, click the Run cell button next to each code cell.

The notebook contains both code cells and text that describes each code block. Running a code cell executes its commands and displays an output. You can run the cells in order, or run individual cells as needed.

View Prometheus metrics for your cluster

The GKE cluster is configured with Google Cloud Managed Service for Prometheus, which enables collection of metrics in the Prometheus format. This service provides a fully managed solution for monitoring and alerting, allowing for collection, storage, and analysis of metrics from the cluster and its applications.

The following diagram shows how Prometheus collects metrics for your cluster:

Prometheus metrics collection

The GKE private cluster in the diagram contains the following components:

  • Qdrant Pods that expose metrics on the path / and port 80. These metrics are provided by the sidecar container named metrics.
  • Prometheus-based collectors that process the metrics from the Qdrant Pods.
  • A PodMonitoring resource that sends the metrics to Cloud Monitoring.
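The metrics the sidecar serves use the Prometheus text exposition format (name, optional labels, then a value). As a sketch of what the collectors consume, the parser below handles a small sample; the metric names and values in the sample are illustrative, not actual Qdrant output:

```python
# Illustrative sample in Prometheus text exposition format.
sample = """\
# HELP collections_total Number of collections
# TYPE collections_total gauge
collections_total 1
rest_responses_total{method="GET",status="200"} 42
"""

def parse_metrics(text):
    metrics = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE metadata
        name, value = line.rsplit(" ", 1)  # split name+labels from the value
        metrics[name] = float(value)
    return metrics

print(parse_metrics(sample))
```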

To export and view the metrics, follow these steps:

  1. Create the PodMonitoring resource to scrape metrics by labelSelector:

    kubectl apply -n qdrant -f manifests/03-prometheus-metrics/pod-monitoring.yaml

    The pod-monitoring.yaml manifest describes the PodMonitoring resource:

    apiVersion: monitoring.googleapis.com/v1
    kind: PodMonitoring
    metadata:
      name: qdrant
    spec:
      selector:
        matchLabels:
          app: qdrant
          app.kubernetes.io/instance: qdrant-database
      endpoints:
        - port: 80
          interval: 30s
          path: /
  2. Create a Cloud Monitoring dashboard with the configurations defined in dashboard.json:

    gcloud --project "${PROJECT_ID}" monitoring dashboards create --config-from-file monitoring/dashboard.json
  3. After the command runs successfully, go to the Cloud Monitoring Dashboards page:

    Go to Dashboards overview

  4. From the list of dashboards, open the Qdrant Overview dashboard. It might take 1-2 minutes to collect and display metrics.

    The dashboard shows a count of key metrics:

    • Collections
    • Embedded vectors
    • Pending operations
    • Running nodes

Back up your cluster configuration

The Backup for GKE feature lets you schedule regular backups of your entire GKE cluster configuration, including the deployed workloads and their data.

In this tutorial, you configure a backup plan for your GKE cluster to perform backups of all workloads, including Secrets and Volumes, every day at 3 AM. To ensure efficient storage management, backups older than three days are automatically deleted.

To configure Backup plans, follow these steps:

  1. Enable the Backup for GKE feature for your cluster:

    gcloud container clusters update ${KUBERNETES_CLUSTER_PREFIX}-cluster \
    --project=${PROJECT_ID} \
    --location=${REGION} \
    --update-addons=BackupRestore=ENABLED
  2. Create a backup plan with a daily schedule for all namespaces within the cluster:

    gcloud beta container backup-restore backup-plans create ${KUBERNETES_CLUSTER_PREFIX}-cluster-backup \
    --project=${PROJECT_ID} \
    --location=${REGION} \
    --cluster="projects/${PROJECT_ID}/locations/${REGION}/clusters/${KUBERNETES_CLUSTER_PREFIX}-cluster" \
    --all-namespaces \
    --include-secrets \
    --include-volume-data \
    --cron-schedule="0 3 * * *" \
    --backup-retain-days=3

    The command uses the relevant environment variables at runtime.

    The cluster name's format is relative to your project and region as follows:

    projects/PROJECT_ID/locations/REGION/clusters/CLUSTER_NAME

    When prompted, type y. The output is similar to the following:

    Create request issued for: [qdrant-cluster-backup]
    Waiting for operation [projects/PROJECT_ID/locations/us-central1/operations/operation-1706528750815-610142ffdc9ac-71be4a05-f61c99fc] to complete...

    This operation might take a few minutes to complete successfully. After the execution is complete, the output is similar to the following:

    Created backup plan [qdrant-cluster-backup].
  3. You can see your newly created backup plan qdrant-cluster-backup listed on the Backup for GKE console.

    Go to Backup for GKE

If you want to restore the saved backup configurations, see Restore a backup.

Clean up

To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources.

Delete the project

The easiest way to avoid billing is to delete the project you created forthis tutorial.

Caution: Deleting a project has the following effects:
  • Everything in the project is deleted. If you used an existing project for the tasks in this document, when you delete it, you also delete any other work you've done in the project.
  • Custom project IDs are lost. When you created this project, you might have created a custom project ID that you want to use in the future. To preserve the URLs that use the project ID, such as an appspot.com URL, delete selected resources inside the project instead of deleting the whole project.

If you plan to explore multiple architectures, tutorials, or quickstarts, reusing projects can help you avoid exceeding project quota limits.

Delete a Google Cloud project:

gcloud projects delete PROJECT_ID

If you deleted the project, your clean up is complete. If you didn't delete the project, proceed to delete the individual resources.

Delete individual resources

  1. Set environment variables.

    export PROJECT_ID=${PROJECT_ID}
    export KUBERNETES_CLUSTER_PREFIX=qdrant
    export REGION=us-central1
  2. Run the terraform destroy command:

    export GOOGLE_OAUTH_ACCESS_TOKEN=$(gcloud auth print-access-token)
    terraform -chdir=terraform/FOLDER destroy \
    -var project_id=${PROJECT_ID} \
    -var region=${REGION} \
    -var cluster_prefix=${KUBERNETES_CLUSTER_PREFIX}

    Replace FOLDER with either gke-autopilot or gke-standard, depending on the type of GKE cluster you created.

    When prompted, type yes.

  3. Find all unattached disks:

    export disk_list=$(gcloud compute disks list --filter="-users:* AND labels.name=${KUBERNETES_CLUSTER_PREFIX}-cluster" --format "value[separator=|](name,region)")
  4. Delete the disks:

    for i in $disk_list; do
      disk_name=$(echo $i | cut -d'|' -f1)
      disk_region=$(echo $i | cut -d'|' -f2 | sed 's|.*/||')
      echo "Deleting $disk_name"
      gcloud compute disks delete $disk_name --region $disk_region --quiet
    done
  5. Delete the cloned repository:

    rm -r ~/kubernetes-engine-samples/

What's next

Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-10-30 UTC.