Deploy a PostgreSQL vector database on GKE

This tutorial shows you how to deploy a PostgreSQL vector database cluster on Google Kubernetes Engine (GKE).

PostgreSQL comes with a range of modules and extensions that extend the database's functionality. In this tutorial, you install the pgvector extension on an existing PostgreSQL cluster deployed to GKE. The pgvector extension lets you store vectors in database tables by adding vector types to PostgreSQL. It also lets you run similarity searches using common SQL queries.
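As an illustration of the distance semantics behind those similarity searches (the vectors here are hypothetical; pgvector computes these distances inside PostgreSQL), the two most common pgvector operators can be mirrored in plain Python:

```python
import math

# Hypothetical embeddings, as they might be stored in a vector(3) column.
a = [1.0, 2.0, 3.0]
b = [3.0, 1.0, 2.0]

def l2_distance(u, v):
    # Same semantics as pgvector's <-> (Euclidean distance) operator.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def cosine_distance(u, v):
    # Same semantics as pgvector's <=> operator: 1 - cosine similarity.
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / norm

print(l2_distance(a, b))      # ~2.449 (sqrt(6))
print(cosine_distance(a, b))  # ~0.214 (3/14)
```

In SQL, the same computation is expressed as `ORDER BY embedding <-> '[3,1,2]'` (or `<=>` for cosine distance) rather than in application code.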

We simplify the pgvector extension deployment by first deploying the CloudNativePG operator, because the operator provides a bundled version of the extension.

This tutorial is intended for cloud platform administrators and architects, ML engineers, and MLOps (DevOps) professionals interested in deploying PostgreSQL database clusters on GKE.

Objectives

In this tutorial, you learn how to:

  • Deploy GKE infrastructure for PostgreSQL.
  • Install the pgvector extension on the PostgreSQL cluster deployed to GKE.
  • Deploy and configure the CloudNativePG PostgreSQL operator with Helm.
  • Upload a demo dataset and run search queries with Jupyter Notebook.

Costs

In this document, you use the following billable components of Google Cloud:

To generate a cost estimate based on your projected usage, use the pricing calculator.

New Google Cloud users might be eligible for a free trial.

When you finish the tasks that are described in this document, you can avoid continued billing by deleting the resources that you created. For more information, see Clean up.

Before you begin

In this tutorial, you use Cloud Shell to run commands. Cloud Shell is a shell environment for managing resources hosted on Google Cloud. It comes preinstalled with the Google Cloud CLI, kubectl, Helm, and Terraform command-line tools. If you don't use Cloud Shell, you must install the Google Cloud CLI.

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. Install the Google Cloud CLI.

    Note: If you installed the gcloud CLI previously, make sure you have the latest version by running gcloud components update.
  3. If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

  4. To initialize the gcloud CLI, run the following command:

    gcloud init
  5. Create or select a Google Cloud project.

    Roles required to select or create a project

    • Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
    • Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
    Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.
    • Create a Google Cloud project:

      gcloud projects create PROJECT_ID

      Replace PROJECT_ID with a name for the Google Cloud project you are creating.

    • Select the Google Cloud project that you created:

      gcloud config set project PROJECT_ID

      Replace PROJECT_ID with your Google Cloud project name.

  6. Verify that billing is enabled for your Google Cloud project.

  7. Enable the Cloud Resource Manager, Compute Engine, GKE, and IAM Service Account Credentials APIs:

    Roles required to enable APIs

    To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

    gcloud services enable cloudresourcemanager.googleapis.com compute.googleapis.com container.googleapis.com iamcredentials.googleapis.com
  8. Grant roles to your user account. Run the following command once for each of the following IAM roles: roles/compute.securityAdmin, roles/compute.viewer, roles/container.clusterAdmin, roles/container.admin, roles/iam.serviceAccountAdmin, roles/iam.serviceAccountUser

    gcloud projects add-iam-policy-binding PROJECT_ID --member="user:USER_IDENTIFIER" --role=ROLE

    Replace the following:

    • PROJECT_ID: Your project ID.
    • USER_IDENTIFIER: The identifier for your user account. For example, myemail@example.com.
    • ROLE: The IAM role that you grant to your user account.

Set up your environment

To set up your environment with Cloud Shell, follow these steps:

  1. Set environment variables for your project, region, and a Kubernetescluster resource prefix:

    export PROJECT_ID=PROJECT_ID
    export KUBERNETES_CLUSTER_PREFIX=postgres
    export REGION=us-central1
    • Replace PROJECT_ID with your Google Cloud project ID.

    This tutorial uses the us-central1 region.

  2. Clone the sample code repository from GitHub:

    git clone https://github.com/GoogleCloudPlatform/kubernetes-engine-samples
  3. Navigate to the postgres-pgvector directory:

    cd kubernetes-engine-samples/databases/postgres-pgvector

Create your cluster infrastructure

In this section, you run a Terraform script to create a private, highly available, regional GKE cluster to deploy your PostgreSQL database.

You can choose to deploy PostgreSQL using a Standard or Autopilot cluster. Each has its own advantages and different pricing models.

Autopilot

To deploy the Autopilot cluster infrastructure, run the following commands in the Cloud Shell:

export GOOGLE_OAUTH_ACCESS_TOKEN=$(gcloud auth print-access-token)
terraform -chdir=../postgresql-cloudnativepg/terraform/gke-autopilot init
terraform -chdir=../postgresql-cloudnativepg/terraform/gke-autopilot apply \
  -var project_id=${PROJECT_ID} \
  -var region=${REGION} \
  -var cluster_prefix=${KUBERNETES_CLUSTER_PREFIX}

GKE replaces the following variables at runtime:

  • GOOGLE_OAUTH_ACCESS_TOKEN uses the gcloud auth print-access-token command to retrieve an access token that authenticates interactions with various Google Cloud APIs.
  • PROJECT_ID, REGION, and KUBERNETES_CLUSTER_PREFIX are the environment variables defined in the Set up your environment section and assigned to the new relevant variables for the Autopilot cluster you are creating.

When prompted, type yes.

Terraform creates the following resources:

  • A custom VPC network and private subnet for the Kubernetes nodes.
  • A Cloud Router to access the internet through Network Address Translation (NAT).
  • A private GKE cluster in the us-central1 region.
  • A ServiceAccount with logging and monitoring permissions for the cluster.
  • Google Cloud Managed Service for Prometheus configuration for cluster monitoring and alerting.

The output is similar to the following:

...
Apply complete! Resources: 11 added, 0 changed, 0 destroyed.
...

Standard

To deploy the Standard cluster infrastructure, run the following commands in the Cloud Shell:

export GOOGLE_OAUTH_ACCESS_TOKEN=$(gcloud auth print-access-token)
terraform -chdir=../postgresql-cloudnativepg/terraform/gke-standard init
terraform -chdir=../postgresql-cloudnativepg/terraform/gke-standard apply \
  -var project_id=${PROJECT_ID} \
  -var region=${REGION} \
  -var cluster_prefix=${KUBERNETES_CLUSTER_PREFIX}

GKE replaces the following variables at runtime:

  • GOOGLE_OAUTH_ACCESS_TOKEN uses the gcloud auth print-access-token command to retrieve an access token that authenticates interactions with various Google Cloud APIs.
  • PROJECT_ID, REGION, and KUBERNETES_CLUSTER_PREFIX are the environment variables defined in the Set up your environment section and assigned to the new relevant variables for the Standard cluster that you are creating.

When prompted, type yes. It might take several minutes for these commands to complete and for the cluster to show a ready status.

Terraform creates the following resources:

  • A custom VPC network and private subnet for the Kubernetes nodes.
  • A Cloud Router to access the internet through Network Address Translation (NAT).
  • A private GKE cluster in the us-central1 region with autoscaling enabled (one to two nodes per zone).
  • A ServiceAccount with logging and monitoring permissions for the cluster.
  • Google Cloud Managed Service for Prometheus configuration for cluster monitoring and alerting.

The output is similar to the following:

...
Apply complete! Resources: 14 added, 0 changed, 0 destroyed.
...

Connect to the cluster

Configure kubectl to fetch credentials and communicate with your new GKE cluster:

gcloud container clusters get-credentials \
  ${KUBERNETES_CLUSTER_PREFIX}-cluster --location ${REGION} --project ${PROJECT_ID}

Deploy the CloudNativePG operator

Deploy the CloudNativePG operator to your Kubernetes cluster using a Helm chart:

  1. Check the version of Helm:

    helm version

    Update the version if it's older than 3.13:

    curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
  2. Add the CloudNativePG operator Helm Chart repository:

    helm repo add cnpg https://cloudnative-pg.github.io/charts
  3. Deploy the CloudNativePG operator using the Helm command-line tool:

    helm upgrade --install cnpg \
      --namespace cnpg-system \
      --create-namespace \
      cnpg/cloudnative-pg

    The output is similar to the following:

    Release "cnpg" does not exist. Installing it now.
    NAME: cnpg
    LAST DEPLOYED: Fri Oct 13 13:52:36 2023
    NAMESPACE: cnpg-system
    STATUS: deployed
    REVISION: 1
    TEST SUITE: None
    ...

Deploy the PostgreSQL vector database

In this section, you deploy the PostgreSQL vector database.

  1. Create a namespace pg-ns for the database:

    kubectl create ns pg-ns
  2. Apply the manifest to deploy the PostgreSQL cluster. The cluster manifest enables the pgvector extension.

    kubectl apply -n pg-ns -f manifests/01-basic-cluster/postgreSQL_cluster.yaml

    The postgreSQL_cluster.yaml manifest describes the CloudNativePG Cluster resource:

    apiVersion: postgresql.cnpg.io/v1
    kind: Cluster
    metadata:
      name: gke-pg-cluster
    spec:
      description: "Standard GKE PostgreSQL cluster"
      imageName: ghcr.io/cloudnative-pg/postgresql:16.2
      enableSuperuserAccess: true
      instances: 3
      startDelay: 300
      primaryUpdateStrategy: unsupervised
      postgresql:
        pg_hba:
          - host all all 10.48.0.0/20 md5
      bootstrap:
        initdb:
          postInitTemplateSQL:
            - CREATE EXTENSION IF NOT EXISTS vector;
          database: app
      storage:
        storageClass: premium-rwo
        size: 2Gi
      resources:
        requests:
          memory: "1Gi"
          cpu: "1000m"
        limits:
          memory: "1Gi"
          cpu: "1000m"
      affinity:
        enablePodAntiAffinity: true
        tolerations:
          - key: cnpg.io/cluster
            effect: NoSchedule
            value: gke-pg-cluster
            operator: Equal
        additionalPodAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 1
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: app.component
                      operator: In
                      values:
                        - "pg-cluster"
                topologyKey: topology.kubernetes.io/zone
      monitoring:
        enablePodMonitor: true
  3. Check the status of the cluster:

    kubectl get cluster -n pg-ns --watch

    Wait for the output to show a status of Cluster in healthy state before you move to the next step.

Run queries with a Vertex AI Colab Enterprise notebook

In this section, you upload vectors into a PostgreSQL table and run semantic search queries by using SQL syntax.
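To make the SQL side concrete: pgvector accepts vectors as bracketed text literals such as '[1.0,2.0,3.0]'. The following sketch shows how a client might format a vector and build a nearest-neighbor query; the helper names, table, and column are illustrative, not taken from the notebook:

```python
def to_pgvector_literal(values):
    # pgvector parses vectors from a '[v1,v2,...]' text literal.
    return "[" + ",".join(str(float(v)) for v in values) + "]"

def knn_query(table, column, k):
    # Parameterized query skeleton for a k-nearest-neighbor search;
    # the vector literal is bound to the %s placeholder at execution time.
    return f"SELECT id FROM {table} ORDER BY {column} <-> %s::vector LIMIT {k}"

print(to_pgvector_literal([1, 2, 3]))        # [1.0,2.0,3.0]
print(knn_query("documents", "embedding", 5))
```

With a driver such as psycopg, the literal would be passed as the query parameter, keeping the vector data out of the SQL string itself.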

You connect to your PostgreSQL database by using Colab Enterprise. You use a dedicated runtime template to deploy to the postgres-vpc, so the notebook can communicate with resources in the GKE cluster.

For more information about Vertex AI Colab Enterprise, see the Colab Enterprise documentation.

Create a runtime template

To create a Colab Enterprise runtime template:

  1. In the Google Cloud console, go to the Colab Enterprise Runtime Templates page and make sure your project is selected:

    Go to Runtime Templates

  2. Click New Template. The Create new runtime template page appears.

  3. In the Runtime basics section:

    • In the Display name field, enter pgvector-connect.
    • In the Region drop-down list, select us-central1. It's the same region as your GKE cluster.
  4. In the Configure compute section:

    • In the Machine type drop-down list, select e2-standard-2.
    • In the Disk size field, enter 30.
  5. In the Networking and security section:

    • In the Network drop-down list, select the network where your GKE cluster resides.
    • In the Subnetwork drop-down list, select a corresponding subnetwork.
    • Clear the Enable public internet access checkbox.
  6. To finish creating the runtime template, click Create. Your runtime template appears in the list on the Runtime templates tab.

Create a runtime

To create a Colab Enterprise runtime:

  1. In the runtime templates list, for the template you just created, in the Actions column, click the actions menu, and then click Create runtime. The Create Vertex AI Runtime pane appears.

  2. To create a runtime based on your template, clickCreate.

  3. On the Runtimes tab that opens, wait for the status to transition to Healthy.

Import the notebook

To import the notebook in Colab Enterprise:

  1. Go to the My Notebooks tab and click Import. The Import notebooks pane appears.

  2. In Import source, select URL.

  3. Under Notebook URLs, enter the following link:

    https://raw.githubusercontent.com/epam/kubernetes-engine-samples/internal_lb/databases/postgres-pgvector/manifests/02-notebook/vector-database.ipynb
  4. ClickImport.

Connect to the runtime and run queries

To connect to the runtime and run queries:

  1. In the notebook, next to the Connect button, click Additional connection options. The Connect to Vertex AI Runtime pane appears.

  2. Select Connect to a runtime, and then select Connect to an existing Runtime.

  3. Select the runtime that you launched and click Connect.

  4. To run the notebook cells, click the Run cell button next to each code cell.

The notebook contains both code cells and text that describes each code block. Running a code cell executes its commands and displays an output. You can run the cells in order, or run individual cells as needed.
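The semantic search that the notebook expresses in SQL can be sketched in plain Python: rank stored embeddings by cosine distance to a query embedding and return the closest matches. The corpus, document names, and query vector below are illustrative, not data from the notebook:

```python
import math

def cosine_distance(u, v):
    # 1 - cosine similarity, matching pgvector's <=> operator.
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / (nu * nv)

# Hypothetical document embeddings, as they might sit in a vector column.
corpus = {
    "doc-a": [0.9, 0.1, 0.0],
    "doc-b": [0.0, 1.0, 0.0],
    "doc-c": [0.7, 0.7, 0.1],
}

def top_k(query, k=2):
    # Equivalent in spirit to:
    #   SELECT id FROM documents ORDER BY embedding <=> query LIMIT k
    ranked = sorted(corpus, key=lambda d: cosine_distance(query, corpus[d]))
    return ranked[:k]

print(top_k([1.0, 0.0, 0.0]))  # ['doc-a', 'doc-c']
```

In the database, pgvector performs this ranking server-side, optionally backed by an approximate index, so only the top matches cross the network.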

Clean up

To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources.

Delete the project

The easiest way to avoid billing is to delete the project you created for this tutorial.

Caution: Deleting a project has the following effects:
  • Everything in the project is deleted. If you used an existing project for the tasks in this document, when you delete it, you also delete any other work you've done in the project.
  • Custom project IDs are lost. When you created this project, you might have created a custom project ID that you want to use in the future. To preserve the URLs that use the project ID, such as an appspot.com URL, delete selected resources inside the project instead of deleting the whole project.

If you plan to explore multiple architectures, tutorials, or quickstarts, reusing projects can help you avoid exceeding project quota limits.

Delete a Google Cloud project:

gcloud projects delete PROJECT_ID

If you deleted the project, your cleanup is complete. If you didn't delete the project, proceed to delete the individual resources.

Delete individual resources

  1. Set environment variables.

    export PROJECT_ID=${PROJECT_ID}
    export KUBERNETES_CLUSTER_PREFIX=postgres
    export REGION=us-central1
  2. Run the terraform destroy command:

    export GOOGLE_OAUTH_ACCESS_TOKEN=$(gcloud auth print-access-token)
    terraform -chdir=../postgresql-cloudnativepg/terraform/FOLDER destroy \
      -var project_id=${PROJECT_ID} \
      -var region=${REGION} \
      -var cluster_prefix=${KUBERNETES_CLUSTER_PREFIX}

    Replace FOLDER with either gke-autopilot or gke-standard, depending on the type of GKE cluster you created.

    When prompted, type yes.

What's next

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-10-30 UTC.