Perform semantic search and retrieval-augmented generation

This tutorial guides you through the end-to-end process of creatingand usingtext embeddingsfor semantic search andretrieval-augmented generation (RAG).

This tutorial covers the following tasks:

This tutorial uses the BigQuery public tablepatents-public-data.google_patents_research.publications.

Required roles

To run this tutorial, you need the following Identity and Access Management (IAM)roles:

  • Create and use BigQuery datasets, connections, and models:BigQuery Admin (roles/bigquery.admin).
  • Grant permissions to the connection's service account: Project IAM Admin(roles/resourcemanager.projectIamAdmin).

These predefined roles contain the permissions required to perform the tasks inthis document. To see the exact permissions that are required, expand theRequired permissions section:

Required permissions

  • Create a dataset:bigquery.datasets.create
  • Create, delegate, and use a connection:bigquery.connections.*
  • Set the default connection:bigquery.config.*
  • Set service account permissions:resourcemanager.projects.getIamPolicy andresourcemanager.projects.setIamPolicy
  • Create a model and run inference:
    • bigquery.jobs.create
    • bigquery.models.create
    • bigquery.models.getData
    • bigquery.models.updateData
    • bigquery.models.updateMetadata

You might also be able to get these permissions withcustom roles or otherpredefined roles.

Costs

In this document, you use the following billable components of Google Cloud:

  • BigQuery ML: You incur costs for the data that you process in BigQuery.
  • Vertex AI: You incur costs for calls to the Vertex AI service that's represented by the remote model.

To generate a cost estimate based on your projected usage, use thepricing calculator.

New Google Cloud users might be eligible for afree trial.

For more information about BigQuery pricing, seeBigQuery pricing inthe BigQuery documentation.

For more information about Vertex AI pricing, see theVertex AI pricingpage.

Before you begin

  1. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Roles required to select or create a project

    • Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
    • Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains theresourcemanager.projects.create permission.Learn how to grant roles.
    Note: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.

    Go to project selector

  2. Verify that billing is enabled for your Google Cloud project.

  3. Enable the BigQuery, BigQuery Connection, and Vertex AI APIs.

    Roles required to enable APIs

    To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains theserviceusage.services.enable permission.Learn how to grant roles.

    Enable the APIs

Create a dataset

Create a BigQuery dataset to store your ML model.

Console

  1. In the Google Cloud console, go to theBigQuery page.

    Go to the BigQuery page

  2. In theExplorer pane, click your project name.

  3. ClickView actions > Create dataset

  4. On theCreate dataset page, do the following:

    • ForDataset ID, enterbqml_tutorial.

    • ForLocation type, selectMulti-region, and then selectUS (multiple regions in United States).

    • Leave the remaining default settings as they are, and clickCreate dataset.

bq

To create a new dataset, use thebq mk commandwith the--location flag. For a full list of possible parameters, see thebq mk --dataset commandreference.

  1. Create a dataset namedbqml_tutorial with the data location set toUSand a description ofBigQuery ML tutorial dataset:

    bq --location=US mk -d \ --description "BigQuery ML tutorial dataset." \ bqml_tutorial

    Instead of using the--dataset flag, the command uses the-d shortcut.If you omit-d and--dataset, the command defaults to creating adataset.

  2. Confirm that the dataset was created:

    bqls

API

Call thedatasets.insertmethod with a defineddataset resource.

{"datasetReference":{"datasetId":"bqml_tutorial"}}

BigQuery DataFrames

Before trying this sample, follow the BigQuery DataFrames setup instructions in theBigQuery quickstart using BigQuery DataFrames. For more information, see theBigQuery DataFrames reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, seeSet up ADC for a local development environment.

importgoogle.cloud.bigquerybqclient=google.cloud.bigquery.Client()bqclient.create_dataset("bqml_tutorial",exists_ok=True)

Create the remote model for text embedding generation

Create a remote model that represents a hosted Vertex AItext embedding generation model:

  1. In the Google Cloud console, go to theBigQuery page.

    Go to BigQuery

  2. In the query editor, run the following statement:

    CREATEORREPLACEMODEL`bqml_tutorial.embedding_model`REMOTEWITHCONNECTIONDEFAULTOPTIONS(ENDPOINT='text-embedding-005');

    The query takes several seconds to complete, after which the modelembedding_model can be accessed through theExplorer pane.Because the query uses aCREATE MODEL statement to create a model, thereare no query results.

Generate text embeddings

Generate text embeddings from patent abstracts using theAI.GENERATE_EMBEDDING function,and then write them to a BigQuery table so that they can besearched.

  1. In the Google Cloud console, go to theBigQuery page.

    Go to BigQuery

  2. In the query editor, run the following statement:

    CREATEORREPLACETABLE`bqml_tutorial.embeddings`ASSELECT*FROMAI.GENERATE_EMBEDDING(MODEL`bqml_tutorial.embedding_model`,(SELECT*,abstractAScontentFROM`patents-public-data.google_patents_research.publications`WHERELENGTH(abstract)>0ANDLENGTH(title)>0ANDcountry='Singapore'))WHERELENGTH(status)=0;

This query takes approximately 5 minutes to complete.

Embedding generation using theAI.GENERATE_EMBEDDING functionmight fail due to Vertex AI LLMquotasor service unavailability. Error details are returned in thestatus column. An emptystatuscolumn indicates successful embedding generation.

For alternative text embedding generation methods in BigQuery,see theEmbed text with pretrained TensorFlow models tutorial.

Create a vector index

If you create a vector index on an embedding column, a vector search performedon that column uses theApproximate Nearest Neighborsearch technique. This technique improves vector search performance, with thetrade-off of reducingrecalland so returning more approximate results.

To create a vector index, use theCREATE VECTOR INDEXdata definition language (DDL) statement:

  1. Go to theBigQuery page.

    Go to BigQuery

  2. In the query editor, run the following SQL statement:

    CREATEORREPLACEVECTORINDEXmy_indexON`bqml_tutorial.embeddings`(embedding)OPTIONS(index_type='IVF',distance_type='COSINE',ivf_options='{"num_lists":500}')

Creating a vector index typically takes only a few seconds. It takes another2 or 3 minutes for the vector index to be populated and ready to use.

Verify vector index readiness

The vector index is populated asynchronously. You can check whether the index isready to be used by querying theINFORMATION_SCHEMA.VECTOR_INDEXES viewand verifying that thecoverage_percentage column value is greater than0and thelast_refresh_time column value isn'tNULL.

  1. Go to theBigQuery page.

    Go to BigQuery

  2. In the query editor, run the following SQL statement:

    SELECTtable_name,index_name,index_status,coverage_percentage,last_refresh_time,disable_reasonFROM`PROJECT_ID.bqml_tutorial.INFORMATION_SCHEMA.VECTOR_INDEXES`

    ReplacePROJECT_ID with your project ID.

Perform a text similarity search using the vector index

Use theVECTOR_SEARCH functionto search for relevant patents that match embeddings generated from atext query.

Thetop_k argument determines the number of matches to return,in this case five. Thefraction_lists_to_search option determines thepercentage of vector index lists to search.The vector index you created has 500 lists, sothefraction_lists_to_search value of.01 indicates that this vector searchscans five of those lists. A lowerfraction_lists_to_search value as shown hereprovides lowerrecalland faster performance. For more information about vector index lists, seethenum_listsvector index option.

The model you use to generate the embeddings in this query must bethe same as the one you use to generate the embeddings in the table you arecomparing against, otherwise the search results won't be accurate.

  1. Go to theBigQuery page.

    Go to BigQuery

  2. In the query editor, run the following SQL statement:

    SELECTquery.query,base.publication_number,base.title,base.abstractFROMVECTOR_SEARCH(TABLE`bqml_tutorial.embeddings`,'embedding',(SELECTembedding,contentASqueryFROMAI.GENERATE_EMBEDDING(MODEL`bqml_tutorial.embedding_model`,(SELECT'improving password security'AScontent))),top_k=>5,options=>'{"fraction_lists_to_search": 0.01}')

    The output is similar to the following:

    +-----------------------------+--------------------+-------------------------------------------------+-------------------------------------------------+|            query            | publication_number |                       title                     |                      abstract                   |+-----------------------------+--------------------+-------------------------------------------------+-------------------------------------------------+| improving password security | SG-120868-A1       | Data storage device security method and a...    | Methods for improving security in data stora... || improving password security | SG-10201610585W-A  | Passsword management system and process...      | PASSSWORD MANAGEMENT SYSTEM AND PROCESS ...     || improving password security | SG-148888-A1       | Improved system and method for...               | IMPROVED SYSTEM AND METHOD FOR RANDOM...        || improving password security | SG-194267-A1       | Method and system for protecting a password...  | A system for providing security for a...        || improving password security | SG-120868-A1       | Data storage device security...                 | Methods for improving security in data...       |+-----------------------------+--------------------+-------------------------------------------------+-------------------------------------------------+

Create the remote model for text generation

Create a remote model that represents a hosted Vertex AItext generation model:

  1. In the Google Cloud console, go to theBigQuery page.

    Go to BigQuery

  2. In the query editor, run the following statement:

    CREATEORREPLACEMODEL`bqml_tutorial.text_model`REMOTEWITHCONNECTIONDEFAULTOPTIONS(ENDPOINT='gemini-2.0-flash-001');

    The query takes several seconds to complete, after which the modeltext_model can be accessed through theExplorer pane.Because the query uses aCREATE MODEL statement to create a model, thereare no query results.

Generate text augmented by vector search results

Feed the search results as prompts to generate text with theAI.GENERATE_TEXT function

  1. In the Google Cloud console, go to theBigQuery page.

    Go to BigQuery

  2. In the query editor, run the following statement:

    SELECTresultASgenerated,promptFROMAI.GENERATE_TEXT(MODEL`bqml_tutorial.text_model`,(SELECTCONCAT('Propose some project ideas to improve user password security using the context below: ',STRING_AGG(FORMAT("patent title: %s, patent abstract: %s",base.title,base.abstract),',\n'))ASprompt,FROMVECTOR_SEARCH(TABLE`bqml_tutorial.embeddings`,'embedding',(SELECTembedding,contentASqueryFROMAI.GENERATE_EMBEDDING(MODEL`bqml_tutorial.embedding_model`,(SELECT'improving password security'AScontent))),top_k=>5,options=>'{"fraction_lists_to_search": 0.01}')),STRUCT(600ASmax_output_tokens));

    The output is similar to the following:

    +------------------------------------------------+------------------------------------------------------------+|            generated                           | prompt                                                     |+------------------------------------------------+------------------------------------------------------------+| These patents suggest several project ideas to | Propose some project ideas to improve user password        || improve user password security.  Here are      | security using the context below: patent title: Active     || some, categorized by the patent they build     | new password entry dialog with compact visual indication   || upon:                                          | of adherence to password policy, patent abstract:          ||                                                | An active new password entry dialog provides a compact     || **I. Projects based on "Active new password    | visual indication of adherence to password policies. A     || entry dialog with compact visual indication of | visual indication of progress towards meeting all          || adherence to password policy":**               | applicable password policies is included in the display    ||                                                | and updated as new password characters are being...        |+------------------------------------------------+------------------------------------------------------------+

Clean up

    Caution: Deleting a project has the following effects:
    • Everything in the project is deleted. If you used an existing project for the tasks in this document, when you delete it, you also delete any other work you've done in the project.
    • Custom project IDs are lost. When you created this project, you might have created a custom project ID that you want to use in the future. To preserve the URLs that use the project ID, such as anappspot.com URL, delete selected resources inside the project instead of deleting the whole project.

    If you plan to explore multiple architectures, tutorials, or quickstarts, reusing projects can help you avoid exceeding project quota limits.

  1. In the Google Cloud console, go to theManage resources page.

    Go to Manage resources

  2. In the project list, select the project that you want to delete, and then clickDelete.
  3. In the dialog, type the project ID, and then clickShut down to delete the project.

What's next

Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-16 UTC.