Introduction to embeddings and vector search

This document provides an overview of embeddings and vector search in BigQuery. Vector search is a technique for comparing similar objects by using embeddings, and it is used to power Google products, including Google Search, YouTube, and Google Play. You can use vector search to perform searches at scale. When you use vector indexes with vector search, you can take advantage of foundational technologies like inverted file indexing (IVF) and the ScaNN algorithm.

Vector search is built on embeddings. Embeddings are high-dimensional numerical vectors that represent a given entity, like a piece of text or an audio file. Machine learning (ML) models use embeddings to encode semantics about such entities to make it easier to reason about and compare them. For example, a common operation in clustering, classification, and recommendation models is to measure the distance between vectors in an embedding space to find items that are most semantically similar.

This concept of semantic similarity and distance in an embedding space is visually demonstrated when you consider how different items might be plotted. For example, terms like _cat_, _dog_, and _lion_, which all represent types of animals, are grouped close together in this space due to their shared semantic characteristics. Similarly, terms like _car_, _truck_, and the more generic term _vehicle_ would form another cluster. This is shown in the following image:

Semantically similar concepts, like _cat_, _dog_, and _lion_, or _car_, _truck_, and _vehicle_, are close together in the embedding space.

You can see that the animal and vehicle clusters are positioned far apart from each other. The separation between the groups illustrates the principle that the closer objects are in the embedding space, the more semantically similar they are, and greater distances indicate greater semantic dissimilarity.
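To make the distance idea concrete, the following toy query computes cosine distance between hand-made three-dimensional "embeddings" by using BigQuery's built-in COSINE_DISTANCE function. The vectors and term names are illustrative only, not real model output:

```sql
-- Toy 3-dimensional "embeddings" (hand-picked values, not model output).
WITH vectors AS (
  SELECT 'cat' AS term, [0.90, 0.80, 0.10] AS v
  UNION ALL SELECT 'dog', [0.85, 0.75, 0.15]
  UNION ALL SELECT 'car', [0.10, 0.20, 0.90]
)
SELECT
  a.term AS term_a,
  b.term AS term_b,
  -- Smaller distance means more semantically similar.
  ROUND(COSINE_DISTANCE(a.v, b.v), 3) AS cosine_distance
FROM vectors AS a
JOIN vectors AS b
  ON a.term < b.term
ORDER BY cosine_distance;
```

With these toy values, _cat_ and _dog_ have a much smaller distance to each other than either has to _car_, mirroring the clusters in the image above.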

Use cases

The combination of embedding generation and vector search enables many interesting use cases. Some possible use cases are as follows:

  • Retrieval-augmented generation (RAG): Parse documents, perform vector search on content, and generate summarized answers to natural language questions using Gemini models, all within BigQuery. For a notebook that illustrates this scenario, see Build a Vector Search application using BigQuery DataFrames.
  • Recommending product substitutes or matching products: Enhance ecommerce applications by suggesting product alternatives based on customer behavior and product similarity.
  • Log analytics: Help teams proactively triage anomalies in logs and accelerate investigations. You can also use this capability to enrich context for LLMs, in order to improve threat detection, forensics, and troubleshooting workflows. For a notebook that illustrates this scenario, see Log Anomaly Detection & Investigation with Text Embeddings + BigQuery Vector Search.
  • Clustering and targeting: Segment audiences with precision. For example, a hospital chain could cluster patients using natural language notes and structured data, or a marketer could target ads based on query intent. For a notebook that illustrates this scenario, see Create-Campaign-Customer-Segmentation.
  • Entity resolution and deduplication: Cleanse and consolidate data. For example, an advertising company could deduplicate personally identifiable information (PII) records, or a real estate company could identify matching mailing addresses.

Generate embeddings

The following sections describe the functions that BigQuery offers to help you generate or work with embeddings.

Generate single embeddings

Preview

This product or feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA products and features are available "as is" and might have limited support. For more information, see the launch stage descriptions.

Note: To provide feedback or request support for this feature during the preview, contact bqml-feedback@google.com.

You can use the AI.EMBED function with Vertex AI embedding models to generate a single embedding of your input.
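As a minimal sketch, a call might look like the following. The connection ID and endpoint name are placeholders, and the named-argument shape is an assumption; check the AI.EMBED function reference for the exact signature:

```sql
-- Hedged sketch: embed a single string inline in a query.
-- 'us.example_connection' and 'text-embedding-005' are placeholder values,
-- and the argument names are assumptions; see the AI.EMBED reference.
SELECT
  AI.EMBED(
    'The quick brown fox jumps over the lazy dog',
    connection_id => 'us.example_connection',
    endpoint => 'text-embedding-005') AS embedding;
```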

The AI.EMBED function supports the following types of input:

Generate a table of embeddings

You can use the AI.GENERATE_EMBEDDING function to create a table that has embeddings for all of the data in a column of your input table. For all types of supported models, AI.GENERATE_EMBEDDING works with structured data in standard tables. For multimodal embedding models, AI.GENERATE_EMBEDDING also works with visual content from either standard table columns that contain ObjectRef values, or from object tables.

For remote models, all inference occurs in Vertex AI. For other model types, all inference occurs in BigQuery. The results are stored in BigQuery.
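A batch run might look like the following sketch, which materializes an embedding for each row of a text column. The table, model, and column names are hypothetical, and the argument shape is modeled on the older ML.GENERATE_EMBEDDING function; verify against the AI.GENERATE_EMBEDDING reference before running:

```sql
-- Hedged sketch: generate an embedding for every row of a hypothetical
-- mydataset.articles table and store the results in a new table.
CREATE OR REPLACE TABLE mydataset.article_embeddings AS
SELECT *
FROM AI.GENERATE_EMBEDDING(
  MODEL `mydataset.embedding_model`,    -- remote model over a Vertex AI endpoint
  (SELECT article_id, body AS content   -- input text aliased to `content`
   FROM `mydataset.articles`));
```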

Use the following topics to try embedding generation in BigQuery ML:

Autonomous embedding generation

Preview

This feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA features are available "as is" and might have limited support. For more information, see the launch stage descriptions.

Note: To give feedback or request support for this feature, contact bq-vector-search@google.com.

You can use autonomous embedding generation to simplify the process of creating, maintaining, and querying embeddings. BigQuery maintains a column of embeddings on your table based on a source column. When you add or modify data in the source column, BigQuery automatically generates or updates the embedding column for that data by using a Vertex AI embedding model. This is helpful if you want to let BigQuery maintain your embeddings when your source data is updated regularly.

Search

The following search functions are available:

Optionally, you can create a vector index by using the CREATE VECTOR INDEX statement. When a vector index is used, the VECTOR_SEARCH and AI.SEARCH functions use the Approximate Nearest Neighbor search technique to improve vector search performance, with the trade-off of reduced recall, which means the results are more approximate. Without a vector index, these functions use brute force search to measure distance for every record. You can also choose to use brute force search to get exact results even when a vector index is available.
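Put together, indexing and searching might look like the following sketch. All table, column, and index names are hypothetical, and the IVF and COSINE options are one reasonable choice rather than the only one:

```sql
-- Hedged sketch: build an IVF index over a hypothetical embedding column,
-- then run an approximate nearest-neighbor search against it.
CREATE OR REPLACE VECTOR INDEX article_index
ON mydataset.article_embeddings(embedding)
OPTIONS (index_type = 'IVF', distance_type = 'COSINE');

-- Find the 5 nearest stored embeddings for each query embedding.
SELECT
  query.article_id AS query_article,
  base.article_id  AS matched_article,
  distance
FROM VECTOR_SEARCH(
  TABLE `mydataset.article_embeddings`, 'embedding',
  (SELECT article_id, embedding FROM `mydataset.query_embeddings`),
  top_k => 5,
  distance_type => 'COSINE');
```

VECTOR_SEARCH also accepts an options argument that can force exact, brute-force results even when an index exists; see the function reference for the available settings.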

Pricing

The VECTOR_SEARCH and AI.SEARCH functions and the CREATE VECTOR INDEX statement use BigQuery compute pricing.

  • VECTOR_SEARCH and AI.SEARCH functions: You are charged for similarity search, using on-demand or editions pricing.

    • On-demand: You are charged for the number of bytes scanned in the base table, the index, and the search query.
    • Editions pricing: You are charged for the slots required to complete the job within your reservation edition. Larger, more complex similarity calculations incur more charges.

      Note: Using an index isn't supported in Standard editions.
  • CREATE VECTOR INDEX statement: There is no charge for the processing required to build and refresh your vector indexes as long as the total size of the indexed table data is below your per-organization limit. To support indexing beyond this limit, you must provide your own reservation for handling the index management jobs.

Storage is also a consideration for embeddings and indexes. The bytes stored as embeddings and indexes are subject to active storage costs.

  • Vector indexes incur storage costs when they are active.
  • You can find the index storage size and coverage by using the INFORMATION_SCHEMA.VECTOR_INDEXES view. If the vector index is not yet at 100% coverage, you are still charged for whatever has been indexed.
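A coverage check might look like the following sketch. The project and region qualifiers are placeholders, and the exact column names should be confirmed against the INFORMATION_SCHEMA.VECTOR_INDEXES reference:

```sql
-- Hedged sketch: inspect index coverage and storage for one table.
SELECT
  index_name,
  coverage_percentage,   -- assumed column name; confirm in the view's reference
  total_storage_bytes    -- assumed column name; confirm in the view's reference
FROM `myproject.region-us.INFORMATION_SCHEMA.VECTOR_INDEXES`
WHERE table_name = 'article_embeddings';
```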

Quotas and limits

For more information, see Vector index limits and generative AI function limits.

Limitations

Queries that contain the VECTOR_SEARCH or AI.SEARCH function aren't accelerated by BigQuery BI Engine.

What's next

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-18 UTC.