Vector Search

Vector Search is a powerful vector search engine built on groundbreaking technology developed by Google Research. Leveraging the ScaNN algorithm, Vector Search lets you build next-generation search and recommendation systems as well as generative AI applications.

You can benefit from the very same research and technology that power core Google products, including Google Search, YouTube, and Google Play. This means you get the scalability, availability, and performance that's trusted to handle massive datasets and deliver lightning-fast results at a global scale. With Vector Search, you have an enterprise-grade solution for implementing cutting-edge semantic search capabilities in your own applications.

Note: $1,000 in free trial credits are available for new users for use with Vector Search. These credits are available for a one-year period beginning from new account signup with Vertex AI API starting from October 1, 2024.

Vector Search live demo

Blog: Multimodal search with Vector Search

Next 24 Infinite Nature Demo

Infinite Fleurs: Discover AI-assisted creativity in full bloom

Experience multimodal AI with manga ONE PIECE

Get Started

Vector Search interactive demo: Check out the live demo for a realistic example of what vector search technology can do and get a head start with Vector Search.

Vector Search quickstart: Try Vector Search in 30 minutes by building, deploying, and querying a Vector Search index using a sample dataset. This tutorial covers setup, data preparation, index creation, deployment, querying, and cleanup.

Before you begin: Prepare your embeddings by choosing and training a model, and preparing your data. Then, choose a public or private endpoint to deploy your query index to.

Vector Search pricing and pricing calculator: Vector Search pricing includes the cost of virtual machines used to host deployed indexes, as well as expenses for building and updating indexes. Even a minimal setup (under $100 per month) can accommodate high throughput for moderate-sized use cases. To estimate your monthly costs:

  1. Go to Google Cloud's pricing calculator.
  2. Click Add to estimate.
  3. Search for Vertex AI.
  4. Click the Vertex AI button.
  5. Choose Vertex AI Vector Search from the Service type drop-down.
  6. Keep the default settings or configure your own. The estimated cost per month is shown in the Cost details panel.

Note: Dataset size and restricts (filtering) affect shard count. A high number of restricts increases memory usage and causes more shards to be created in order to distribute memory load.
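
The data preparation step mentioned under "Before you begin" can be sketched as follows. This minimal example writes embeddings, each with an optional restrict, to a JSON Lines file. The field names (`id`, `embedding`, `restricts`, `namespace`, `allow`) reflect the Vector Search input format as we understand it and should be checked against the current input data reference. The sample records, the `category` namespace, the helper names, and the file name are hypothetical.

```python
import json
import os
import tempfile


def to_record(item_id, embedding, restricts=None):
    """Build one index record.

    `restricts` is an optional dict mapping a namespace to the tokens
    allowed for this item (a hypothetical convenience shape).
    """
    record = {"id": str(item_id), "embedding": [float(x) for x in embedding]}
    if restricts:
        record["restricts"] = [
            {"namespace": namespace, "allow": list(tokens)}
            for namespace, tokens in sorted(restricts.items())
        ]
    return record


def write_jsonl(records, path):
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


# Hypothetical toy data: (id, embedding, restricts).
items = [
    ("1", [0.1, 0.2, 0.3], {"category": ["shoes"]}),
    ("2", [0.4, 0.5, 0.6], {"category": ["hats"]}),
    ("3", [0.7, 0.8, 0.9], None),
]

records = [to_record(i, e, r) for i, e, r in items]
out_path = os.path.join(tempfile.mkdtemp(), "embeddings.json")
write_jsonl(records, out_path)
```

In practice the embeddings would come from a trained model rather than literal lists, and the resulting file would be uploaded to a Cloud Storage location that the index build reads from.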

Documentation

Use cases and blogs

Vector search technology is becoming a central hub for businesses using AI. Similar to how relational databases function in IT systems, it connects various business elements like documents, content, products, users, events, and other entities based on their relevance. Beyond searching conventional media like documents and images, Vector Search can also power intelligent recommendations, match business problems with solutions, and even link IoT signals to monitoring alerts. It's a versatile tool that's essential for navigating the growing landscape of AI-enabled enterprise data.

Search and information retrieval

Recommendation systems

How Vertex AI vector search helps unlock high-performance gen AI apps: Vector Search powers diverse applications, including ecommerce, RAG systems, and recommendation engines, alongside chatbots, multimodal search, and more. Hybrid search further enhances results for niche terms. Customers like Bloomreach, eBay, and Mercado Libre use Vertex AI for its performance, scalability, and cost-effectiveness, achieving benefits like faster search and increased conversions.

eBay uses Vector Search for recommendations: Highlights how eBay uses Vector Search for its recommendation system. This technology allows eBay to find similar products within its extensive catalog, improving the user experience.

Mercari leverages Google's vector search technology to create a new marketplace: Explains how Mercari uses Vector Search to improve its new marketplace platform. Vector Search powers the platform's recommendations, helping users find relevant products more effectively.

Vertex AI Embeddings for Text: Grounding LLMs made easy: Focuses on grounding LLMs using Vertex AI Embeddings for text data. Vector Search plays an important role in finding relevant text passages that ensure the model's responses are grounded in factual information.

What is Multimodal Search: "LLMs with vision" change businesses: Discusses Multimodal Search, which combines LLMs with visual understanding. It explains how Vector Search processes and compares both text and image data, allowing for more comprehensive search experiences.

Unlock multimodal search at scale: Combine text & image power with Vertex AI: Describes building a multimodal search engine with Vertex AI that combines text and image search using a weighted Rank-Biased Reciprocal Rank ensemble method. This improves user experience and provides more relevant results.

Scaling deep retrieval with TensorFlow Recommenders and Vector Search: Explains how to build a playlist recommendation system using TensorFlow Recommenders and Vector Search, covering deep retrieval models, training, deployment, and scaling.
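
As a rough illustration of how ranked results from separate text and image searches can be combined, the sketch below uses plain weighted reciprocal rank fusion. This is a standard ensemble technique chosen for illustration and is not necessarily the exact method used in the blog post above. The function name, the constant `k=60`, the weights, and the result IDs are all assumptions.

```python
def weighted_rrf(rankings, weights, k=60):
    """Fuse several ranked ID lists with weighted reciprocal rank fusion.

    rankings: dict mapping a source name to a list of IDs, best first.
    weights:  dict mapping a source name to its weight (default 1.0).
    k:        smoothing constant; 60 is a commonly used default.
    """
    scores = {}
    for source, ranked_ids in rankings.items():
        weight = weights.get(source, 1.0)
        for rank, item_id in enumerate(ranked_ids, start=1):
            scores[item_id] = scores.get(item_id, 0.0) + weight / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


# Hypothetical results from a text query and an image query.
rankings = {
    "text": ["a", "b", "c"],
    "image": ["c", "a", "d"],
}
weights = {"text": 0.6, "image": 0.4}

fused = weighted_rrf(rankings, weights)
print(fused)  # ['a', 'c', 'b', 'd']
```

Items that rank well in both lists ("a" and "c") rise above items that appear in only one, and the weights control how much each modality contributes.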

Gen AI: retrieval for RAG and Agents

Vertex AI and Denodo unlock enterprise data with Gen AI: Showcases how Vertex AI's integration with Denodo enables businesses to use generative AI for gaining insights from their data. Vector Search is key for efficiently accessing and analyzing relevant data within an enterprise environment.

Infinite Nature and the nature of industries: This 'wild' demo shows the diverse possibilities of AI: Showcases a demo that illustrates AI's potential across different industries. It utilizes Vector Search to power generative recommendations and multimodal semantic search.

Infinite Fleurs: Discover AI-assisted creativity in full bloom: Google's Infinite Fleurs, an AI experiment using Vector Search, Gemini and Imagen models, generates unique flower bouquets based on user prompts. This technology showcases AI's potential to inspire creativity across various industries.

LlamaIndex for RAG on Google Cloud: Describes how to use LlamaIndex to facilitate Retrieval Augmented Generation (RAG) with large language models. LlamaIndex utilizes Vector Search to retrieve relevant information from a knowledge base, resulting in more accurate and contextually appropriate responses.

RAG and grounding on Vertex AI: Examines RAG and grounding techniques on Vertex AI. Vector Search helps identify relevant grounding information during retrieval, which makes generated content more accurate and reliable.

Vector Search on LangChain: Provides a guide to using Vector Search with LangChain for building and deploying a vector database index for text data, including question-answering and PDF processing.

BI, data analytics, monitoring, and more

Enabling real-time AI with Streaming Ingestion in Vertex AI: Explores Streaming Update in Vector Search and how it provides real-time AI capabilities. This technology allows for real-time processing and analysis of incoming data streams.

Related resources

You can use the following resources to get started with Vector Search:

Notebooks and solutions

Vertex AI Vector Search Quickstart: Provides an overview of Vector Search. It is designed for users who are new to the platform and want to get started quickly.

Getting Started with Text Embeddings and Vector Search: Introduces text embeddings and vector search. It explains how these technologies work and how they can be used to improve search results.

Combining Semantic & Keyword Search: A Hybrid Search Tutorial with Vertex AI Vector Search: Provides instructions on how to use Vector Search for hybrid search. It covers the steps involved in setting up and configuring a hybrid search system.

Vertex AI RAG Engine with Vector Search: Explores the use of Vertex AI RAG Engine with Vector Search. It discusses the benefits of using these two technologies together and provides examples of how they can be used in real-world applications.

Infrastructure for a RAG-capable generative AI application using Vertex AI and Vector Search: Details the architecture for building a generative AI application and RAG using Vector Search, Cloud Run and Cloud Storage, covering use cases, design choices, and key considerations.

Implement two-tower retrieval for large-scale candidate generation: Provides a reference architecture that shows you how to implement an end-to-end two-tower candidate generation workflow with Vertex AI. The two-tower modeling framework is a powerful retrieval technique for personalization use cases because it learns the semantic similarity between two different entities, such as web queries and candidate items.

Training

Getting Started with Vector Search and Embeddings: Vector Search is used to find similar or related items. It can be used for recommendations, search, chatbots, and text classification. The process involves creating embeddings, uploading them to Google Cloud, and indexing them for querying. This lab focuses on text embeddings using Vertex AI, but embeddings can be generated for other data types.

Vector Search and Embeddings: This course introduces Vector Search and describes how it can be used to build a search application with large language model (LLM) APIs for embeddings. The course consists of conceptual lessons on Vector Search and text embeddings, practical demos on how to build Vector Search on Vertex AI, and a practice lab.

Understanding and Applying Text Embeddings: The Vertex AI Embeddings API generates text embeddings, which are numerical representations of text used for tasks like identifying similar items. In this course, you'll use text embeddings for tasks like classification and semantic search, and combine semantic search with LLMs to build question-answering systems using Vertex AI.

Machine Learning Crash Course: Embeddings: This course introduces word embeddings, contrasting them with sparse representations. It explores methods for obtaining embeddings and differentiates between static and contextual embeddings.

Related products

Vertex AI Embeddings: Provides an overview of the Embeddings API, covering text and multimodal embedding use cases, along with links to additional resources and related Google Cloud services.

Vertex AI Search ranking API: The ranking API reranks documents based on relevance to a query using a pre-trained language model, providing precise scores. It's ideal for improving search results from various sources including Vector Search.

Vertex AI Feature Store: Lets you manage and serve feature data using BigQuery as the data source. It provisions resources for online serving, acting as a metadata layer to serve the latest feature values directly from BigQuery. Feature Store allows for the instant retrieval of feature values for the items that Vector Search returns for queries.

Vertex AI Pipelines: Vertex AI Pipelines enables the automation, monitoring, and governance of your ML systems in a serverless manner by orchestrating ML workflows with ML pipelines. You can run ML pipelines defined using Kubeflow Pipelines or the TensorFlow Extended (TFX) framework in batches. Pipelines allows for building automated pipelines to generate embeddings, create and update Vector Search indexes, and form an MLOps setup for production search and recommendation systems.

Deep dive resources

Enhancing your gen AI use case with Vertex AI embeddings and task types: Focuses on improving Generative AI applications using Vertex AI Embeddings and task types. Vector Search can be used with task type embeddings to enhance the context and accuracy of generated content by finding more relevant information.

TensorFlow Recommenders: An open-source library for building recommendation systems. It simplifies the process from data preparation to deployment and supports flexible model building. TFRS offers tutorials and resources and enables the creation of sophisticated recommendation models.

TensorFlow Ranking: TensorFlow Ranking is an open-source library for building scalable neural learning-to-rank (LTR) models. It supports various loss functions and ranking metrics, with applications in search, recommendation, and other fields. The library is actively developed by Google AI.

Announcing ScaNN: Efficient Vector Similarity Search: Google's ScaNN, an algorithm for efficient vector similarity search, utilizes a novel technique to improve accuracy and speed in finding nearest neighbors. It outperforms existing methods and has broad applications in machine learning tasks requiring semantic search. Google's research efforts span various areas, including foundational ML and societal impacts of AI.

SOAR: New algorithms for even faster Vector Search with ScaNN: Google's SOAR algorithm improves Vector Search efficiency by introducing controlled redundancy, allowing faster searches with smaller indexes. SOAR assigns vectors to multiple clusters, creating "backup" search paths for improved performance.
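
The idea of assigning a vector to more than one cluster can be illustrated with a much simplified sketch. This is not the SOAR algorithm itself, which selects the extra assignment in a more principled way. The example below assigns each vector to its two nearest centroids and shows how that redundancy rescues a neighbor that sits near a cluster boundary. The centroids, vectors, query, and function names are all hypothetical.

```python
import math


def nearest_centroids(vector, centroids, n):
    """Return the indices of the n centroids closest to `vector`."""
    order = sorted(
        range(len(centroids)), key=lambda i: math.dist(vector, centroids[i])
    )
    return order[:n]


def build_partitions(vectors, centroids, assignments_per_vector):
    """Place each vector in the partitions of its nearest centroids."""
    partitions = {i: [] for i in range(len(centroids))}
    for name, vec in vectors.items():
        for c in nearest_centroids(vec, centroids, assignments_per_vector):
            partitions[c].append(name)
    return partitions


def search_one_partition(query, vectors, centroids, partitions):
    """Probe only the partition whose centroid is closest to the query."""
    probe = nearest_centroids(query, centroids, 1)[0]
    return min(partitions[probe], key=lambda name: math.dist(query, vectors[name]))


centroids = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
vectors = {
    "near0": (1.0, 0.0),
    "boundary": (4.9, 0.0),  # slightly closer to centroid 0 than to centroid 1
    "near1": (9.0, 0.0),
    "near2": (0.0, 9.0),
}
query = (5.2, 0.0)  # slightly closer to centroid 1; true neighbor is "boundary"

single = build_partitions(vectors, centroids, assignments_per_vector=1)
spilled = build_partitions(vectors, centroids, assignments_per_vector=2)

print(search_one_partition(query, vectors, centroids, single))   # near1
print(search_one_partition(query, vectors, centroids, spilled))  # boundary
```

With a single assignment, probing one partition misses the true nearest neighbor because it lives on the other side of the boundary. With two assignments, the same single probe finds it, at the cost of storing each vector twice.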

Related videos


Get Started with Vector Search using Vertex AI

Vector Search is a powerful tool for building AI-powered applications. This video introduces the technology and provides a step-by-step guide to getting started.



Learn Hybrid Search with Vector Search

Vector Search can be used for hybrid search, allowing you to combine the power of vector search with the flexibility and speed of a conventional search engine. This video introduces hybrid search and shows you how to use Vector Search for hybrid search.



You're Already Using Vector Search! Here's How to Be an Expert

Did you know you're probably using vector search every day without realizing it? From finding that elusive product on social media to tracking down a song stuck in your head, vector search is the AI magic behind these everyday experiences.



New "task type" embedding from the DeepMind team improves RAG search quality

Improve the accuracy and relevance of your RAG systems with new task type embeddings developed by the Google DeepMind team. Watch along and learn about the common challenges in RAG search quality and how task type embeddings can effectively bridge the semantic gap between questions and answers, leading to more effective retrieval and enhanced RAG performance.

Vector Search terminology

This list contains some important terminology that you'll need to understand to use Vector Search:

  • Vector: A vector is a list of float values that has magnitude and direction. It can be used to represent any kind of data, such as numbers, points in space, and directions.

  • Embedding: An embedding is a type of vector that's used to represent data in a way that captures its semantic meaning. Embeddings are typically created using machine learning techniques, and they are often used in natural language processing (NLP) and other machine learning applications.

    • Dense embeddings: Dense embeddings represent the semantic meaning of text, using arrays that mostly contain non-zero values. With dense embeddings, similar search results can be returned based on semantic similarity.

    • Sparse embeddings: Sparse embeddings represent text syntax, using high-dimensional arrays that contain very few non-zero values compared to dense embeddings. Sparse embeddings are often used for keyword searches.

  • Hybrid search: Hybrid search uses both dense and sparse embeddings, which lets you search based on a combination of keyword search and semantic search. Vector Search supports search based on dense embeddings, sparse embeddings, and hybrid search.

  • Index: A collection of vectors deployed together for similarity search. Vectors can be added to or removed from an index. Similarity search queries are issued to a specific index and search the vectors in that index.

  • Ground truth: A term that refers to verifying machine learning for accuracy against the real world, like a ground truth dataset.

  • Recall: The percentage of nearest neighbors returned by the index that are actually true nearest neighbors. For example, if a nearest neighbor query for 20 nearest neighbors returned 19 of the ground truth nearest neighbors, the recall is 19/20 × 100 = 95%.

  • Restrict: Feature that limits searches to a subset of the index by using Boolean rules. Restrict is also referred to as "filtering". With Vector Search, you can use numeric filtering and text attribute filtering.
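
Several of the terms above (index, ground truth, recall, and restrict) can be tied together in a small sketch. It uses brute-force cosine similarity over a toy in-memory index, which is only a stand-in for illustration and not how Vector Search works internally. The `tag` field is a simplified, hypothetical substitute for restricts, and all names and data are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exact_neighbors(query, index, k, allow=None):
    """Brute-force top-k search, usable as ground truth for small data.

    `allow` optionally restricts the search to entries whose tag is in
    the given set (a simplified stand-in for a restrict).
    """
    candidates = [
        (item_id, entry["embedding"])
        for item_id, entry in index.items()
        if allow is None or entry["tag"] in allow
    ]
    ranked = sorted(
        candidates, key=lambda p: cosine_similarity(query, p[1]), reverse=True
    )
    return [item_id for item_id, _ in ranked[:k]]


def recall(returned_ids, ground_truth_ids):
    """Percentage of ground truth neighbors present in the returned IDs."""
    truth = set(ground_truth_ids)
    hits = sum(1 for item_id in returned_ids if item_id in truth)
    return 100.0 * hits / len(truth)


# Hypothetical toy index with one tag per item.
index = {
    "a": {"embedding": [1.0, 0.0], "tag": "red"},
    "b": {"embedding": [0.9, 0.1], "tag": "blue"},
    "c": {"embedding": [0.0, 1.0], "tag": "red"},
    "d": {"embedding": [0.7, 0.7], "tag": "blue"},
}
query = [1.0, 0.1]

print(exact_neighbors(query, index, k=2))                 # ['b', 'a']
print(exact_neighbors(query, index, k=2, allow={"red"}))  # ['a', 'c']

# The example from the Recall entry: 19 of 20 true neighbors returned.
returned = list(range(19)) + [99]
print(recall(returned, range(20)))  # 95.0
```

An approximate index trades some recall for speed, so comparing its results against a brute-force ground truth like this on a sample of queries is a common way to measure that trade-off.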

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-19 UTC.