Work with vector embeddings (Preview)

Preview — Cloud SQL for MySQL vector search and storage

This feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. You can process personal data for this feature as outlined in the Cloud Data Processing Addendum, subject to the obligations and restrictions described in the agreement under which you access Google Cloud. Pre-GA features are available "as is" and might have limited support. For more information, see the launch stage descriptions.

This page provides information about the Preview version of vector embeddings. If you want to use the GA version of this feature, the instance maintenance version must be `MYSQL_8_0_36.R20241208.01_00` or later. For information about how to upgrade your instance to a newer version that supports GA vector embeddings, see Upgrade your instance.

For information about the syntax and behavior of vector embeddings for the GA version of this feature, see Vector search.

This page details how you can interact with Cloud SQL to build applications that use vector embeddings.

Cloud SQL for MySQL supports the storage of vector embeddings. You can then create vector search indexes and perform similarity searches on these vector embeddings along with the rest of the data that you store in Cloud SQL.

Vector embedding storage

You can use Cloud SQL for MySQL to store vector embeddings by creating a vector embedding column in a table. The special vector embedding column maps to the VARBINARY data type. Like other relational data in the table, you can access vector embeddings in the table with existing transactional guarantees. A table that has a vector embedding column is a regular InnoDB table and is therefore compliant with atomicity, consistency, isolation, and durability (ACID) properties. ACID properties deviate only for vector search index lookups.

Note: To distinguish the vector embedding column from other columns, Cloud SQL for MySQL uses a special COMMENT and CONSTRAINT. The constraint is required for input validation, and the vector embedding column annotation is visible as a COMMENT. You can't modify or delete the comment or the constraint.

Consider the following when setting up a table for vector embeddings:

You can create a maximum of one vector embedding column in a table and one vector search index per table. Each vector embedding stored in the same column must have exactly the same dimensions that you specified when you defined the column. A vector embedding has an upper limit of 16,000 dimensions. If you have enough storage and memory available, then you can have separate tables with different vector embedding columns and vector search indexes on the same instance.

While there's no hard limit to the number of vector embeddings that you can store in a table, vector search indexes require memory. For this reason, we recommend that you store no more than 10 million vector embeddings in a table.

Also see the list of Limitations.

Replication works the same way for the vector embedding column as it does for other MySQL InnoDB columns.

Similarity search

Cloud SQL supports similarity search using both K-nearest neighbor (KNN) and approximate nearest neighbor (ANN) search queries. You can use both types of vector searches in your Cloud SQL instances. You can create a vector search index for ANN searches only.

Important: The SQL syntax provided for performing operations on vector search indexes is subject to change prior to the GA release of this feature based on Preview feedback.

K-nearest neighbor (KNN) search

Cloud SQL supports querying using KNN vector search, also referred to as exact nearest neighbor search. Performing a KNN vector search provides perfect recall. You can perform KNN searches without having to create a vector search index. KNN search works by scanning the table.

For KNN search, Cloud SQL also supports the following vector distance search functions:

Cosine
Dot product
L2 squared distance

For more information about using vector search distance functions, see Query the distance of a vector embedding.
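To make the three measures concrete, the following sketch computes them in plain Python for two small vectors. It assumes the usual textbook definitions (cosine distance as 1 minus cosine similarity, L2 squared as the sum of squared differences). This page doesn't state the exact sign or normalization conventions that Cloud SQL applies, so treat the Python function names and results as illustrative rather than as a reference for the SQL functions.

```python
import math


def dot_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return sum(x * y for x, y in zip(a, b))


def l2_squared_distance(a, b):
    """Sum of squared element-wise differences (no square root)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine_distance(a, b):
    """Assumed definition: 1 - cosine similarity."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot_product(a, b) / (norm_a * norm_b)


stored = [1.0, 2.0, 3.0]
query = [3.0, 1.0, 2.0]
print(dot_product(stored, query))          # 11.0
print(l2_squared_distance(stored, query))  # 6.0
print(cosine_distance(stored, query))      # roughly 0.214
```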

Approximate nearest neighbor (ANN) search

Cloud SQL supports creating and querying ANN searches through the creation of vector search indexes. An ANN vector search index lets you optimize for fast performance instead of perfect recall. For ANN search, Cloud SQL supports the following index types:

BRUTE_FORCE: the default vector search index type for a base table that has fewer than 10,000 rows. This type is best suited for searches within a smaller subset of an original dataset. The memory used by the index is equal to the size of the dataset. This index type isn't persisted to disk.
TREE_SQ: the default vector search index type for a base table that has 10,000 or more rows. This type uses the least amount of memory, or approximately 25% of the size of the dataset. TREE_SQ indexes are persisted to disk.
TREE_AH: a vector search index type that provides an asymmetric hashing search type algorithm. As implemented in Cloud SQL, this index type isn't optimized for memory footprint and isn't persisted.

Update vector search indexes

Cloud SQL for MySQL updates vector search indexes in real time. Any transaction that performs Data Manipulation Language (DML) operations on the base table also propagates changes to the associated vector search indexes. The changes in a vector search index are visible immediately to all other transactions, which means an isolation level of READ_UNCOMMITTED.

If you roll back a transaction, then the corresponding rollback changes also occur in the vector search index.

Replication of vector search indexes

Cloud SQL for MySQL replicates vector search indexes to all read replicas. Replication filters and the replication of vector search indexes to cascading replicas aren't supported.

Configure an instance to support vector embeddings

This section describes how to configure your Cloud SQL instance to support the storage, indexing, and querying of vector embeddings.

Both Cloud SQL Enterprise edition and Cloud SQL Enterprise Plus edition instances support vector embeddings.

Before you begin

Your instance must be running Cloud SQL for MySQL version 8.0.36.R20240401.03_00 or later.
Your instance must have sufficient disk space to allocate memory for the total number of vector embeddings on the instance.

Enable vector embeddings

To turn on support for vector embeddings, you must configure MySQL database flags.

gcloud sql instances patch INSTANCE_NAME \
  --database-flags=FLAGS

Replace INSTANCE_NAME with the name of the instance on which you want to enable vector embedding support.

In FLAGS, configure the following MySQL flags on your instance:

cloudsql_vector: set this flag to on to enable vector embedding storage and search support. You can create new vector embedding columns and vector search indexes on the instance.
cloudsql_vector_max_mem_size: optional. Specify the maximum memory allocation in bytes for all vector search indexes on the instance. If you don't specify this flag, then the default memory allocation is 1 GB, which is the minimum memory allocation. For more information about how to calculate the amount to specify, see Configure the memory allocation for vector search indexes.
This dedicated memory comes from the memory allocated to your innodb_buffer_pool_size. Your available buffer pool is reduced by the same amount. The maximum allowed value for this flag is 50% of your total innodb_buffer_pool_size.
If you specify a value that's greater than 50% of your total innodb_buffer_pool_size, then Cloud SQL reduces the effective value to 50% of the available size and logs a warning message for the instance.

After you configure the flags, your command might look similar to the following:

gcloud sql instances patch my-instance \
  --database-flags=cloudsql_vector=on,cloudsql_vector_max_mem_size=4294967296
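The sizing rules for cloudsql_vector_max_mem_size (a 1 GB minimum and a ceiling of 50% of innodb_buffer_pool_size) can be checked before you run the command. The following is a minimal sketch, not part of any Cloud SQL tooling: the helper names are hypothetical, it assumes that "1 GB" means 2**30 bytes (consistent with the 4294967296 value in the example command), and it rejects cases that this page doesn't describe rather than guessing at the behavior.

```python
GIB = 1024 ** 3  # assumption: "1 GB" on this page means 2**30 bytes
MIN_VECTOR_MEM_BYTES = 1 * GIB  # default and minimum, per the flag description


def effective_vector_mem(requested_bytes, buffer_pool_bytes):
    """Return the value that would take effect under the rules described above.

    Hypothetical helper: values above 50% of the buffer pool are reduced to
    that ceiling; situations the page doesn't cover raise ValueError.
    """
    ceiling = buffer_pool_bytes // 2
    if ceiling < MIN_VECTOR_MEM_BYTES:
        raise ValueError("buffer pool too small for the 1 GB minimum; behavior not described")
    if requested_bytes < MIN_VECTOR_MEM_BYTES:
        raise ValueError("requested value is below the 1 GB minimum")
    return min(requested_bytes, ceiling)


def patch_command(instance_name, requested_bytes):
    """Build the gcloud command line shown in the example above."""
    flags = f"cloudsql_vector=on,cloudsql_vector_max_mem_size={requested_bytes}"
    return f"gcloud sql instances patch {instance_name} --database-flags={flags}"


# A 12 GiB request against a 16 GiB buffer pool is reduced to 8 GiB.
print(effective_vector_mem(12 * GIB, 16 * GIB) // GIB)  # 8
print(patch_command("my-instance", 4 * GIB))
```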

The flags to configure vector embeddings support in Cloud SQL for MySQL are static flags. After you update the instance with the flags, your instance restarts automatically in order for the configuration changes to take effect.

For more information about how to configure database flags for MySQL, see Configure database flags.

Disable vector embeddings

To disable vector embeddings, set the cloudsql_vector flag to off.

For example:

gcloud sql instances patch INSTANCE_NAME \
  --database-flags=cloudsql_vector=off

Replace INSTANCE_NAME with the name of the instance on which you're turning off vector embedding support.

Setting cloudsql_vector to off prevents you from creating new vector embedding columns and vector search indexes. After you configure this static flag, the instance restarts automatically for the configuration change to take effect.

After the restart of the instance, Cloud SQL for MySQL does the following:

Removes all persisted TREE_SQ vector search indexes from the persistent disk.
Keeps the data dictionary table entries for the vector search indexes that have been built. However, Cloud SQL for MySQL doesn't rebuild the indexes and any search queries to these indexes return an error.
Continues to store the vector embeddings in the base tables. The vector embeddings remain accessible.

If you later re-enable the cloudsql_vector flag for the instance, then Cloud SQL attempts to rebuild the indexes while the instance restarts based on the entries in the data dictionary table.

Read replica configuration

If the instance meets the maintenance version and flag enablement criteria, then Cloud SQL fully supports vector embeddings on a read replica.

If you create a replica from a primary instance that has vector embedding support enabled, then the read replica inherits the vector embedding support settings from the primary instance. You must enable vector embedding support individually on already existing read replica instances.

In terms of impact on replication lag, creating and maintaining vector search indexes works the same way as it does for regular MySQL indexes.

Vector search indexes aren't supported on cascading replicas.

Example: An ANN vector search index and query

The following example walkthrough provides steps to create an ANN-based vector search index and query in Cloud SQL.

Generate vector embeddings. You can create vector embeddings manually or use a text embedding API of your choice. For an example that uses Vertex AI, see Generate vector embeddings based on row data.

Create a table in Cloud SQL for MySQL that contains a vector embedding column with three dimensions.

CREATE TABLE books (
  id INTEGER PRIMARY KEY AUTO_INCREMENT,
  title VARCHAR(60),
  embedding VECTOR(3) USING VARBINARY
);

Insert a vector embedding into the column.

INSERT INTO books VALUES (1, 'book title', string_to_vector('[1,2,3]'));

Commit the changes.
```
commit;
```

Create the vector search index. If you're creating a TREE_SQ or a TREE_AH index, then your table must have at least 1,000 rows.

CALL mysql.create_vector_index('vectorIndex', 'dbname.books', 'embedding', 'index_type=BRUTE_FORCE, distance_measure=L2_SQUARED');

Get the nearest neighbors.

SELECT title FROM books WHERE NEAREST(embedding) TO (string_to_vector('[1,2,3]'));

Generate vector embeddings based on row data

You can generate a vector embedding for a given row's data by using a text embedding API such as Vertex AI or OpenAI. You can use any text embedding API with Cloud SQL vector embeddings. However, you must use the same text embedding API for query string vector generation. You can't combine different APIs for source data and query vectorization.

For example, you can generate a vector embedding from Vertex AI:

from vertexai.language_models import TextEmbeddingModel


def text_embedding() -> list:
    """Text embedding with a Large Language Model."""
    model = TextEmbeddingModel.from_pretrained("text-embedding-004")
    embeddings = model.get_embeddings(["What is life?"])
    for embedding in embeddings:
        vector = embedding.values
        print(f"Length of Embedding Vector: {len(vector)}")
    return vector


if __name__ == "__main__":
    text_embedding()
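Before you store a generated embedding, it has to be turned into the bracketed text form that string_to_vector takes in the examples on this page, and its length has to match the dimensions declared for the column. The following sketch shows one way to do that. The helper name to_vector_literal is hypothetical, and it assumes that string_to_vector accepts decimal values written the way Python prints floats, which this page doesn't confirm for very small or very large values.

```python
import math

MAX_DIMENSIONS = 16000  # upper limit stated on this page


def to_vector_literal(values, expected_dimensions):
    """Format a sequence of numbers as '[v1,v2,...]' after basic validation."""
    if expected_dimensions > MAX_DIMENSIONS:
        raise ValueError("a vector embedding can have at most 16,000 dimensions")
    if len(values) != expected_dimensions:
        raise ValueError(
            f"expected {expected_dimensions} dimensions, got {len(values)}"
        )
    floats = [float(v) for v in values]
    if not all(math.isfinite(v) for v in floats):
        raise ValueError("embedding contains NaN or infinite values")
    return "[" + ",".join(repr(v) for v in floats) + "]"


print(to_vector_literal([1, 2, 3], 3))  # [1.0,2.0,3.0]
```

Checking the length on the client side gives a clearer error than a rejected INSERT, for example when an embedding comes from a different model than the one used for the column.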

Store vector embeddings

This section provides example statements for storing vector embeddings in Cloud SQL.

Create a new table with a vector embedding column

CREATE TABLE books (
  id INTEGER PRIMARY KEY AUTO_INCREMENT,
  title VARCHAR(60),
  embedding VECTOR(3) USING VARBINARY
);

Add a vector embedding column to an existing table

ALTER TABLE books ADD COLUMN embedding VECTOR(3) USING VARBINARY;

Insert a vector embedding

INSERT INTO books (title, embedding) VALUES ('book title', string_to_vector('[1,2,3]'));

Insert multiple vector embeddings

INSERT INTO books (title, embedding) VALUES ('book title', string_to_vector('[1,2,3]')), ('book title', string_to_vector('[4,5,6]'));
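When the rows come from application code, the multi-row statement is usually built with bound parameters instead of literals pasted into the SQL text. The following sketch only constructs the statement and its parameter list; it doesn't connect to a database. The function name is hypothetical, the %s placeholder style is the one used by common MySQL drivers for Python such as PyMySQL and mysql-connector-python, and it assumes that string_to_vector accepts a bound string parameter.

```python
def build_bulk_insert(rows):
    """Build a multi-row INSERT for the books table.

    rows is a list of (title, embedding) pairs, where embedding is a list of
    numbers. Returns (sql, params) suitable for cursor.execute(sql, params).
    """
    if not rows:
        raise ValueError("at least one row is required")
    row_sql = "(%s, string_to_vector(%s))"
    sql = "INSERT INTO books (title, embedding) VALUES " + ", ".join(
        [row_sql] * len(rows)
    )
    params = []
    for title, embedding in rows:
        params.append(title)
        # Same bracketed text form as the literals in the statements above.
        params.append("[" + ",".join(str(float(x)) for x in embedding) + "]")
    return sql, params


sql, params = build_bulk_insert(
    [("book title", [1, 2, 3]), ("book title", [4, 5, 6])]
)
print(sql)
print(params)
```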

Upsert a vector embedding

INSERT INTO books (id, title, embedding) VALUES (1, 'book title', string_to_vector('[1,2,3]')) ON DUPLICATE KEY UPDATE embedding = string_to_vector('[1,2,3]');

Update a vector embedding

UPDATE books SET embedding = string_to_vector('[1,2,3]') WHERE id = 1;

Delete a vector embedding

DELETE FROM books WHERE embedding = string_to_vector('[1,2,3]');

Work with vector search indexes

By default, you can perform the exact nearest neighbor search, which provides perfect recall. You can also add an index to use ANN search, which trades some recall for speed. Unlike typical indexes, an approximate index can change the results that your queries return.

Important: If a vector search index is present on a table, then you can't perform any data definition language (DDL) operations on the table. Before performing the DDL operation, drop the vector search index on the base table.

Recommendations

This section provides best practices for working with vector search indexes. Every workload is different, and you might need to adjust accordingly.

Before you create a vector search index, you must load data into the table. Your base table must have at least 1,000 rows. These requirements apply only to the TREE_SQ and TREE_AH search index types. If you have more data points available, then you'll have better partitioning and training of the index.
Monitor the memory usage of indexes. If the instance runs out of memory, then you can't create or build any indexes. For existing indexes, after reaching the threshold, Cloud SQL writes warnings to the MySQL error log periodically. You can view memory usage in the information_schema.innodb_vector_indexes table.
If the underlying base table has undergone major DML changes, then rebuild the vector search indexes. To get the initial size of the index at build time and the current size of the index, query the information_schema.innodb_vector_indexes table.
Generally, it's acceptable to leave the number of partitions to be computed internally. If you have a use case where you want to specify the number of partitions, then you must have at least 100 data points per partition.

Read-only base table during vector search index operations

For the duration of all three vector search index operations (create, alter, and drop), the base table is put into read-only mode. During these operations, no DML statements are allowed on the base table.

Persistence, shutdown, and impact on maintenance

Only vector search indexes that use the TREE_SQ type persist to disk on a clean shutdown of an instance. Vector search indexes that use the TREE_AH and BRUTE_FORCE types are in-memory only.

After a clean shutdown of an instance, Cloud SQL reloads vector search indexes as the instance restarts. However, after a crash or an unclean shutdown, Cloud SQL must rebuild the vector search indexes. For example, any time that your instance undergoes a crash and recovery from backup and restore, point-in-time recovery (PITR), or high-availability (HA) failover, Cloud SQL rebuilds your vector search indexes. For these events, the following occurs:

The rebuild happens in the background automatically.
During the rebuild, the base table is in read-only mode.
If the automatic rebuild can't get a lock on the table within a specific timeout period, then the rebuild fails. You might need to rebuild the index manually instead.

The time required for an index rebuild might increase the time required for a shutdown, which might also increase the required maintenance and update time for an instance.

Configure the memory allocation for vector search indexes

Cloud SQL builds and maintains vector search indexes in memory. The TREE_SQ index type persists on a clean shutdown and reloads after the instance restarts. During runtime, all vector search indexes need to stay in memory.

To make sure that Cloud SQL has enough memory available to keep all vector search indexes in memory, configure the Cloud SQL instance with a cloudsql_vector_max_mem_size database flag. cloudsql_vector_max_mem_size governs how much memory the Cloud SQL instance dedicates for vector search indexes. When you configure the value for the flag, keep the following in mind:

The default and minimum value is 1 GB. The upper limit is 50% of the buffer pool size.
After you set this flag, your instance automatically restarts for the configuration change to take effect.
If your instance has used up all its configured memory, you can't create or alter any vector search indexes.

Important: The memory that you allocate to vector search indexes is used outside of the buffer pool. When you enable cloudsql_vector, Cloud SQL reduces the buffer pool size by the value you assign to the cloudsql_vector_max_mem_size flag.

To update the memory allocated for vector search indexes on the instance, change the value of the cloudsql_vector_max_mem_size flag.

gcloud sql instances patch INSTANCE_NAME \
  --database-flags=cloudsql_vector_max_mem_size=NEW_MEMORY_VALUE

Replace the following:

INSTANCE_NAME: the name of the instance on which you are changing the memory allocation.
NEW_MEMORY_VALUE: the updated memory allocation, in bytes, for your vector search indexes.

This change restarts your instance automatically so that the change can take effect.

Calculate required memory

The amount of memory that an index requires depends on the index type, the number of vector embeddings, and the dimensionality of the embeddings. There are two memory requirements to consider:

Build time memory: the memory required during the build of the index.
Index memory: the memory that the index occupies after the index is built.

For a given index, its dataset size is the memory needed to read all the vector embeddings in memory. Given that each dimension is represented by a float which uses 4 bytes of memory, you can determine the dataset_size as follows:

dataset_size = <num_embeddings> * (4 * <dimensions>)

For example, if you have one million embeddings of 768 dimensions, your dataset_size is approximately 3 GB (1,000,000 × 4 × 768 = 3,072,000,000 bytes).
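The same arithmetic can be scripted when you size several tables. The following sketch implements only the dataset_size formula above; the function name is an illustrative choice, and it doesn't attempt to estimate build time memory or index memory for a particular index type.

```python
BYTES_PER_DIMENSION = 4  # each dimension is a 4-byte float, as stated above


def dataset_size_bytes(num_embeddings, dimensions):
    """dataset_size = num_embeddings * (4 * dimensions), in bytes."""
    if num_embeddings < 0 or dimensions <= 0:
        raise ValueError("num_embeddings must be >= 0 and dimensions must be > 0")
    return num_embeddings * (BYTES_PER_DIMENSION * dimensions)


size = dataset_size_bytes(1_000_000, 768)
print(size)                        # 3072000000
print(f"{size / 10**9:.2f} GB")    # 3.07 GB
```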

Based on the previous example, the memory requirements for the different index types are as follows:

Index type	Build time memory	Index memory
`TREE_SQ`	4 GB	1 GB
`TREE_AH`	3.5 GB	3.5 GB
`BRUTE_FORCE`	3 GB	3 GB

If you're using TREE_SQ vector search indexes, then you must also factor in the memory required for persistence at runtime. To the total amount of memory in your configuration, add the amount of index memory used by the largest active TREE_SQ vector search index.

Whenever the base table where the vector embeddings are stored undergoes DML operations, the vector search index is updated in real time. These updates change the memory footprint of the index, which can shrink or expand depending on the DML operation. You can monitor the memory footprint of an index by querying the information_schema.innodb_vector_indexes table. For information about monitoring the size of your vector search index, see Monitor vector search indexes.

Create a vector search index

The statement to create a vector search index uses the following syntax:

CALL mysql.create_vector_index('INDEX_NAME', 'DB_NAME.TABLE_NAME', 'COLUMN_NAME', 'PARAMETERS');

For example:

CALL mysql.create_vector_index('vectorIndex', 'db.books', 'embedding', 'index_type=TREE_SQ, distance_measure=l2_squared');

The index name that you specify must be unique within the database.

Vector search index parameters

The mysql.create_vector_index and mysql.alter_vector_index functions support multiple parameters that you can specify with comma-separated key-value pairs. All mysql.create_vector_index function parameters are optional. If you specify an empty string or NULL, then the default parameter values are configured for the index.

distance_measure: the supported values are: L2_SQUARED, COSINE, and DOT_PRODUCT. L2_SQUARED is the default.
num_neighbors: the number of neighbors to return from an ANN query. You can also override this parameter when performing the search query. The default is 10.
index_type: specifies the type of index to be built. Valid values are: BRUTE_FORCE, TREE_SQ, and TREE_AH.
- BRUTE_FORCE is the default for a table that has fewer than 10,000 rows
- TREE_SQ is the default for a table that has 10,000 or more rows
To specify the TREE_AH or TREE_SQ index type, the size of your base table must be greater than 1,000 rows.
num_partitions: specifies the number of K-means clusters to build. This parameter is only allowed if you have configured an index_type. This option isn't applicable to BRUTE_FORCE. If you specify the TREE_SQ or TREE_AH index type, then the size of your base table must be greater than or equal to num_partitions * 100.

Note: The size of your base table is calculated by scanning the table for the number of rows with non-NULL entries in the vector embeddings column.
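The constraints above can be checked on the client before you call mysql.create_vector_index. The following sketch assembles the PARAMETERS string and rejects combinations that the rules rule out. The function name and error messages are hypothetical. This page states the row threshold for TREE_SQ and TREE_AH both as "at least 1,000" and as "greater than 1,000"; the sketch uses the first reading, which is an assumption.

```python
VALID_INDEX_TYPES = {"BRUTE_FORCE", "TREE_SQ", "TREE_AH"}
VALID_DISTANCE_MEASURES = {"L2_SQUARED", "COSINE", "DOT_PRODUCT"}
MIN_ROWS_FOR_TREE = 1000         # assumption: "at least 1,000 rows"
MIN_POINTS_PER_PARTITION = 100   # base table size >= num_partitions * 100


def build_index_parameters(row_count, index_type=None, distance_measure=None,
                           num_neighbors=None, num_partitions=None):
    """Return a comma-separated key=value string; '' means use the defaults.

    row_count is the number of rows with a non-NULL embedding.
    """
    parts = []
    if index_type is not None:
        if index_type not in VALID_INDEX_TYPES:
            raise ValueError(f"unknown index_type: {index_type}")
        if index_type != "BRUTE_FORCE" and row_count < MIN_ROWS_FOR_TREE:
            raise ValueError(f"{index_type} needs at least {MIN_ROWS_FOR_TREE} rows")
        parts.append(f"index_type={index_type}")
    if distance_measure is not None:
        if distance_measure not in VALID_DISTANCE_MEASURES:
            raise ValueError(f"unknown distance_measure: {distance_measure}")
        parts.append(f"distance_measure={distance_measure}")
    if num_neighbors is not None:
        parts.append(f"num_neighbors={num_neighbors}")
    if num_partitions is not None:
        if index_type is None or index_type == "BRUTE_FORCE":
            raise ValueError("num_partitions requires index_type TREE_SQ or TREE_AH")
        if row_count < num_partitions * MIN_POINTS_PER_PARTITION:
            raise ValueError("base table has fewer than num_partitions * 100 rows")
        parts.append(f"num_partitions={num_partitions}")
    return ", ".join(parts)


print(build_index_parameters(50_000, "TREE_SQ", "L2_SQUARED"))
# index_type=TREE_SQ, distance_measure=L2_SQUARED
```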

Alter a vector search index

CALL mysql.alter_vector_index('DB_NAME.INDEX_NAME', 'PARAMETERS');

The alter_vector_index function is used explicitly to rebuild a vector search index. To use this function, the index must already exist. You might want to rebuild an index for the following use cases:

To rebuild the index with different options. For example, you might want to use a different index type or different distance measure.
To rebuild the index because the base table has undergone major DML changes. For example, you need to retrain the vector search index based on the data in the base table.

All parameters for rebuilding the index are identical to the ones available for creating the index and are also optional. If you specify an empty string or NULL when you rebuild the index, then the index is rebuilt based on the parameters specified at the time of index creation. If no parameters are provided at the time of index creation, then the default parameter values are used.

The existing vector search index is available during the alter_vector_index operation. You can still perform search queries against the index.

Drop a vector search index

You can't perform a DDL operation on a table that has a vector search index. Before performing the DDL operation on the table, you must drop the vector search index.

CALL mysql.drop_vector_index('DB_NAME.INDEX_NAME');

Query vector embeddings

This section provides examples for the different ways that you can query vector embeddings.

View the vector embeddings

SELECT vector_to_string(embedding) FROM books;

Get the exact neighbor search to a vector embedding

SELECT id, cosine_distance(embedding, string_to_vector('[1,2,3]')) dist FROM books ORDER BY dist LIMIT 10;
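The ORDER BY dist LIMIT pattern is an exact search: every row's distance is computed and the smallest values are kept. The following sketch shows that logic over an in-memory list to illustrate why recall is perfect and why the cost grows with the table size. It isn't how Cloud SQL executes the query, the sample rows are made up, and it uses L2 squared distance instead of cosine distance to keep the code short.

```python
import heapq


def l2_squared(a, b):
    """Sum of squared element-wise differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def exact_knn(rows, query, k):
    """Scan every (id, embedding) pair and keep the k smallest distances.

    Returns a list of (distance, id) tuples, nearest first.
    """
    return heapq.nsmallest(
        k, ((l2_squared(embedding, query), row_id) for row_id, embedding in rows)
    )


rows = [
    (1, [1.0, 2.0, 3.0]),
    (2, [4.0, 5.0, 6.0]),
    (3, [1.0, 2.0, 4.0]),
]
print(exact_knn(rows, [1.0, 2.0, 3.0], k=2))  # [(0.0, 1), (1.0, 3)]
```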

Get the approximate neighbor search to a vector embedding

SELECT title FROM books WHERE NEAREST(embedding) TO (string_to_vector('[1,2,3]'), 'num_neighbors=10');

Performing an ANN search supports two parameters. Both are optional.

num_partitions: specify the number of partitions to probe for an ANN vector search. If you don't specify the number of partitions, then the search uses a value generated based on the size of the table, number of partitions in the vector search index, and other factors.
num_neighbors: specify the number of neighbors to return. This value overrides the value set at the time of creation of the vector search index.

Filter vector embeddings

Use extra columns as predicates to fine tune the filtering of your vector embedding query results. For example, if you add a printyear column, then you can add a specific year value as a filter to your query.

SELECT title FROM books WHERE NEAREST(embedding) TO (string_to_vector('[1,2,3]')) AND printyear > 1991;

Query the distance of a vector embedding

This section provides examples of vector distance functions that are available for KNN search.

Get the Cosine distance

SELECT cosine_distance(embedding, string_to_vector('[3,1,2]')) AS distance FROM books WHERE id = 10;

Get the Dot Product distance

SELECT dot_product(embedding, string_to_vector('[3,1,2]')) AS distance FROM books WHERE id = 10;

Get the L2 Squared distance

SELECT l2_squared_distance(embedding, string_to_vector('[3,1,2]')) AS distance FROM books WHERE id = 10;

Get rows within a certain distance

SELECT * FROM books WHERE l2_squared_distance(embedding, string_to_vector('[1,2,3]')) < 10;

You can combine with ORDER BY and LIMIT

SELECT id, vector_to_string(embedding), l2_squared_distance(embedding, string_to_vector('[1,2,3]')) dist FROM books ORDER BY dist LIMIT 10;

Monitor vector search indexes

To get real-time information about all the vector search indexes in the instance, use the information_schema.innodb_vector_indexes table.

To view the table, run the following command:

SELECT * FROM information_schema.innodb_vector_indexes;

Sample output might look like the following:

*************************** 1. row ***************************
INDEX_NAME: test.t4_index
TABLE_NAME: test.t4_bf
INDEX_TYPE: BRUTE_FORCE
DIST_MEASURE: SquaredL2Distance
STATUS: Ready
STATE: INDEX_READY_TO_USE
PARTITIONS: 0
SEARCH_PARTITIONS: 0
INITIAL_SIZE: 40000
CURRENT_SIZE: 40000
QUERIES: 0
MUTATIONS: 0
INDEX_MEMORY: 160000
DATASET_MEMORY: 0

In the information_schema.innodb_vector_indexes table, you can view the following:

The options that are potentially generated. In other words, num_partitions or the number of partitions to probe for a query.
The STATE and STATUS columns tell you the current state of the index. During the build phase, the status column gives information about how far the vector search index is in the build phase.
The INITIAL_SIZE column provides the table size during index creation. You can compare this size with CURRENT_SIZE to get an idea on how much the index has changed since its creation due to DMLs on the base table.
The QUERIES and MUTATIONS columns provide you with real-time insights into how busy the index is.
The INDEX_MEMORY and DATASET_MEMORY columns provide information about memory consumption of the index. INDEX_MEMORY indicates how much memory is consumed by the index and DATASET_MEMORY indicates how much additional memory is consumed during build time.
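One way to act on the INITIAL_SIZE and CURRENT_SIZE comparison is to compute the relative change and flag an index for a rebuild when it passes a threshold. The following sketch assumes that you have already read the two values from information_schema.innodb_vector_indexes. The 20% threshold and the function names are hypothetical and aren't a documented recommendation; choose a value that suits your workload.

```python
def size_drift(initial_size, current_size):
    """Relative change in row count since the index was built."""
    if initial_size <= 0:
        raise ValueError("initial_size must be positive")
    return abs(current_size - initial_size) / initial_size


def should_rebuild(initial_size, current_size, threshold=0.2):
    """Flag a rebuild when the table grew or shrank by at least `threshold`.

    The default of 0.2 (20%) is an arbitrary illustrative value.
    """
    return size_drift(initial_size, current_size) >= threshold


print(should_rebuild(40000, 40000))  # False
print(should_rebuild(40000, 60000))  # True
```

Size alone doesn't capture updates that replace embeddings without changing the row count, so the MUTATIONS column is worth checking alongside it.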

To get a list of the vector search indexes created on the instance, you can view the mysql.vector_indexes data dictionary table.

To view the table, run the following command:

SELECT * FROM mysql.vector_indexes;

Sample output:

*************************** 1. row ***************************
index_name: test.index1
table_name: test.t1
column_name: j
index_options: index_type=BRUTE_FORCE, distance_measure=L2_SQUARED
status: ACTIVE
create_time: 2024-04-08 22:46:21
update_time: 2024-04-08 22:46:21
1 row in set (0.00 sec)

Limitations

There can only be one vector embedding column per table.
There can only be one vector search index per table.
A vector embedding can have up to 16,000 dimensions.
InnoDB table-level partitioning on tables with vector embedding columns isn't supported.
If the instance restarts from an unclean shutdown, then Cloud SQL rebuilds the vector search index automatically.
1. While rebuilding the vector search index, the base table is read-only.
2. If Cloud SQL can't acquire a lock on the table within the specified time, then the automatic rebuild of the index might fail.
3. If automatic rebuilding of the index fails, then you must rebuild the index manually.
To add a vector embedding column, the table must have a primary key. Cloud SQL doesn't support primary keys of the type BIT, BINARY, VARBINARY, JSON, BLOB, TEXT, or spatial data types. Composite primary keys can't include any of these types.
If a vector search index is present on a table, then DDL operations aren't allowed. The vector search index must be dropped before performing DDL operations on the base table.
Vector embeddings aren't supported on non-InnoDB tables or on temporary tables.
The vector embedding column can't be a generated column.
The NEAREST..TO predicate can be combined with other "scalar" predicates by using AND or OR. The scalar predicates on the table are evaluated after the vector predicates are applied.
The NEAREST..TO predicate is supported only in a SELECT statement. Other DML statements don't support NEAREST..TO.
Subqueries aren't supported with NEAREST..TO.
A constraint can't be added to the primary key of the base table if a vector search index is present.

Pre-filtering is feasible only through distance functions and by using ORDER BY with LIMIT.

For example, if you create the following table:

CREATE TABLE books (
  bookid INT PRIMARY KEY,
  title VARCHAR(1000),
  author VARCHAR(100),
  printyear INT,
  country VARCHAR(100),
  -- bvector is the embedding vector of the book's plot, genre, reviews, etc.
  bvector VECTOR(1536) USING VARBINARY
);

Then you might use the following query to achieve pre-filtering.

-- Select books by a specific author that have a similar plot, genre, and reviews.
SELECT bookid, title, author, l2_squared_distance(bvector, qvector) dist FROM books WHERE author = 'cloudsql' ORDER BY dist LIMIT 10;

Post-filtering is supported with NEAREST..TO and distance functions.
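The practical difference between the two approaches is the order of operations. Pre-filtering applies the scalar predicate first and then ranks the remaining rows by distance. Post-filtering picks the nearest rows first and then applies the scalar predicate, so it can return fewer rows than requested. The following sketch illustrates this with made-up in-memory data and hypothetical function names; it models the semantics described above, not the Cloud SQL query planner.

```python
def l2_squared(a, b):
    """Sum of squared element-wise differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pre_filter(rows, query, author, k):
    """Filter by author first, then take the k nearest of what is left."""
    matching = [r for r in rows if r["author"] == author]
    matching.sort(key=lambda r: l2_squared(r["vec"], query))
    return [r["id"] for r in matching[:k]]


def post_filter(rows, query, author, k):
    """Take the k nearest rows overall, then drop those by other authors."""
    nearest = sorted(rows, key=lambda r: l2_squared(r["vec"], query))[:k]
    return [r["id"] for r in nearest if r["author"] == author]


books = [
    {"id": 1, "author": "cloudsql", "vec": [1.0, 2.0, 3.0]},
    {"id": 2, "author": "other", "vec": [1.0, 2.0, 3.1]},
    {"id": 3, "author": "other", "vec": [1.0, 2.1, 3.0]},
    {"id": 4, "author": "cloudsql", "vec": [9.0, 9.0, 9.0]},
]
query = [1.0, 2.0, 3.0]
print(pre_filter(books, query, "cloudsql", k=2))   # [1, 4]
print(post_filter(books, query, "cloudsql", k=2))  # [1]
```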

Troubleshoot

In the event of a crash, the index is rebuilt automatically. When a rebuild is in progress, there are two restrictions:

During index creation, the base table is in read-only mode.
While the index is being rebuilt, ANN queries against existing indexes fail.

What's next

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-15 UTC.
