Search and filter with vector embeddings Stay organized with collections Save and categorize content based on your preferences.
MYSQL_8_0_version.R20241208.01_00 or newer, whereversion is the minor version number. For information about how to upgrade yourinstance to a newer version that supports GA vector embeddings, seeSelf-service maintenance.For information about the syntax and behavior of vector embeddings for thePublic Preview version of this feature, seeWork with vector embeddings (Preview).
This page describes the different ways that you can query vector embeddings. Foran overview of ANN and KNN similarity searches, seevector searches.
Search approximate nearest neighbors (ANN)
To perform an ANN search, use theapprox_distance functionin aSELECT andORDER BY clause. You must use aLIMIT clause in an ANNsearch. You can also get the distance value by puttingapprox_distance in aSELECT list.
Use the following syntax for ANN queries:
#OrderingbydistanceSELECTtitleFROMbooksORDERBYapprox_distance(embedding,string_to_vector('[1,2,3]'),'distance_measure=l2_squared')LIMIT4;#SelectingthedistancevalueSELECTapprox_distance(embedding_name,string_to_vector('[1,2,3]'),'distance_measure=cosine,num_leaves_to_search=3')distFROMtableORDERBYdistLIMITlimit_value;Theapprox_distance function uses the following options:
embedding: uses the vector embedding column name from the base table.string_to_vectororvector_to_string: converts a vector to astring and a string to a vector to make the vector human readable.distance_measure: specify the distance measure to use for a vectorsimilarity search. This value must match the value you set in thedistance_measureparameter when you created the index. Thisparameter is required. Possible values for this parameter are:COSINEL2_SQUAREDDOT_PRODUCT
num_leaves_to_search: optional. specifies the number of leaves toprobe for an ANN vector similarity search. If you don't specify the numberof leaves, then Cloud SQL uses a value generated based on thesize of the table, number of leaves in the vector index, and other factors.You can view this value ininformation_schema.innodb_vector_indexes. Werecommend that you fine-tunenum_leaves_to_searchto achieve the bestbalance between search quality and performance for your specific workload.If increased, it impacts performance but improves recall.
The following example shows how to useapprox_distance to find the top Kclosest rows using thel2_squared distance measure and order the results bydistance.
#OrderingbydistanceSELECTtitleFROMbooksORDERBYapprox_distance(embedding,string_to_vector('[1,2,3]'),'distance_measure=l2_squared')LIMIT4;#SelectingthedistancevalueSELECTapprox_distance(embedding,string_to_vector('[1,2,3]'),'distance_measure=l2_squared')distFROMtableORDERBYdistLIMIT4;Filter results from approx_distance queries
You can use theapprox_distance function withWHERE conditions that filterquery results with a non-vector predicate to perform post filtering. Theapprox_distance function is evaluated before applying the filter which meansthat the number of results returned are nondeterministic.
For example, for the following query:
SELECTidFROMproductsWHEREprice <100ORDERBYapprox_distance(embedding,@query_vector,'distance_measure=cosine')LIMIT11;Theapprox_distance function returns the 11 nearest neighbors to the queryvector regardless of price. In post filtering, the products with a price< 100 are selected. It's possible that all of the nearest neighbors have aprice < 100, so there are 11 results to the query. Alternatively, if noneof the nearest neighbors have a price < 100, there are 0 rows returned.
If you anticipate that your filter in theWHERE condition is very selective, anexact search (KNN) is one option to help ensure that a sufficient numberof rows are returned.
Another option is to useiterative filteringto scan more of the ANN search index.
Use iterative filtering to increase ANN search results
You can use iterative filtering when the selective filtersof theWHERE clause in your ANN search queryare producing fewer results than the number of results specifiedin yourLIMIT clause.
For example, in the following query, when you enableiterative filtering, the query scans more of the vector indexminus the first set of filtered results.
EXPLAINFORMAT=TREESELECT*FROMt1WHEREnext_idBETWEEN15AND100ORDERBYapprox_distance(embedding,string_to_vector('[1,2,3]'),'distance_measure=l2_squared')LIMIT10;EXPLAIN->Limit:10row(s)(rows=10)->Vectorindexloopwithiterativefiltering->Vectorindexscanont1->Filter:(t1.next_idbetween15and100)->Single-rowindexlookupont1usingPRIMARY(id=t1.id)You fetch more neighbors from the search vector index iterativelyuntil the configured maximum (cloudsql_vector_iterative_filtering_max_neighbors)is reached. Any filter matches are counted toward theLIMIT and removedfrom the additional vector index scans.
Enable iterative filtering
By default, iterative filtering is turned off for all sessionsand Cloud SQL instances.
To enable iterative filtering for an existing session, use the following SQL statement.
SETSESSIONcloudsql_vector_iterative_filtering=on;You can also enable iterative filteringglobally for all client sessions that connect to the instance by settingthe flag on the instance. To set a flag for an instance, seeSet a database flag.
For more information about setting system variables at the session orglobal level, seeUsing System Variablesin the MySQL documentation.
Tune iterative filtering
To control how many nearest neighbors are returned for an ANN search querywith iterative filtering enabled, you can usethecloudsql_vector_iterative_filtering_max_neighborssession or global system variable.You can use this configuration to increase thenumber of nearest neighbors that are requested. However,to avoid storing too many results in memory,the maximum for this variable is1000.
To set this variable for a session, use the following SQL statement:
SETcloudsql_vector_iterative_filtering_max_neighbors=600;The default is500, and the minimum number is10.
Limitations
The following are limitations with using iterative filtering:
Not a guarantee: when you use iterativefiltering, Cloud SQL attempts to findthe number of results specified in the
LIMITclause, but doesn't guarantee that the numberis found. This can happen if themaximum number of neighbors (cloudsql_vector_iterative_filtering_max_neighbors)is reached before theLIMITis satisfied, or if there aren't enoughrows that match the filter in the table.Complex queries: iterative filtering worksonly when filter predicates are pushed down tothe base table's access path.It isn't supported for filters on top of temporary tables; for example, tables that use a
HAVINGclause.In subqueries, only filters on the base table within the subquery itself are considered for iterative filtering.
Check the fallback status on ANN searches
There are certain cases where an ANN search falls back to a KNN search. Theseinclude the following:
- There's no vector index on the base table.
- There's a vector index on the base table, but it uses a differentdistance measure from the
distance_measureparameter in theapprox_distancesearch options. - The vector index is corrupt or invisible to the current transaction.
- The
LIMITspecified is greater than 10000. - There is no
LIMITspecified. - The current query involves more than one
approx_distancecall on the samebase table. - The optimizer calculates that it's more efficient to use KNN.
All of these cases push a warning to the client indicating that exact search wasperformed and the reason why.
Use the following command in the mysql client to view the fallback status:
SHOWglobalstatusLIKE'%cloudsql_vector_knn_fallback%';If you want to use ANN and it's falling back to KNN, the query might run slower.You should find the reason it's falling backand assess whether to make changes so that ANN is used instead.
Example: Create a vector index and run an ANN query
The following example walkthrough provides steps to create a vector index andrun an ANN query in Cloud SQL.
- Generate vector embeddings. You can create vector embeddings manually or usea text embedding API of your choice. For an example that usesVertex AI, seeGenerate vector embeddings based on row data.
Create a table in Cloud SQL that contains a vector embeddingcolumn with three dimensions.
CREATETABLEbooks(idINTEGERPRIMARYKEYAUTO_INCREMENT,titleVARCHAR(60),embeddingVECTOR(3)USINGVARBINARY);Insert a vector embedding into the column.
INSERTINTObooksVALUES((1,'book title',string_to_vector('[1,2,3]')));Commit the changes.
commit;Create the vector index using the
L2_squaredfunction to measure distance.CREATEVECTORINDEXvectorIndexONdbname.books(embeddings)USINGSCANNQUANTIZER=SQ8DISTANCE_MEASURE=l2_squared;Use the following syntax to perform an ANN search with a
LIMITof 4 searchresults:SELECTtitleFROMbooksORDERBYapprox_distance(embedding,string_to_vector('[1,2,3]'),'distance_measure=l2_squared')LIMIT4;SELECTapprox_distance(embedding,string_to_vector('[1,2,3]'),'distance_measure=cosine')distFROMbooksORDERBYdistLIMIT4;
Search K-nearest neighbors (KNN)
To perform a K-nearest neighbor search, use thevector_distance functionwith a distance measure option and a vector conversion function(string_to_vector orvector_to_string) in aSELECT statement. Use thefollowing syntax:
SELECTvector_distance(string_to_vector('[1,2,3]'),string_to_vector('[1,2,3]'),'Distance_Measure=dot_product');Replace the values [1,2,3] with the embedding values of your data.
The following example shows how to use this query with thecosine_distancefunction and thestring_to_vector vector conversion function.
SELECTid,cosine_distance(embedding,string_to_vector('[1,2,3]'))distFROMbooksORDERBYdistanceLIMIT10;Get the Cosine distance in a KNN query
Use the Cloud SQLcosine_distance functionto calculate the distance using cosine.
SELECTcosine_distance(embedding,string_to_vector('[3,1,2]'))ASdistanceFROMbooksWHEREid=10;Get the Dot Product distance in a KNN query
Use the Cloud SQLdot_product functionto calculate the distance using the dot product.
SELECTdot_product(embedding,string_to_vector('[3,1,2]'))ASdistanceFROMbooksWHEREid=10;Get the L2-squared distance in a KNN query
Use the Cloud SQLl2_squared_distance functionto calculate the distance using L2 squared.
SELECTl2_squared_distance(embedding,string_to_vector('[3,1,2]'))ASdistanceFROMbooksWHEREid=10;What's next
- Read theoverview about vector search on Cloud SQL.
- Learn how toenable and disable vector embeddings on your instance.
- Learn how togenerate vector embeddings.
- Learn how tocreate vector indexes.
- Learn how toperform searches on vector embeddings.
Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2026-02-19 UTC.