Choose among vector distance functions to measure vector embeddings similarity

Note: This feature is available with the Spanner Enterprise edition and Enterprise Plus edition. For more information, see the Spanner editions overview.

This page describes how to choose among the vector distance functions provided in Spanner to measure similarity between vector embeddings.

After you've generated embeddings from your Spanner data, you can perform a similarity search using vector distance functions. The following table describes the vector distance functions in Spanner.

Function	Description	Formula	Relationship to increasing similarity
Dot product	Calculates the cosine of angle \(\theta\) multiplied by the product of the two vectors' magnitudes.	\(a_1b_1+a_2b_2+...+a_nb_n\) \(=\|a\|\|b\|\cos(\theta)\)	Increases
Cosine distance	The cosine distance function subtracts the cosine similarity from one (`cosine_distance() = 1 - cosine similarity`). The cosine similarity measures the cosine of angle \(\theta\) between two vectors.	1 - \(\frac{a^T b}{\|a\| \cdot \|b\|}\)	Decreases
Euclidean distance	Measures the straight line distance between two vectors.	\(\sqrt{(a_1-b_1)^2+(a_2-b_2)^2+...+(a_n-b_n)^2}\)	Decreases
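The formulas in the table can be sketched in plain Python to make the arithmetic concrete. This is an illustration only, not Spanner's implementation: the function names and the sample vectors `a` and `b` are made up for this example.

```python
import math

def dot_product(a, b):
    # a_1*b_1 + a_2*b_2 + ... + a_n*b_n
    return sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    # 1 - (a . b) / (|a| * |b|)
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return 1.0 - dot_product(a, b) / (norm_a * norm_b)

def euclidean_distance(a, b):
    # Square root of the summed squared component differences.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical two-dimensional embeddings, each with magnitude 5.
a = [3.0, 4.0]
b = [4.0, 3.0]

print(dot_product(a, b))         # 24.0
print(cosine_distance(a, b))     # approximately 0.04 (1 - 24/25)
print(euclidean_distance(a, b))  # approximately 1.4142 (square root of 2)
```

A larger dot product means more similar vectors, while smaller cosine and Euclidean distances mean more similar vectors, which matches the last column of the table.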

Choose a similarity measure

Which similarity measure to use depends on whether all of your vector embeddings are normalized. A normalized vector embedding has a magnitude (length) of exactly 1.0.
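One way to check whether an embedding is normalized is to compute its magnitude and compare it to 1.0. The sketch below is illustrative Python, not a Spanner API. The helper names and the tolerance value are assumptions: floating-point arithmetic rarely yields exactly 1.0, so a small tolerance is used in the comparison.

```python
import math

def magnitude(v):
    # Euclidean length of the vector.
    return math.sqrt(sum(x * x for x in v))

def is_normalized(v, tol=1e-6):
    # tol is an assumed tolerance for floating-point rounding.
    return math.isclose(magnitude(v), 1.0, abs_tol=tol)

def normalize(v):
    # Scale the vector to unit length; a zero vector has no direction.
    m = magnitude(v)
    if m == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / m for x in v]

print(is_normalized([3.0, 4.0]))             # False (magnitude is 5.0)
print(is_normalized(normalize([3.0, 4.0])))  # True
```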

In addition, if you know which distance function your model was trained with, use that distance function to measure similarity between your vector embeddings.

Normalized data

If you have a dataset where all vector embeddings are normalized, then all three functions provide the same semantic search results. In essence, although each function returns a different value, those values rank results in the same order (largest first for DOT_PRODUCT(), smallest first for the two distance functions). When embeddings are normalized, DOT_PRODUCT() is usually the most computationally efficient, but the difference is negligible in most cases. However, if your application is highly performance sensitive, DOT_PRODUCT() might help with performance tuning.
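The reason the rankings agree is that, for unit vectors, the cosine distance equals \(1 - a \cdot b\) and the squared Euclidean distance equals \(2 - 2(a \cdot b)\), so both distances shrink exactly as the dot product grows. The following Python sketch checks this on a few made-up vectors. The names `query` and `docs` are hypothetical, and the code only illustrates the math; it doesn't call Spanner.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    m = math.sqrt(dot(v, v))
    return [x / m for x in v]

def cosine_distance(a, b):
    return 1.0 - dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical embeddings, all scaled to unit length.
query = normalize([1.0, 2.0, 3.0])
docs = {
    "doc_a": normalize([1.0, 2.0, 2.5]),
    "doc_b": normalize([3.0, 1.0, 0.5]),
    "doc_c": normalize([-1.0, 0.5, 2.0]),
}

# Dot product: larger is more similar, so sort descending.
by_dot = sorted(docs, key=lambda k: dot(query, docs[k]), reverse=True)
# Distances: smaller is more similar, so sort ascending.
by_cos = sorted(docs, key=lambda k: cosine_distance(query, docs[k]))
by_euc = sorted(docs, key=lambda k: euclidean_distance(query, docs[k]))

print(by_dot)  # ['doc_a', 'doc_c', 'doc_b']
print(by_dot == by_cos == by_euc)  # True
```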

Non-normalized data

If you have a dataset where vector embeddings aren't normalized, then it's not mathematically correct to use DOT_PRODUCT() as a distance function because dot product as a function doesn't measure distance. Depending on how the embeddings were generated and what type of search is preferred, either the COSINE_DISTANCE() or the EUCLIDEAN_DISTANCE() function produces search results that are subjectively better than the other function. Experimentation with both COSINE_DISTANCE() and EUCLIDEAN_DISTANCE() might be necessary to determine which is best for your use case.
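A small Python sketch shows how the three functions can disagree once magnitudes vary. The vectors and their labels are invented for illustration. The dot product rewards sheer magnitude, cosine distance looks only at direction, and Euclidean distance looks at position.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    return 1.0 - dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical, non-normalized embeddings.
query = [1.0, 1.0]
docs = {
    "same_direction_far": [10.0, 10.0],  # same angle as query, large magnitude
    "nearby": [1.0, 0.5],                # close in space, slightly different angle
    "large_off_axis": [30.0, 0.0],       # 45 degrees away, very large magnitude
}

by_dot = sorted(docs, key=lambda k: dot(query, docs[k]), reverse=True)
by_cos = sorted(docs, key=lambda k: cosine_distance(query, docs[k]))
by_euc = sorted(docs, key=lambda k: euclidean_distance(query, docs[k]))

print(by_dot)  # ['large_off_axis', 'same_direction_far', 'nearby']
print(by_cos)  # ['same_direction_far', 'nearby', 'large_off_axis']
print(by_euc)  # ['nearby', 'same_direction_far', 'large_off_axis']
```

Here the dot product ranks the vector that points furthest away from the query first, only because it is long. Cosine distance and Euclidean distance each give a defensible but different answer, which is why trying both on your own data can be worthwhile.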

Unsure if data is normalized or non-normalized

If you're unsure whether or not your data is normalized and you want to use DOT_PRODUCT(), we recommend that you use COSINE_DISTANCE() instead. COSINE_DISTANCE() is like DOT_PRODUCT() with normalization built-in. Similarity measured using COSINE_DISTANCE() ranges from 0 to 2. A result that is close to 0 indicates the vectors are very similar.
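To see why the range is 0 to 2, the cosine distance can be written as one minus the dot product of the two vectors after each is scaled to unit length. The Python below is a minimal sketch of that idea with made-up vectors. It isn't how Spanner computes COSINE_DISTANCE() internally.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    m = math.sqrt(dot(v, v))
    return [x / m for x in v]

def cosine_distance(a, b):
    # Dot product with normalization built in.
    return 1.0 - dot(normalize(a), normalize(b))

v = [2.0, 0.0]
print(cosine_distance(v, [5.0, 0.0]))   # 0.0 (same direction)
print(cosine_distance(v, [0.0, 3.0]))   # 1.0 (perpendicular)
print(cosine_distance(v, [-4.0, 0.0]))  # 2.0 (opposite direction)
```

The magnitudes of the inputs (2, 5, 3, and 4) have no effect on the result; only the angle between the vectors matters.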

What's next

Learn more about how to perform a vector search by finding the k-nearest neighbor.
Learn how to export embeddings to Vertex AI Vector Search.
Learn more about the GoogleSQL COSINE_DISTANCE(), EUCLIDEAN_DISTANCE(), and DOT_PRODUCT() functions.
Learn more about the PostgreSQL spanner.cosine_distance(), spanner.euclidean_distance(), and spanner.dot_product() functions.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-19 UTC.
