The ML.DISTANCE function
This document describes the ML.DISTANCE scalar function, which lets you compute the distance between two vectors.
The VECTOR_SEARCH function is another vector function that calculates the distance between vectors. Use the VECTOR_SEARCH function if you need to search a dataset for vectors similar to an input vector. Use the ML.DISTANCE function if you need to compare two specific vectors to determine the distance between them.
Syntax
ML.DISTANCE(vector1, vector2 [, type])
Arguments
ML.DISTANCE has the following arguments:
- vector1: an ARRAY value that represents the first vector, in one of the following forms:
  - ARRAY<Numerical type>
  - ARRAY<STRUCT<STRING, Numerical type>>
  - ARRAY<STRUCT<INT64, Numerical type>>

  where Numerical type is BIGNUMERIC, FLOAT64, INT64, or NUMERIC. For example, ARRAY<STRUCT<INT64, BIGNUMERIC>>.

  When a vector is expressed as ARRAY<Numerical type>, each element of the array denotes one dimension of the vector. An example of a four-dimensional vector is [0.0, 1.0, 1.0, 0.0].

  When a vector is expressed as ARRAY<STRUCT<STRING, Numerical type>> or ARRAY<STRUCT<INT64, Numerical type>>, each STRUCT array item denotes one dimension of the vector. An example of a three-dimensional vector is [("a", 0.0), ("b", 1.0), ("c", 1.0)].

  The initial INT64 or STRING value in the STRUCT is used as an identifier to match the STRUCT values in vector2. The ordering of data in the array doesn't matter; the values are matched by the identifier rather than by their position in the array. If either vector has any STRUCT values with duplicate identifiers, running this function returns an error.

- vector2: an ARRAY value that represents the second vector. vector2 must have the same type as vector1.

  For example, if vector1 is an ARRAY<STRUCT<STRING, FLOAT64>> column with three elements, like [("a", 0.0), ("b", 1.0), ("c", 1.0)], then vector2 must also be an ARRAY<STRUCT<STRING, FLOAT64>> column.

  When vector1 and vector2 are ARRAY<Numerical type> columns, they must have the same array length.

- type: a STRING value that specifies the type of distance to calculate. Valid values are EUCLIDEAN, MANHATTAN, and COSINE. If this argument isn't specified, the default value is EUCLIDEAN.
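The identifier-matching rule for the STRUCT forms can be illustrated outside BigQuery. The following Python sketch is not BigQuery's implementation, and to_mapping and sparse_euclidean are hypothetical helper names. It shows that element order is irrelevant and that duplicate identifiers are rejected. Treating an identifier that appears in only one vector as 0.0 in the other is an assumption of this sketch; this page doesn't state how that case is handled.

```python
import math


def to_mapping(vector):
    """Turn a list of (identifier, value) pairs into a dict, rejecting duplicates."""
    mapping = {}
    for identifier, value in vector:
        if identifier in mapping:
            # Mirrors the documented rule: duplicate identifiers are an error.
            raise ValueError(f"duplicate identifier: {identifier!r}")
        mapping[identifier] = value
    return mapping


def sparse_euclidean(vector1, vector2):
    """Euclidean distance between two vectors given as (identifier, value) pairs."""
    m1, m2 = to_mapping(vector1), to_mapping(vector2)
    # Assumption of this sketch: an identifier missing from one vector counts as 0.0.
    identifiers = set(m1) | set(m2)
    return math.sqrt(sum((m1.get(i, 0.0) - m2.get(i, 0.0)) ** 2 for i in identifiers))


a = [("a", 0.0), ("b", 1.0), ("c", 1.0)]
b = [("c", 3.0), ("a", 0.0), ("b", 1.0)]  # same identifiers, different order

# Only dimension "c" differs (1.0 versus 3.0), so the distance is 2.0.
print(sparse_euclidean(a, b))
```

Because values are paired by identifier, reordering either list leaves the result unchanged.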
Output
ML.DISTANCE returns a FLOAT64 value that represents the distance between the vectors. Returns NULL if either vector1 or vector2 is NULL.
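To make the three distance types and the NULL rule concrete, here is a minimal Python sketch for the plain ARRAY<Numerical type> form. It is an illustration, not BigQuery's implementation: the function name distance and its distance_type parameter are hypothetical, Python's None stands in for SQL NULL, and the sketch assumes the usual definition of cosine distance as one minus cosine similarity. Raising an error on mismatched lengths is this sketch's choice; the page says only that the lengths must match.

```python
import math


def distance(vector1, vector2, distance_type="EUCLIDEAN"):
    """Distance between two equal-length numeric lists; None if either input is None."""
    if vector1 is None or vector2 is None:
        return None  # stands in for SQL NULL
    if len(vector1) != len(vector2):
        raise ValueError("vectors must have the same length")
    pairs = list(zip(vector1, vector2))
    if distance_type == "EUCLIDEAN":
        return math.sqrt(sum((x - y) ** 2 for x, y in pairs))
    if distance_type == "MANHATTAN":
        return float(sum(abs(x - y) for x, y in pairs))
    if distance_type == "COSINE":
        dot = sum(x * y for x, y in pairs)
        norm1 = math.sqrt(sum(x * x for x in vector1))
        norm2 = math.sqrt(sum(y * y for y in vector2))
        # Assumption: cosine distance = 1 - cosine similarity.
        # All-zero vectors (norm of 0) are not handled in this sketch.
        return 1.0 - dot / (norm1 * norm2)
    raise ValueError(f"unsupported distance type: {distance_type}")


u = [1.0, 0.0]
v = [0.0, 1.0]

print(distance(u, v))               # Euclidean by default: square root of 2
print(distance(u, v, "MANHATTAN"))  # |1 - 0| + |0 - 1| = 2.0
print(distance(u, v, "COSINE"))     # orthogonal vectors: 1.0
print(distance(u, None))            # None, mirroring the NULL rule
```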
Example
Get the Euclidean distance between two vectors of ARRAY<FLOAT64> values:
Create the table t1:

CREATE TABLE mydataset.t1
(
  v1 ARRAY<FLOAT64>,
  v2 ARRAY<FLOAT64>
)

Populate t1:

INSERT mydataset.t1 (v1, v2)
VALUES ([4.1, 0.5, 1.0], [3.0, 0.0, 2.5])

Calculate the Euclidean distance between v1 and v2:

SELECT v1, v2, ML.DISTANCE(v1, v2, 'EUCLIDEAN') AS output
FROM mydataset.t1

This query produces the following output:
+-----------------+-----------------+-------------------+
| v1              | v2              | output            |
+-----------------+-----------------+-------------------+
| [4.1, 0.5, 1.0] | [3.0, 0.0, 2.5] | 1.926136028425822 |
+-----------------+-----------------+-------------------+
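The value in the output column can be checked by hand. The following Python sketch repeats the arithmetic for the example row; it is a verification aid rather than a description of how BigQuery computes the result.

```python
import math

v1 = [4.1, 0.5, 1.0]
v2 = [3.0, 0.0, 2.5]

# Per-dimension differences: roughly 1.1, 0.5, and -1.5.
differences = [x - y for x, y in zip(v1, v2)]

# Sum of squares: roughly 1.21 + 0.25 + 2.25 = 3.71.
sum_of_squares = sum(d * d for d in differences)

# The Euclidean distance is the square root of the sum of squares.
euclidean = math.sqrt(sum_of_squares)
print(euclidean)  # approximately 1.926136, in line with the output column
```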
What's next
- For information about the supported SQL statements and functions for each model type, see End-to-end user journey for each model.
Last updated 2025-12-15 UTC.