The ML.DISTANCE function

This document describes the ML.DISTANCE scalar function, which lets you compute the distance between two vectors.

Note: The VECTOR_SEARCH function is another vector function that calculates the distance between vectors. Use the VECTOR_SEARCH function if you need to search a dataset for vectors similar to an input vector. Use the ML.DISTANCE function if you need to compare two specific vectors to determine the distance between them.

Syntax

ML.DISTANCE(vector1, vector2 [, type])

Arguments

ML.DISTANCE has the following arguments:

  • vector1: an ARRAY value that represents the first vector, in one of the following forms:

    • ARRAY<Numerical type>
    • ARRAY<STRUCT<STRING, Numerical type>>
    • ARRAY<STRUCT<INT64, Numerical type>>

    where Numerical type is BIGNUMERIC, FLOAT64, INT64, or NUMERIC. For example, ARRAY<STRUCT<INT64, BIGNUMERIC>>.

    When a vector is expressed as ARRAY<Numerical type>, each element of the array denotes one dimension of the vector. An example of a four-dimensional vector is [0.0, 1.0, 1.0, 0.0].

    When a vector is expressed as ARRAY<STRUCT<STRING, Numerical type>> or ARRAY<STRUCT<INT64, Numerical type>>, each STRUCT array item denotes one dimension of the vector. An example of a three-dimensional vector is [("a", 0.0), ("b", 1.0), ("c", 1.0)].

    The initial INT64 or STRING value in the STRUCT is used as an identifier to match the STRUCT values in vector2. The ordering of data in the array doesn't matter; the values are matched by the identifier rather than by their position in the array. If either vector has any STRUCT values with duplicate identifiers, running this function returns an error.

  • vector2: an ARRAY value that represents the second vector.

    vector2 must have the same type as vector1.

    For example, if vector1 is an ARRAY<STRUCT<STRING, FLOAT64>> column with three elements, like [("a", 0.0), ("b", 1.0), ("c", 1.0)], then vector2 must also be an ARRAY<STRUCT<STRING, FLOAT64>> column.

    When vector1 and vector2 are ARRAY<Numerical type> columns, they must have the same array length.

  • type: a STRING value that specifies the type of distance to calculate. Valid values are EUCLIDEAN, MANHATTAN, and COSINE. If this argument isn't specified, the default value is EUCLIDEAN.
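To make the identifier matching concrete, the following is a minimal sketch that passes two ARRAY<STRUCT<STRING, FLOAT64>> literals whose elements are listed in different orders. The values are made up for illustration and are not part of the original example; the tuple notation mirrors the vectors shown in the argument descriptions above.

```sql
-- Illustrative values only. Both arguments are ARRAY<STRUCT<STRING, FLOAT64>>,
-- so they satisfy the rule that vector2 has the same type as vector1.
SELECT
  ML.DISTANCE(
    [("a", 0.0), ("b", 1.0), ("c", 1.0)],  -- vector1
    [("c", 2.5), ("a", 3.0), ("b", 0.0)],  -- vector2, same identifiers, different order
    'MANHATTAN'                             -- optional type; EUCLIDEAN if omitted
  ) AS output;
```

Because values are paired by identifier rather than by position, a hand calculation pairs a with a, b with b, and c with c, which gives |0.0 - 3.0| + |1.0 - 0.0| + |1.0 - 2.5| = 5.5 for the Manhattan distance. Repeating an identifier such as "a" within either array would instead produce an error, as described for vector1.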

Output

ML.DISTANCE returns a FLOAT64 value that represents the distance between the vectors. It returns NULL if either vector1 or vector2 is NULL.
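For reference, the three distance types are conventionally defined as follows for two n-dimensional vectors x and y. These are the standard textbook definitions, not formulas quoted from this page; in particular, treating COSINE as one minus the cosine similarity is an assumption about the convention used.

```latex
\begin{aligned}
d_{\text{EUCLIDEAN}}(x, y) &= \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \\
d_{\text{MANHATTAN}}(x, y) &= \sum_{i=1}^{n} \lvert x_i - y_i \rvert \\
d_{\text{COSINE}}(x, y) &= 1 - \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} y_i^2}}
\end{aligned}
```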

Example

Get the Euclidean distance between two vectors of ARRAY<FLOAT64> values:

  1. Create the table t1:

    CREATE TABLE mydataset.t1 (v1 ARRAY<FLOAT64>, v2 ARRAY<FLOAT64>)

  2. Populate t1:

    INSERT mydataset.t1 (v1, v2) VALUES ([4.1, 0.5, 1.0], [3.0, 0.0, 2.5])

  3. Calculate the Euclidean distance between v1 and v2:

    SELECT v1, v2, ML.DISTANCE(v1, v2, 'EUCLIDEAN') AS output FROM mydataset.t1

    This query produces the following output:

    +---------------+---------------+-------------------+
    | v1            | v2            | output            |
    +---------------+---------------+-------------------+
    | [4.1,0.5,1.0] | [3.0,0.0,2.5] | 1.926136028425822 |
    +---------------+---------------+-------------------+
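As a sanity check, the output value can be reproduced by hand, assuming the standard Euclidean distance formula (square root of the sum of squared component differences):

```latex
\begin{aligned}
d &= \sqrt{(4.1 - 3.0)^2 + (0.5 - 0.0)^2 + (1.0 - 2.5)^2} \\
  &= \sqrt{1.21 + 0.25 + 2.25} \\
  &= \sqrt{3.71} \approx 1.926136
\end{aligned}
```

This agrees with the output column in the result table to the digits shown.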

Last updated 2025-12-15 UTC.