Approximate aggregate functions

GoogleSQL for BigQuery supports approximate aggregate functions.To learn about the syntax for aggregate function calls, seeAggregate function calls.

Approximate aggregate functions are scalable in terms of memory usage and time,but produce approximate results instead of exact results. These functionstypically require less memory thanexact aggregation functionslikeCOUNT(DISTINCT ...), but also introduce statistical uncertainty.This makes approximate aggregation appropriate for large data streams forwhich linear memory usage is impractical, as well as for data that isalready approximate.

The approximate aggregate functions in this section work directly on theinput data, rather than an intermediate estimation of the data. These functionsdon't allow users to specify the precision for the estimation withsketches. If you would like to specify precision with sketches, see:

Function list

NameSummary
APPROX_COUNT_DISTINCT Gets the approximate result forCOUNT(DISTINCT expression).
APPROX_QUANTILES Gets the approximate quantile boundaries.
APPROX_TOP_COUNT Gets the approximate top elements and their approximate count.
APPROX_TOP_SUM Gets the approximate top elements and sum, based on the approximate sum of an assigned weight.

APPROX_COUNT_DISTINCT

APPROX_COUNT_DISTINCT(expression)

Description

Returns the approximate result forCOUNT(DISTINCT expression). The valuereturned is a statistical estimate, not necessarily the actual value.

This function is less accurate thanCOUNT(DISTINCT expression), but performsbetter on huge input.

Supported Argument Types

Any data typeexcept:

  • ARRAY
  • STRUCT
  • INTERVAL

Returned Data Types

INT64

Examples

SELECTAPPROX_COUNT_DISTINCT(x)asapprox_distinctFROMUNNEST([0,1,1,2,3,5])asx;/*-----------------+ | approx_distinct | +-----------------+ | 5               | +-----------------*/

APPROX_QUANTILES

APPROX_QUANTILES([DISTINCT]expression,number[{IGNORE|RESPECT}NULLS])

Description

Returns the approximate boundaries for a group ofexpression values, wherenumber represents the number of quantiles to create. This function returns anarray ofnumber + 1 elements, sorted in ascending order, where thefirst element is the approximate minimum and the last element is the approximatemaximum.

ReturnsNULL if there are zero input rows orexpression evaluates toNULL for all rows.

To learn more about the optional aggregate clauses that you can passinto this function, seeAggregate function calls.

Supported Argument Types

  • expression: Any supported data typeexcept:

    • ARRAY
    • STRUCT
    • INTERVAL
  • number:INT64 literal or query parameter.

Returned Data Types

ARRAY<T> whereT is the type specified byexpression.

Examples

SELECTAPPROX_QUANTILES(x,2)ASapprox_quantilesFROMUNNEST([1,1,1,4,5,6,7,8,9,10])ASx;/*------------------+ | approx_quantiles | +------------------+ | [1, 5, 10]       | +------------------*/
SELECTAPPROX_QUANTILES(x,100)[OFFSET(90)]ASpercentile_90FROMUNNEST([1,2,3,4,5,6,7,8,9,10])ASx;/*---------------+ | percentile_90 | +---------------+ | 9             | +---------------*/
SELECTAPPROX_QUANTILES(DISTINCTx,2)ASapprox_quantilesFROMUNNEST([1,1,1,4,5,6,7,8,9,10])ASx;/*------------------+ | approx_quantiles | +------------------+ | [1, 6, 10]       | +------------------*/
SELECTFORMAT("%T",APPROX_QUANTILES(x,2RESPECTNULLS))ASapprox_quantilesFROMUNNEST([NULL,NULL,1,1,1,4,5,6,7,8,9,10])ASx;/*------------------+ | approx_quantiles | +------------------+ | [NULL, 4, 10]    | +------------------*/
SELECTFORMAT("%T",APPROX_QUANTILES(DISTINCTx,2RESPECTNULLS))ASapprox_quantilesFROMUNNEST([NULL,NULL,1,1,1,4,5,6,7,8,9,10])ASx;/*------------------+ | approx_quantiles | +------------------+ | [NULL, 6, 10]    | +------------------*/

APPROX_TOP_COUNT

APPROX_TOP_COUNT(expression,number)

Description

Returns the approximate top elements ofexpression as an array ofSTRUCTs.Thenumber parameter specifies the number of elements returned.

EachSTRUCT contains two fields. The first field (namedvalue) contains aninput value. The second field (namedcount) contains anINT64 specifying thenumber of times the value was returned.

ReturnsNULL if there are zero input rows.

To learn more about the optional aggregate clauses that you can passinto this function, seeAggregate function calls.

Supported Argument Types

  • expression: Any data type that theGROUP BY clause supports.
  • number:INT64 literal or query parameter.

Returned Data Types

ARRAY<STRUCT>

Examples

SELECTAPPROX_TOP_COUNT(x,2)asapprox_top_countFROMUNNEST(["apple","apple","pear","pear","pear","banana"])asx;/*-------------------------+ | approx_top_count        | +-------------------------+ | [{pear, 3}, {apple, 2}] | +-------------------------*/

NULL handling

APPROX_TOP_COUNT doesn't ignoreNULLs in the input. For example:

SELECTAPPROX_TOP_COUNT(x,2)asapprox_top_countFROMUNNEST([NULL,"pear","pear","pear","apple",NULL])asx;/*------------------------+ | approx_top_count       | +------------------------+ | [{pear, 3}, {NULL, 2}] | +------------------------*/

APPROX_TOP_SUM

APPROX_TOP_SUM(expression,weight,number)

Description

Returns the approximate top elements ofexpression, ordered by the sum of theweight values provided for each unique value ofexpression. Thenumberparameter specifies the number of elements returned.

If theweight input is negative orNaN, this function returns an error.

The elements are returned as an array ofSTRUCTs.EachSTRUCT contains two fields:value andsum.Thevalue field contains the value of the input expression. Thesum field isthe same type asweight, and is the approximate sum of the input weightassociated with thevalue field.

ReturnsNULL if there are zero input rows.

To learn more about the optional aggregate clauses that you can passinto this function, seeAggregate function calls.

Supported Argument Types

  • expression: Any data type that theGROUP BY clause supports.
  • weight: One of the following:

    • INT64
    • NUMERIC
    • BIGNUMERIC
    • FLOAT64
  • number:INT64 literal or query parameter.

Returned Data Types

ARRAY<STRUCT>

Examples

SELECTAPPROX_TOP_SUM(x,weight,2)ASapprox_top_sumFROMUNNEST([STRUCT("apple"ASx,3ASweight),("pear",2),("apple",0),("banana",5),("pear",4)]);/*--------------------------+ | approx_top_sum           | +--------------------------+ | [{pear, 6}, {banana, 5}] | +--------------------------*/

NULL handling

APPROX_TOP_SUM doesn't ignoreNULL values for theexpression andweightparameters.

SELECTAPPROX_TOP_SUM(x,weight,2)ASapprox_top_sumFROMUNNEST([STRUCT("apple"ASx,NULLASweight),("pear",0),("pear",NULL)]);/*----------------------------+ | approx_top_sum             | +----------------------------+ | [{pear, 0}, {apple, NULL}] | +----------------------------*/
SELECTAPPROX_TOP_SUM(x,weight,2)ASapprox_top_sumFROMUNNEST([STRUCT("apple"ASx,0ASweight),(NULL,2)]);/*-------------------------+ | approx_top_sum          | +-------------------------+ | [{NULL, 2}, {apple, 0}] | +-------------------------*/
SELECTAPPROX_TOP_SUM(x,weight,2)ASapprox_top_sumFROMUNNEST([STRUCT("apple"ASx,0ASweight),(NULL,NULL)]);/*----------------------------+ | approx_top_sum             | +----------------------------+ | [{apple, 0}, {NULL, NULL}] | +----------------------------*/

Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-15 UTC.