Approximate aggregate functions Stay organized with collections Save and categorize content based on your preferences.
GoogleSQL for BigQuery supports approximate aggregate functions.To learn about the syntax for aggregate function calls, seeAggregate function calls.
Approximate aggregate functions are scalable in terms of memory usage and time,but produce approximate results instead of exact results. These functionstypically require less memory thanexact aggregation functionslikeCOUNT(DISTINCT ...), but also introduce statistical uncertainty.This makes approximate aggregation appropriate for large data streams forwhich linear memory usage is impractical, as well as for data that isalready approximate.
The approximate aggregate functions in this section work directly on theinput data, rather than an intermediate estimation of the data. These functionsdon't allow users to specify the precision for the estimation withsketches. If you would like to specify precision with sketches, see:
- HyperLogLog++ functions to estimate cardinality.
- KLL functions to estimate quantile values.
Function list
| Name | Summary |
|---|---|
APPROX_COUNT_DISTINCT | Gets the approximate result forCOUNT(DISTINCT expression). |
APPROX_QUANTILES | Gets the approximate quantile boundaries. |
APPROX_TOP_COUNT | Gets the approximate top elements and their approximate count. |
APPROX_TOP_SUM | Gets the approximate top elements and sum, based on the approximate sum of an assigned weight. |
APPROX_COUNT_DISTINCT
APPROX_COUNT_DISTINCT(expression)Description
Returns the approximate result forCOUNT(DISTINCT expression). The valuereturned is a statistical estimate, not necessarily the actual value.
This function is less accurate thanCOUNT(DISTINCT expression), but performsbetter on huge input.
Supported Argument Types
Any data typeexcept:
ARRAYSTRUCTINTERVAL
Returned Data Types
INT64
Examples
SELECTAPPROX_COUNT_DISTINCT(x)asapprox_distinctFROMUNNEST([0,1,1,2,3,5])asx;/*-----------------+ | approx_distinct | +-----------------+ | 5 | +-----------------*/APPROX_QUANTILES
APPROX_QUANTILES([DISTINCT]expression,number[{IGNORE|RESPECT}NULLS])Description
Returns the approximate boundaries for a group ofexpression values, wherenumber represents the number of quantiles to create. This function returns anarray ofnumber + 1 elements, sorted in ascending order, where thefirst element is the approximate minimum and the last element is the approximatemaximum.
ReturnsNULL if there are zero input rows orexpression evaluates toNULL for all rows.
To learn more about the optional aggregate clauses that you can passinto this function, seeAggregate function calls.
Supported Argument Types
expression: Any supported data typeexcept:ARRAYSTRUCTINTERVAL
number:INT64literal or query parameter.
Returned Data Types
ARRAY<T> whereT is the type specified byexpression.
Examples
SELECTAPPROX_QUANTILES(x,2)ASapprox_quantilesFROMUNNEST([1,1,1,4,5,6,7,8,9,10])ASx;/*------------------+ | approx_quantiles | +------------------+ | [1, 5, 10] | +------------------*/SELECTAPPROX_QUANTILES(x,100)[OFFSET(90)]ASpercentile_90FROMUNNEST([1,2,3,4,5,6,7,8,9,10])ASx;/*---------------+ | percentile_90 | +---------------+ | 9 | +---------------*/SELECTAPPROX_QUANTILES(DISTINCTx,2)ASapprox_quantilesFROMUNNEST([1,1,1,4,5,6,7,8,9,10])ASx;/*------------------+ | approx_quantiles | +------------------+ | [1, 6, 10] | +------------------*/SELECTFORMAT("%T",APPROX_QUANTILES(x,2RESPECTNULLS))ASapprox_quantilesFROMUNNEST([NULL,NULL,1,1,1,4,5,6,7,8,9,10])ASx;/*------------------+ | approx_quantiles | +------------------+ | [NULL, 4, 10] | +------------------*/SELECTFORMAT("%T",APPROX_QUANTILES(DISTINCTx,2RESPECTNULLS))ASapprox_quantilesFROMUNNEST([NULL,NULL,1,1,1,4,5,6,7,8,9,10])ASx;/*------------------+ | approx_quantiles | +------------------+ | [NULL, 6, 10] | +------------------*/APPROX_TOP_COUNT
APPROX_TOP_COUNT(expression,number)Description
Returns the approximate top elements ofexpression as an array ofSTRUCTs.Thenumber parameter specifies the number of elements returned.
EachSTRUCT contains two fields. The first field (namedvalue) contains aninput value. The second field (namedcount) contains anINT64 specifying thenumber of times the value was returned.
ReturnsNULL if there are zero input rows.
To learn more about the optional aggregate clauses that you can passinto this function, seeAggregate function calls.
Supported Argument Types
expression: Any data type that theGROUP BYclause supports.number:INT64literal or query parameter.
Returned Data Types
ARRAY<STRUCT>
Examples
SELECTAPPROX_TOP_COUNT(x,2)asapprox_top_countFROMUNNEST(["apple","apple","pear","pear","pear","banana"])asx;/*-------------------------+ | approx_top_count | +-------------------------+ | [{pear, 3}, {apple, 2}] | +-------------------------*/NULL handling
APPROX_TOP_COUNT doesn't ignoreNULLs in the input. For example:
SELECTAPPROX_TOP_COUNT(x,2)asapprox_top_countFROMUNNEST([NULL,"pear","pear","pear","apple",NULL])asx;/*------------------------+ | approx_top_count | +------------------------+ | [{pear, 3}, {NULL, 2}] | +------------------------*/APPROX_TOP_SUM
APPROX_TOP_SUM(expression,weight,number)Description
Returns the approximate top elements ofexpression, ordered by the sum of theweight values provided for each unique value ofexpression. Thenumberparameter specifies the number of elements returned.
If theweight input is negative orNaN, this function returns an error.
The elements are returned as an array ofSTRUCTs.EachSTRUCT contains two fields:value andsum.Thevalue field contains the value of the input expression. Thesum field isthe same type asweight, and is the approximate sum of the input weightassociated with thevalue field.
ReturnsNULL if there are zero input rows.
To learn more about the optional aggregate clauses that you can passinto this function, seeAggregate function calls.
Supported Argument Types
expression: Any data type that theGROUP BYclause supports.weight: One of the following:INT64NUMERICBIGNUMERICFLOAT64
number:INT64literal or query parameter.
Returned Data Types
ARRAY<STRUCT>
Examples
SELECTAPPROX_TOP_SUM(x,weight,2)ASapprox_top_sumFROMUNNEST([STRUCT("apple"ASx,3ASweight),("pear",2),("apple",0),("banana",5),("pear",4)]);/*--------------------------+ | approx_top_sum | +--------------------------+ | [{pear, 6}, {banana, 5}] | +--------------------------*/NULL handling
APPROX_TOP_SUM doesn't ignoreNULL values for theexpression andweightparameters.
SELECTAPPROX_TOP_SUM(x,weight,2)ASapprox_top_sumFROMUNNEST([STRUCT("apple"ASx,NULLASweight),("pear",0),("pear",NULL)]);/*----------------------------+ | approx_top_sum | +----------------------------+ | [{pear, 0}, {apple, NULL}] | +----------------------------*/SELECTAPPROX_TOP_SUM(x,weight,2)ASapprox_top_sumFROMUNNEST([STRUCT("apple"ASx,0ASweight),(NULL,2)]);/*-------------------------+ | approx_top_sum | +-------------------------+ | [{NULL, 2}, {apple, 0}] | +-------------------------*/SELECTAPPROX_TOP_SUM(x,weight,2)ASapprox_top_sumFROMUNNEST([STRUCT("apple"ASx,0ASweight),(NULL,NULL)]);/*----------------------------+ | approx_top_sum | +----------------------------+ | [{apple, 0}, {NULL, NULL}] | +----------------------------*/Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-12-15 UTC.