User-defined aggregate functions

This document describes how to create, call, and deleteuser-defined aggregate functions (UDAFs) in BigQuery.

A UDAF lets you create an aggregate function by using an expression thatcontains code. A UDAF accepts columns of input, performs a calculation on agroup of rows at a time, and then returns the result of that calculation as asingle value.

Create a SQL UDAF

This section describes the various ways that you can create a persistent ortemporary SQL UDAF in BigQuery.

Create a persistent SQL UDAF

You can create a SQL UDAF that is persistent, meaning that you can reusethe UDAF across multiple queries. Persistent UDAFs are safeto call when they are shared between owners. UDAFs can't mutate data, talk toexternal systems, or send logs to Google Cloud Observability or similar applications.

To create a persistent UDAF, use theCREATE AGGREGATE FUNCTION statementwithout theTEMP orTEMPORARY keyword. You must include the dataset in thefunction path.

For example, the following query creates a persistent UDAF that's calledScaledAverage:

CREATEAGGREGATEFUNCTIONmyproject.mydataset.ScaledAverage(dividendFLOAT64,divisorFLOAT64)RETURNSFLOAT64AS(AVG(dividend/divisor));

Create a temporary SQL UDAF

You can create a SQL UDAF that is temporary, meaning that the UDAF onlyexists in the scope of a single query, script, session, or procedure.

To create a temporary UDAF, use theCREATE AGGREGATE FUNCTION statement with theTEMPorTEMPORARY keyword.

For example, the following query creates a temporary UDAF that's calledScaledAverage:

CREATETEMPAGGREGATEFUNCTIONScaledAverage(dividendFLOAT64,divisorFLOAT64)RETURNSFLOAT64AS(AVG(dividend/divisor));

Use aggregate and non-aggregate parameters

You can create a SQL UDAF that has both aggregate and non-aggregate parameters.

UDAFs normally aggregate function parameters across all rows in agroup.However, you can specify a function parameter as non-aggregate with theNOT AGGREGATE keyword.

A non-aggregate function parameter is a scalar function parameter with aconstant value for all rows in a group. A valid non-aggregate function parametermust be a literal. Inside the UDAF definition, aggregate function parameters canonly appear as function arguments to aggregate function calls. References tonon-aggregate function parameters can appear anywhere in the UDAF definition.

For example, the following function contains an aggregate parameter that'scalleddividend, and a non-aggregate parameter calleddivisor:

-- Create the function.CREATETEMPAGGREGATEFUNCTIONScaledSum(dividendFLOAT64,divisorFLOAT64NOTAGGREGATE)RETURNSFLOAT64AS(SUM(dividend)/divisor);

Use the default project in the function body

In the body of a SQL UDAF, any references to BigQuery entities,such as tables or views, must include the project ID unless the entity residesin the same project that contains the UDAF.

For example, consider the following statement:

CREATEAGGREGATEFUNCTIONproject1.dataset_a.ScaledAverage(dividendFLOAT64,divisorFLOAT64)RETURNSFLOAT64AS((SELECTAVG(dividend/divisor)FROMdataset_a.my_table));

If you run the preceding statement in theproject1 project, the statementsucceeds becausemy_table exists inproject1. However, if you runthe preceding statement from a different project, the statement fails.To correct the error, include the project ID in the table reference:

CREATEAGGREGATEFUNCTIONproject1.dataset_a.ScaledAverage(dividendFLOAT64,divisorFLOAT64)RETURNSFLOAT64AS((SELECTAVG(dividend/divisor)FROMproject1.dataset_a.my_table));

You can also reference an entity in a different project or dataset from theone where you create the function:

CREATEAGGREGATEFUNCTIONproject1.dataset_a.ScaledAverage(dividendFLOAT64,divisorFLOAT64)RETURNSFLOAT64AS((SELECTAVG(dividend/divisor)FROMproject2.dataset_c.my_table));

Create a JavaScript UDAF

This section describes the various ways in which you can create aJavaScript UDAF in BigQuery. There are a few rules to observewhen creating a JavaScript UDAF:

  • The body of the JavaScript UDAF must be a quoted string literalthat represents the JavaScript code.To learn more about the different types of quoted string literals that youcan use, seeFormats for quoted literals.

  • Only certain type encodings are allowed. For more information,seePermitted SQL type encodings in a JavaScript UDAF.

  • The JavaScript function body must include four JavaScript functions thatinitialize, aggregate, merge, and finalize the results for theJavaScript UDAF (initialState,aggregate,merge, andfinalize). Formore information, seePermitted SQL type encodings in a JavaScript UDAF.

  • Any value returned by theinitialState function or that is left in thestate argument after theaggregate ormerge function is called,must be serializable. If you want to work withnon-serializable aggregation data, such as functions or symbol fields, youmust use the includedserialize anddeserialize functions. To learnmore, seeSerialize and deserialize data in a JavaScript UDAF.

Create a persistent JavaScript UDAF

You can create a JavaScript UDAF that is persistent, meaning that you can reusethe UDAF across multiple queries. Persistent UDAFs are safeto call when they are shared between owners. UDAFs can't mutate data, talk toexternal systems, or send logs to Google Cloud Observability or similar applications.

To create a persistent UDAF, use theCREATE AGGREGATE FUNCTION statement without theTEMP orTEMPORARY keyword. You must include the dataset in thefunction path.

The following query creates a persistent JavaScript UDAF that's calledSumPositive:

CREATEORREPLACEAGGREGATEFUNCTIONmy_project.my_dataset.SumPositive(xFLOAT64)RETURNSFLOAT64LANGUAGEjsASr'''  export function initialState() {    return {sum: 0}  }  export function aggregate(state, x) {    if (x > 0) {      state.sum += x;    }  }  export function merge(state, partialState) {    state.sum += partialState.sum;  }  export function finalize(state) {    return state.sum;  }''';-- Call the JavaScript UDAF.WITHnumbersAS(SELECT*FROMUNNEST([1.0,-1.0,3.0,-3.0,5.0,-5.0])ASx)SELECTmy_project.my_dataset.SumPositive(x)ASsumFROMnumbers;/*-----* | sum | +-----+ | 9.0 | *-----*/

Create a temporary JavaScript UDAF

You can create a JavaScript UDAF that is temporary, meaning that the UDAF onlyexists in the scope of a single query, script, session, or procedure.

To create a temporary UDAF, use theCREATE AGGREGATE FUNCTION statement with theTEMPorTEMPORARY keyword.

The following query creates a temporary JavaScript UDAF that's calledSumPositive:

CREATETEMPAGGREGATEFUNCTIONSumPositive(xFLOAT64)RETURNSFLOAT64LANGUAGEjsASr'''  export function initialState() {    return {sum: 0}  }  export function aggregate(state, x) {    if (x > 0) {      state.sum += x;    }  }  export function merge(state, partialState) {    state.sum += partialState.sum;  }  export function finalize(state) {    return state.sum;  }''';-- Call the JavaScript UDAF.WITHnumbersAS(SELECT*FROMUNNEST([1.0,-1.0,3.0,-3.0,5.0,-5.0])ASx)SELECTSumPositive(x)ASsumFROMnumbers;/*-----* | sum | +-----+ | 9.0 | *-----*/

Include non-aggregate parameters in a JavaScript UDAF

You can create a JavaScript UDAF that has both aggregate andnon-aggregate parameters.

UDAFs normally aggregate function parameters across all rows in agroup.However, you can specify a function parameter as non-aggregate with theNOT AGGREGATE keyword.

A non-aggregate function parameter is a scalar function parameter with aconstant value for all rows in a group. A valid non-aggregate function parametermust be a literal. Inside the UDAF definition, aggregate function parameters canonly appear as function arguments to aggregate function calls. References tonon-aggregate function parameters can appear anywhere in the UDAF definition.

In the following example, the JavaScript UDAF contains an aggregateparameter calleds and a non-aggregate parameter calleddelimiter:

CREATETEMPAGGREGATEFUNCTIONJsStringAgg(sSTRING,delimiterSTRINGNOTAGGREGATE)RETURNSSTRINGLANGUAGEjsASr'''  export function initialState() {    return {strings: []}  }  export function aggregate(state, s) {    state.strings.push(s);  }  export function merge(state, partialState) {    state.strings = state.strings.concat(partialState.strings);  }  export function finalize(state, delimiter) {    return state.strings.join(delimiter);  }''';-- Call the JavaScript UDAF.WITHstringsAS(SELECT*FROMUNNEST(["aaa","bbb","ccc","ddd"])ASvalues)SELECTJsStringAgg(values,'.')ASresultFROMstrings;/*-----------------* | result          | +-----------------+ | aaa.bbb.ccc.ddd | *-----------------*/

Serialize and deserialize data in a JavaScript UDAF

BigQuery must serializeany object returned by theinitialState function or that is left in thestate argument after theaggregate ormerge function is called.BigQuery supportsserializing an object if all fields are one of the following:

  • A JavaScript primitive value (for example:2,"abc",null,undefined).
  • A JavaScript object for which BigQuery supports serializing allfield values.
  • A JavaScript array for which BigQuery supports serializing allelements.

The following return values are serializable:

exportfunctioninitialState(){return{a:"",b:3,c:null,d:{x:23}}}
exportfunctioninitialState(){return{value:2.3};}

The following return values are not serializable:

exportfunctioninitialState(){return{value:function(){return6;}}}
exportfunctioninitialState(){return2.3;}

If you want to work with non-serializable aggregation states,the JavaScript UDAF must include theserialize anddeserialize functions.Theserialize function converts the aggregation state into a serializableobject; thedeserialize function converts the serializable object back intoan aggregation state.

In the following example, an external library calculates sums by using aninterface:

exportclassSumAggregator{constructor(){this.sum=0;}update(value){this.sum+=value;}getSum(){returnthis.sum;}}

The following query doesn't execute because theSumAggregator class object isnot BigQuery-serializable, due to the presence of functionsinside of the class.

CREATETEMPAGGREGATEFUNCTIONF(xFLOAT64)RETURNSFLOAT64LANGUAGEjsASr'''  class SumAggregator {   constructor() {     this.sum = 0;   }   update(value) {     this.sum += value;   }   getSum() {     return this.sum;   }  }  export function initialState() {   return new SumAggregator();  }  export function aggregate(agg, value) {   agg.update(value);  }  export function merge(agg1, agg2) {   agg1.update(agg2.getSum());  }  export function finalize(agg) {   return agg.getSum();  }''';--Error:getSumisnotafunctionSELECTF(x)ASresultsFROMUNNEST([1,2,3,4])ASx;

If you add theserialize anddeserialize functions to the preceding query,the query runs because theSumAggregator class object is converted to anobject that is BigQuery-serializable and then back to aSumAggregator class object again.

CREATETEMPAGGREGATEFUNCTIONF(xFLOAT64)RETURNSFLOAT64LANGUAGEjsASr'''  class SumAggregator {   constructor() {     this.sum = 0;   }   update(value) {     this.sum += value;   }   getSum() {     return this.sum;   }  }  export function initialState() {   return new SumAggregator();  }  export function aggregate(agg, value) {   agg.update(value);  }  export function merge(agg1, agg2) {   agg1.update(agg2.getSum());  }  export function finalize(agg) {   return agg.getSum();  }  export function serialize(agg) {   return {sum: agg.getSum()};  }  export function deserialize(serialized) {   var agg = new SumAggregator();   agg.update(serialized.sum);   return agg;  }''';SELECTF(x)ASresultsFROMUNNEST([1,2,3,4])ASx;/*-----------------* | results         | +-----------------+ | 10.0            | *-----------------*/

To learn more about the serialization functions, seeOptional JavaScript serialization functions.

Include global variables and custom functions in a JavaScript UDAF

The JavaScript function body can include custom JavaScript code suchas JavaScript global variables and custom functions.

Global variables are executed when the JavaScript is loadedinto BigQuery and before theinitialState function is executed.Global variables might be useful if you need to perform one-time initializationwork that shouldn't repeat for each aggregation group, as would be the case withtheinitialState,aggregate,merge, andfinalize functions.

Don't use global variables to store aggregation state.Instead, limit aggregation state to objects passed to exported functions. Onlyuse global variables to cache expensive operations that are not specific to anyparticular aggregation operation.

In the following query, theSumOfPrimes function calculates a sum, but onlyprime numbers are included in the calculation. In the JavaScript function body,there are two global variables,primes andmaxTested, that are initializedfirst. In addition, there is a custom function calledisPrime that checks ifa number is prime.

CREATETEMPAGGREGATEFUNCTIONSumOfPrimes(xINT64)RETURNSINT64LANGUAGEjsASr'''  var primes = new Set([2]);  var maxTested = 2;  function isPrime(n) {    if (primes.has(n)) {      return true;    }    if (n <= maxTested) {      return false;    }    for (var k = 2; k< n; ++k) {      if (!isPrime(k)) {        continue;      }      if ((n % k) == 0) {        maxTested = n;        return false;      }    }    maxTested = n;    primes.add(n);    return true;  }  export function initialState() {    return {sum: 0};  }  export function aggregate(state, x) {    x = Number(x);    if (isPrime(x)) {      state.sum += x;    }  }  export function merge(state, partialState) {    state.sum += partialState.sum;  }  export function finalize(state) {    return state.sum;  }''';-- Call the JavaScript UDAF.WITHnumbersAS(SELECT*FROMUNNEST([10,11,13,17,19,20])ASx)SELECTSumOfPrimes(x)ASsumFROMnumbers;/*-----* | sum | +-----+ | 60  | *-----*/

Include JavaScript libraries

You can extend your JavaScript UDAFs with thelibrary option in theOPTIONS clause. This option lets you specify external code libraries for theJavaScript UDAF, and then import those libraries with theimport declaration.

In the following example, code inbar.js is available to any code in thefunction body of the JavaScript UDAF:

CREATETEMPAGGREGATEFUNCTIONJsAggFn(xFLOAT64)RETURNSFLOAT64LANGUAGEjsOPTIONS(library=['gs://foo/bar.js'])ASr'''  import doInterestingStuff from 'bar.js';  export function initialState() {    return ...  }  export function aggregate(state, x) {    var result = doInterestingStuff(x);    ...  }  export function merge(state, partial_state) {    ...  }  export function finalize(state) {    return ...;  }''';

Required JavaScript structure

Unlike a JavaScript UDF, where the function body is free-form JavaScript thatruns for every row, the function body for a JavaScript UDAF is aJavaScript module that contains some built-in exported functions, which areinvoked at various stages in the aggregation process. Some of thesebuilt-in functions are required, while others are optional. You can also addyour JavaScript functions.

Required JavaScript aggregation functions

You can include your JavaScript functions, but the JavaScript function bodymust include the following exportable JavaScript functions:

  • initialState([nonAggregateParam]): returns a JavaScript object thatrepresents an aggregation state in which no rows have been aggregated yet.

  • aggregate(state, aggregateParam[, ...][, nonAggregateParam]): aggregates onerow of data, updating state to store the result of the aggregation. Does notreturn a value.

  • merge(state, partialState, [nonAggregateParam]): merges aggregationstatepartialState into aggregation statestate. This function is usedwhen the engine aggregates different sections of data in parallel and needsto combine the results. Does not return a value.

  • finalize(finalState, [nonAggregateParam]): returns the final resultof the aggregate function, given a final aggregation statefinalState.

To learn more about the required functions, seeRequired functions in a JavaScript UDAF.

Optional JavaScript serialization functions

If you want to work with non-serializable aggregation states, theJavaScript UDAF must provide theserialize anddeserialize functions.Theserialize function converts the aggregation state to aBigQuery-serializable object; thedeserialize function converts theBigQuery-serializable object back into an aggregation state.

  • serialize(state): returns a serializable object that contains theinformation in aggregation state, to be deserialized through thedeserialize function.

  • deserialize(serializedState): deserializesserializedState (previouslyserialized by theserialize function) into an aggregation state that canbe passed into theserialize,aggregate,merge, orfinalizefunctions.

To learn more about the built-in JavaScript serialization functions, seeSerialization functions for a JavaScript UDAF.

To learn how to serialize and deserialize data with a JavaScript UDAF, seeSerialize and deserialize data in a JavaScript UDAF.

Permitted SQL type encodings in a JavaScript UDAF

In JavaScript UDAFs, the following supportedGoogleSQL data typesrepresentJavaScript data typesas follows:

GoogleSQL
data type
JavaScript
data type
Notes
ARRAYArray An array of arrays is not supported. To get around this limitation, use theArray<Object<Array>> (JavaScript) andARRAY<STRUCT<ARRAY>> (GoogleSQL) data types.
BIGNUMERICNumber orString Same asNUMERIC.
BOOLBoolean
BYTESUint8Array
DATEDate
FLOAT64Number
INT64BigInt
JSONVarious types The GoogleSQLJSON data type can be converted into a JavaScriptObject,Array, or other GoogleSQL-supported JavaScript data type.
NUMERICNumber orString If aNUMERIC value can be represented exactly as anIEEE 754 floating-point value (range[-253, 253]), and has no fractional part, it is encoded as aNumber data type, otherwise it is encoded as aString data type.
STRINGString
STRUCTObject EachSTRUCT field is a named property in theObject data type. An unnamedSTRUCT field is not supported.
TIMESTAMPDateDate contains a microsecond field with the microsecond fraction ofTIMESTAMP.
Note: The SQL encodings for JavaScript UDAFs are different from those forJavaScript UDFs.

Call a UDAF

This section describes the various ways that you can call a persistent ortemporary UDAF after you create it in BigQuery.

Note: JavaScript UDAF calls don't support theORDER BY clause.

Call a persistent UDAF

You can call a persistent UDAF in the same way that you call abuilt-in aggregate function. For more information, seeAggregate function calls.You must include the dataset in the function path.

In the following example, the query calls a persistent UDAF that'scalledWeightedAverage:

SELECTmy_project.my_dataset.WeightedAverage(item,weight,2)ASweighted_averageFROM(SELECT1ASitem,2.45ASweightUNIONALLSELECT3ASitem,0.11ASweightUNIONALLSELECT5ASitem,7.02ASweight);

A table with the following results is produced:

/*------------------* | weighted_average | +------------------+ | 4.5              | *------------------*/

Call a temporary UDAF

You can call a temporary UDAF in the same way that you call abuilt-in aggregate function. For more information, seeAggregate function calls.

The temporary function must be included in amulti-statement query orprocedure that contains the UDAFfunction call.

In the following example, the query calls a temporary UDAF that'scalledWeightedAverage:

CREATETEMPAGGREGATEFUNCTIONWeightedAverage(...)-- Temporary UDAF function callSELECTWeightedAverage(item,weight,2)ASweighted_averageFROM(SELECT1ASitem,2.45ASweightUNIONALLSELECT3ASitem,0.11ASweightUNIONALLSELECT5ASitem,7.02ASweight);

A table with the following results is produced:

/*------------------* | weighted_average | +------------------+ | 4.5              | *------------------*/

Ignore or include rows withNULL values

When a JavaScript UDAF is called with theIGNORE NULLS argument,BigQuery automatically skips over rows for which any aggregateargument evaluates toNULL. Such rows are excluded from the aggregationcompletely and are not passed to the JavaScriptaggregate function. WhenRESPECT NULLS argument is provided,NULL filtration is disabled, and everyrow is passed to the JavaScript UDAF, regardless ofNULL values.

When neither theIGNORE NULLS norRESPECT NULLS argument is provided, thedefault argument isIGNORE NULLS.

The following example illustrates the defaultNULL behavior,theIGNORE NULLS behavior, and theRESPECT NULLS behavior:

CREATETEMPAGGREGATEFUNCTIONSumPositive(xFLOAT64)RETURNSFLOAT64LANGUAGEjsASr'''  export function initialState() {    return {sum: 0}  }  export function aggregate(state, x) {    if (x == null) {      // Use 1000 instead of 0 as placeholder for null so      // that NULL values passed are visible in the result.      state.sum += 1000;      return;    }    if (x > 0) {      state.sum += x;    }  }  export function merge(state, partialState) {    state.sum += partialState.sum;  }  export function finalize(state) {    return state.sum;  }''';-- Call the JavaScript UDAF.WITHnumbersAS(SELECT*FROMUNNEST([1.0,2.0,NULL])ASx)SELECTSumPositive(x)ASsum,SumPositive(xIGNORENULLS)ASsum_ignore_nulls,SumPositive(xRESPECTNULLS)ASsum_respect_nullsFROMnumbers;/*-----+------------------+-------------------* | sum | sum_ignore_nulls | sum_respect_nulls | +-----+------------------+-------------------+ | 3.0 | 3.0              | 1003.0            | *-----+------------------+-------------------*/

Delete a UDAF

This section describes the various ways that you can delete a persistent ortemporary UDAF after you created it in BigQuery.

Delete a persistent UDAF

To delete a persistent UDAF, use theDROP FUNCTION statement.You must include the dataset in the function path.

In the following example, the query deletes a persistent UDAF that'scalledWeightedAverage:

DROPFUNCTIONIFEXISTSmy_project.my_dataset.WeightedAverage;

Delete a temporary UDAF

To delete a temporary UDAF, use theDROP FUNCTION statement.

In the following example, the query deletes a temporary UDAF that'scalledWeightedAverage:

DROPFUNCTIONIFEXISTSWeightedAverage;

A temporary UDAF expires as soon as the query finishes. The UDAFdoesn't need to be deleted unless you want to remove it early from amulti-statement query orprocedure.

List UDAFs

UDAFs are a type of routine. To list all of the routines in a dataset,seeList routines.

Performance tips

If you want to improve the performance of your queries, consider the following:

  • Prefilter your input. Processing data in JavaScript is more expensive thanin SQL, so it is best to filter the input as much as possible in SQL first.

    The following query is less efficient because it filters the input by usingx > 0 in the UDAF call:

    SELECTJsFunc(x)FROMt;

    The following query is more efficient because it prefilters the inputby usingWHERE x > 0 before the UDAF is called:

    SELECTJsFunc(x)FROMtWHEREx >0;
  • Use built-in aggregate functions instead of JavaScript when possible.Re-implementing a built-in aggregate function in JavaScript is slower thancalling a built-in aggregate function that does the same thing.

    The following query is less efficient because it implements a UDAF:

    SELECTSumSquare(x)FROMt;

    The following query is more efficient because it implements a built-infunction that produces the same results as the previous query:

    SELECTSUM(x*x)FROMt;
  • JavaScript UDAFs are appropriate for more complex aggregation operations,which can't be expressed through built-in functions.

  • Use memory efficiently. The JavaScript processing environment has limitedmemory available for each query. JavaScript UDAF queries that accumulate toomuch local state might fail due to memory exhaustion. Be especially mindfulabout minimizing the size of aggregation state objects and avoid aggregationstates which accumulate large numbers of rows.

    The following query is not efficient because theaggregate functionuses an unbounded amount of memory when the number of rows processedgets large.

    exportfunctioninitialState(){return{rows:[]};}exportfunctionaggregate(state,x){state.rows.push(x);}...
  • Use partitioned tables when possible. JavaScript UDAFs typically run moreefficiently when querying against a partitioned table compared to anon-partitioned table, because a partitioned table stores data in manysmaller files compared to a non-partitioned table, thus enabling higherparallelism.

Limitations

  • UDAFs have the same limitations that apply to UDFs. For details, seeUDF limitations.

  • Only literals, query parameters, and script variables can be passed in asnon-aggregate arguments for a UDAF.

  • The use of theORDER BY clause in a JavaScript UDAF function call isunsupported.

    SELECTMyUdaf(xORDERBYy)FROMt;-- Error: ORDER BY is unsupported.

Pricing

UDAFs are billed using the standardBigQuery pricing model.

Quotas and limits

UDAFs have the same quotas and limits that apply to UDFs. Forinformation about UDF quotas, seeQuotas and limits.

Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-15 UTC.