User-defined aggregate functions
This document describes how to create, call, and deleteuser-defined aggregate functions (UDAFs) in BigQuery.
A UDAF lets you create an aggregate function by using an expression thatcontains code. A UDAF accepts columns of input, performs a calculation on agroup of rows at a time, and then returns the result of that calculation as asingle value.
Create a SQL UDAF
This section describes the various ways that you can create a persistent ortemporary SQL UDAF in BigQuery.
Create a persistent SQL UDAF
You can create a SQL UDAF that is persistent, meaning that you can reusethe UDAF across multiple queries. Persistent UDAFs are safeto call when they are shared between owners. UDAFs can't mutate data, talk toexternal systems, or send logs to Google Cloud Observability or similar applications.
To create a persistent UDAF, use theCREATE AGGREGATE FUNCTION statementwithout theTEMP orTEMPORARY keyword. You must include the dataset in thefunction path.
For example, the following query creates a persistent UDAF that's calledScaledAverage:
CREATEAGGREGATEFUNCTIONmyproject.mydataset.ScaledAverage(dividendFLOAT64,divisorFLOAT64)RETURNSFLOAT64AS(AVG(dividend/divisor));
Create a temporary SQL UDAF
You can create a SQL UDAF that is temporary, meaning that the UDAF onlyexists in the scope of a single query, script, session, or procedure.
To create a temporary UDAF, use theCREATE AGGREGATE FUNCTION statement with theTEMPorTEMPORARY keyword.
For example, the following query creates a temporary UDAF that's calledScaledAverage:
CREATETEMPAGGREGATEFUNCTIONScaledAverage(dividendFLOAT64,divisorFLOAT64)RETURNSFLOAT64AS(AVG(dividend/divisor));
Use aggregate and non-aggregate parameters
You can create a SQL UDAF that has both aggregate and non-aggregate parameters.
UDAFs normally aggregate function parameters across all rows in agroup.However, you can specify a function parameter as non-aggregate with theNOT AGGREGATE keyword.
A non-aggregate function parameter is a scalar function parameter with aconstant value for all rows in a group. A valid non-aggregate function parametermust be a literal. Inside the UDAF definition, aggregate function parameters canonly appear as function arguments to aggregate function calls. References tonon-aggregate function parameters can appear anywhere in the UDAF definition.
For example, the following function contains an aggregate parameter that'scalleddividend, and a non-aggregate parameter calleddivisor:
-- Create the function.CREATETEMPAGGREGATEFUNCTIONScaledSum(dividendFLOAT64,divisorFLOAT64NOTAGGREGATE)RETURNSFLOAT64AS(SUM(dividend)/divisor);
Use the default project in the function body
In the body of a SQL UDAF, any references to BigQuery entities,such as tables or views, must include the project ID unless the entity residesin the same project that contains the UDAF.
For example, consider the following statement:
CREATEAGGREGATEFUNCTIONproject1.dataset_a.ScaledAverage(dividendFLOAT64,divisorFLOAT64)RETURNSFLOAT64AS((SELECTAVG(dividend/divisor)FROMdataset_a.my_table));
If you run the preceding statement in theproject1 project, the statementsucceeds becausemy_table exists inproject1. However, if you runthe preceding statement from a different project, the statement fails.To correct the error, include the project ID in the table reference:
CREATEAGGREGATEFUNCTIONproject1.dataset_a.ScaledAverage(dividendFLOAT64,divisorFLOAT64)RETURNSFLOAT64AS((SELECTAVG(dividend/divisor)FROMproject1.dataset_a.my_table));
You can also reference an entity in a different project or dataset from theone where you create the function:
CREATEAGGREGATEFUNCTIONproject1.dataset_a.ScaledAverage(dividendFLOAT64,divisorFLOAT64)RETURNSFLOAT64AS((SELECTAVG(dividend/divisor)FROMproject2.dataset_c.my_table));
Create a JavaScript UDAF
This section describes the various ways in which you can create aJavaScript UDAF in BigQuery. There are a few rules to observewhen creating a JavaScript UDAF:
The body of the JavaScript UDAF must be a quoted string literalthat represents the JavaScript code.To learn more about the different types of quoted string literals that youcan use, seeFormats for quoted literals.
Only certain type encodings are allowed. For more information,seePermitted SQL type encodings in a JavaScript UDAF.
The JavaScript function body must include four JavaScript functions thatinitialize, aggregate, merge, and finalize the results for theJavaScript UDAF (
initialState,aggregate,merge, andfinalize). Formore information, seePermitted SQL type encodings in a JavaScript UDAF.Any value returned by the
initialStatefunction or that is left in thestateargument after theaggregateormergefunction is called,must be serializable. If you want to work withnon-serializable aggregation data, such as functions or symbol fields, youmust use the includedserializeanddeserializefunctions. To learnmore, seeSerialize and deserialize data in a JavaScript UDAF.
Create a persistent JavaScript UDAF
You can create a JavaScript UDAF that is persistent, meaning that you can reusethe UDAF across multiple queries. Persistent UDAFs are safeto call when they are shared between owners. UDAFs can't mutate data, talk toexternal systems, or send logs to Google Cloud Observability or similar applications.
To create a persistent UDAF, use theCREATE AGGREGATE FUNCTION statement without theTEMP orTEMPORARY keyword. You must include the dataset in thefunction path.
The following query creates a persistent JavaScript UDAF that's calledSumPositive:
CREATEORREPLACEAGGREGATEFUNCTIONmy_project.my_dataset.SumPositive(xFLOAT64)RETURNSFLOAT64LANGUAGEjsASr''' export function initialState() { return {sum: 0} } export function aggregate(state, x) { if (x > 0) { state.sum += x; } } export function merge(state, partialState) { state.sum += partialState.sum; } export function finalize(state) { return state.sum; }''';-- Call the JavaScript UDAF.WITHnumbersAS(SELECT*FROMUNNEST([1.0,-1.0,3.0,-3.0,5.0,-5.0])ASx)SELECTmy_project.my_dataset.SumPositive(x)ASsumFROMnumbers;/*-----* | sum | +-----+ | 9.0 | *-----*/
Create a temporary JavaScript UDAF
You can create a JavaScript UDAF that is temporary, meaning that the UDAF onlyexists in the scope of a single query, script, session, or procedure.
To create a temporary UDAF, use theCREATE AGGREGATE FUNCTION statement with theTEMPorTEMPORARY keyword.
The following query creates a temporary JavaScript UDAF that's calledSumPositive:
CREATETEMPAGGREGATEFUNCTIONSumPositive(xFLOAT64)RETURNSFLOAT64LANGUAGEjsASr''' export function initialState() { return {sum: 0} } export function aggregate(state, x) { if (x > 0) { state.sum += x; } } export function merge(state, partialState) { state.sum += partialState.sum; } export function finalize(state) { return state.sum; }''';-- Call the JavaScript UDAF.WITHnumbersAS(SELECT*FROMUNNEST([1.0,-1.0,3.0,-3.0,5.0,-5.0])ASx)SELECTSumPositive(x)ASsumFROMnumbers;/*-----* | sum | +-----+ | 9.0 | *-----*/
Include non-aggregate parameters in a JavaScript UDAF
You can create a JavaScript UDAF that has both aggregate andnon-aggregate parameters.
UDAFs normally aggregate function parameters across all rows in agroup.However, you can specify a function parameter as non-aggregate with theNOT AGGREGATE keyword.
A non-aggregate function parameter is a scalar function parameter with aconstant value for all rows in a group. A valid non-aggregate function parametermust be a literal. Inside the UDAF definition, aggregate function parameters canonly appear as function arguments to aggregate function calls. References tonon-aggregate function parameters can appear anywhere in the UDAF definition.
In the following example, the JavaScript UDAF contains an aggregateparameter calleds and a non-aggregate parameter calleddelimiter:
CREATETEMPAGGREGATEFUNCTIONJsStringAgg(sSTRING,delimiterSTRINGNOTAGGREGATE)RETURNSSTRINGLANGUAGEjsASr''' export function initialState() { return {strings: []} } export function aggregate(state, s) { state.strings.push(s); } export function merge(state, partialState) { state.strings = state.strings.concat(partialState.strings); } export function finalize(state, delimiter) { return state.strings.join(delimiter); }''';-- Call the JavaScript UDAF.WITHstringsAS(SELECT*FROMUNNEST(["aaa","bbb","ccc","ddd"])ASvalues)SELECTJsStringAgg(values,'.')ASresultFROMstrings;/*-----------------* | result | +-----------------+ | aaa.bbb.ccc.ddd | *-----------------*/
Serialize and deserialize data in a JavaScript UDAF
BigQuery must serializeany object returned by theinitialState function or that is left in thestate argument after theaggregate ormerge function is called.BigQuery supportsserializing an object if all fields are one of the following:
- A JavaScript primitive value (for example:
2,"abc",null,undefined). - A JavaScript object for which BigQuery supports serializing allfield values.
- A JavaScript array for which BigQuery supports serializing allelements.
The following return values are serializable:
exportfunctioninitialState(){return{a:"",b:3,c:null,d:{x:23}}}exportfunctioninitialState(){return{value:2.3};}The following return values are not serializable:
exportfunctioninitialState(){return{value:function(){return6;}}}exportfunctioninitialState(){return2.3;}If you want to work with non-serializable aggregation states,the JavaScript UDAF must include theserialize anddeserialize functions.Theserialize function converts the aggregation state into a serializableobject; thedeserialize function converts the serializable object back intoan aggregation state.
In the following example, an external library calculates sums by using aninterface:
exportclassSumAggregator{constructor(){this.sum=0;}update(value){this.sum+=value;}getSum(){returnthis.sum;}}
The following query doesn't execute because theSumAggregator class object isnot BigQuery-serializable, due to the presence of functionsinside of the class.
CREATETEMPAGGREGATEFUNCTIONF(xFLOAT64)RETURNSFLOAT64LANGUAGEjsASr''' class SumAggregator { constructor() { this.sum = 0; } update(value) { this.sum += value; } getSum() { return this.sum; } } export function initialState() { return new SumAggregator(); } export function aggregate(agg, value) { agg.update(value); } export function merge(agg1, agg2) { agg1.update(agg2.getSum()); } export function finalize(agg) { return agg.getSum(); }''';--Error:getSumisnotafunctionSELECTF(x)ASresultsFROMUNNEST([1,2,3,4])ASx;
If you add theserialize anddeserialize functions to the preceding query,the query runs because theSumAggregator class object is converted to anobject that is BigQuery-serializable and then back to aSumAggregator class object again.
CREATETEMPAGGREGATEFUNCTIONF(xFLOAT64)RETURNSFLOAT64LANGUAGEjsASr''' class SumAggregator { constructor() { this.sum = 0; } update(value) { this.sum += value; } getSum() { return this.sum; } } export function initialState() { return new SumAggregator(); } export function aggregate(agg, value) { agg.update(value); } export function merge(agg1, agg2) { agg1.update(agg2.getSum()); } export function finalize(agg) { return agg.getSum(); } export function serialize(agg) { return {sum: agg.getSum()}; } export function deserialize(serialized) { var agg = new SumAggregator(); agg.update(serialized.sum); return agg; }''';SELECTF(x)ASresultsFROMUNNEST([1,2,3,4])ASx;/*-----------------* | results | +-----------------+ | 10.0 | *-----------------*/
To learn more about the serialization functions, seeOptional JavaScript serialization functions.
Include global variables and custom functions in a JavaScript UDAF
The JavaScript function body can include custom JavaScript code suchas JavaScript global variables and custom functions.
Global variables are executed when the JavaScript is loadedinto BigQuery and before theinitialState function is executed.Global variables might be useful if you need to perform one-time initializationwork that shouldn't repeat for each aggregation group, as would be the case withtheinitialState,aggregate,merge, andfinalize functions.
Don't use global variables to store aggregation state.Instead, limit aggregation state to objects passed to exported functions. Onlyuse global variables to cache expensive operations that are not specific to anyparticular aggregation operation.
In the following query, theSumOfPrimes function calculates a sum, but onlyprime numbers are included in the calculation. In the JavaScript function body,there are two global variables,primes andmaxTested, that are initializedfirst. In addition, there is a custom function calledisPrime that checks ifa number is prime.
CREATETEMPAGGREGATEFUNCTIONSumOfPrimes(xINT64)RETURNSINT64LANGUAGEjsASr''' var primes = new Set([2]); var maxTested = 2; function isPrime(n) { if (primes.has(n)) { return true; } if (n <= maxTested) { return false; } for (var k = 2; k< n; ++k) { if (!isPrime(k)) { continue; } if ((n % k) == 0) { maxTested = n; return false; } } maxTested = n; primes.add(n); return true; } export function initialState() { return {sum: 0}; } export function aggregate(state, x) { x = Number(x); if (isPrime(x)) { state.sum += x; } } export function merge(state, partialState) { state.sum += partialState.sum; } export function finalize(state) { return state.sum; }''';-- Call the JavaScript UDAF.WITHnumbersAS(SELECT*FROMUNNEST([10,11,13,17,19,20])ASx)SELECTSumOfPrimes(x)ASsumFROMnumbers;/*-----* | sum | +-----+ | 60 | *-----*/
Include JavaScript libraries
You can extend your JavaScript UDAFs with thelibrary option in theOPTIONS clause. This option lets you specify external code libraries for theJavaScript UDAF, and then import those libraries with theimport declaration.
In the following example, code inbar.js is available to any code in thefunction body of the JavaScript UDAF:
CREATETEMPAGGREGATEFUNCTIONJsAggFn(xFLOAT64)RETURNSFLOAT64LANGUAGEjsOPTIONS(library=['gs://foo/bar.js'])ASr''' import doInterestingStuff from 'bar.js'; export function initialState() { return ... } export function aggregate(state, x) { var result = doInterestingStuff(x); ... } export function merge(state, partial_state) { ... } export function finalize(state) { return ...; }''';
Required JavaScript structure
Unlike a JavaScript UDF, where the function body is free-form JavaScript thatruns for every row, the function body for a JavaScript UDAF is aJavaScript module that contains some built-in exported functions, which areinvoked at various stages in the aggregation process. Some of thesebuilt-in functions are required, while others are optional. You can also addyour JavaScript functions.
Required JavaScript aggregation functions
You can include your JavaScript functions, but the JavaScript function bodymust include the following exportable JavaScript functions:
initialState([nonAggregateParam]): returns a JavaScript object thatrepresents an aggregation state in which no rows have been aggregated yet.aggregate(state, aggregateParam[, ...][, nonAggregateParam]): aggregates onerow of data, updating state to store the result of the aggregation. Does notreturn a value.merge(state, partialState, [nonAggregateParam]): merges aggregationstatepartialStateinto aggregation statestate. This function is usedwhen the engine aggregates different sections of data in parallel and needsto combine the results. Does not return a value.finalize(finalState, [nonAggregateParam]): returns the final resultof the aggregate function, given a final aggregation statefinalState.
To learn more about the required functions, seeRequired functions in a JavaScript UDAF.
Optional JavaScript serialization functions
If you want to work with non-serializable aggregation states, theJavaScript UDAF must provide theserialize anddeserialize functions.Theserialize function converts the aggregation state to aBigQuery-serializable object; thedeserialize function converts theBigQuery-serializable object back into an aggregation state.
serialize(state): returns a serializable object that contains theinformation in aggregation state, to be deserialized through thedeserializefunction.deserialize(serializedState): deserializesserializedState(previouslyserialized by theserializefunction) into an aggregation state that canbe passed into theserialize,aggregate,merge, orfinalizefunctions.
To learn more about the built-in JavaScript serialization functions, seeSerialization functions for a JavaScript UDAF.
To learn how to serialize and deserialize data with a JavaScript UDAF, seeSerialize and deserialize data in a JavaScript UDAF.
Permitted SQL type encodings in a JavaScript UDAF
In JavaScript UDAFs, the following supportedGoogleSQL data typesrepresentJavaScript data typesas follows:
| GoogleSQL data type | JavaScript data type | Notes |
|---|---|---|
ARRAY | Array | An array of arrays is not supported. To get around this limitation, use theArray<Object<Array>> (JavaScript) andARRAY<STRUCT<ARRAY>> (GoogleSQL) data types. |
BIGNUMERIC | Number orString | Same asNUMERIC. |
BOOL | Boolean | |
BYTES | Uint8Array | |
DATE | Date | |
FLOAT64 | Number | |
INT64 | BigInt | |
JSON | Various types | The GoogleSQLJSON data type can be converted into a JavaScriptObject,Array, or other GoogleSQL-supported JavaScript data type. |
NUMERIC | Number orString | If aNUMERIC value can be represented exactly as anIEEE 754 floating-point value (range[-253, 253]), and has no fractional part, it is encoded as aNumber data type, otherwise it is encoded as aString data type. |
STRING | String | |
STRUCT | Object | EachSTRUCT field is a named property in theObject data type. An unnamedSTRUCT field is not supported. |
TIMESTAMP | Date | Date contains a microsecond field with the microsecond fraction ofTIMESTAMP. |
Call a UDAF
This section describes the various ways that you can call a persistent ortemporary UDAF after you create it in BigQuery.
Note: JavaScript UDAF calls don't support theORDER BY clause.Call a persistent UDAF
You can call a persistent UDAF in the same way that you call abuilt-in aggregate function. For more information, seeAggregate function calls.You must include the dataset in the function path.
In the following example, the query calls a persistent UDAF that'scalledWeightedAverage:
SELECTmy_project.my_dataset.WeightedAverage(item,weight,2)ASweighted_averageFROM(SELECT1ASitem,2.45ASweightUNIONALLSELECT3ASitem,0.11ASweightUNIONALLSELECT5ASitem,7.02ASweight);
A table with the following results is produced:
/*------------------* | weighted_average | +------------------+ | 4.5 | *------------------*/Call a temporary UDAF
You can call a temporary UDAF in the same way that you call abuilt-in aggregate function. For more information, seeAggregate function calls.
The temporary function must be included in amulti-statement query orprocedure that contains the UDAFfunction call.
In the following example, the query calls a temporary UDAF that'scalledWeightedAverage:
CREATETEMPAGGREGATEFUNCTIONWeightedAverage(...)-- Temporary UDAF function callSELECTWeightedAverage(item,weight,2)ASweighted_averageFROM(SELECT1ASitem,2.45ASweightUNIONALLSELECT3ASitem,0.11ASweightUNIONALLSELECT5ASitem,7.02ASweight);
A table with the following results is produced:
/*------------------* | weighted_average | +------------------+ | 4.5 | *------------------*/Ignore or include rows withNULL values
When a JavaScript UDAF is called with theIGNORE NULLS argument,BigQuery automatically skips over rows for which any aggregateargument evaluates toNULL. Such rows are excluded from the aggregationcompletely and are not passed to the JavaScriptaggregate function. WhenRESPECT NULLS argument is provided,NULL filtration is disabled, and everyrow is passed to the JavaScript UDAF, regardless ofNULL values.
When neither theIGNORE NULLS norRESPECT NULLS argument is provided, thedefault argument isIGNORE NULLS.
The following example illustrates the defaultNULL behavior,theIGNORE NULLS behavior, and theRESPECT NULLS behavior:
CREATETEMPAGGREGATEFUNCTIONSumPositive(xFLOAT64)RETURNSFLOAT64LANGUAGEjsASr''' export function initialState() { return {sum: 0} } export function aggregate(state, x) { if (x == null) { // Use 1000 instead of 0 as placeholder for null so // that NULL values passed are visible in the result. state.sum += 1000; return; } if (x > 0) { state.sum += x; } } export function merge(state, partialState) { state.sum += partialState.sum; } export function finalize(state) { return state.sum; }''';-- Call the JavaScript UDAF.WITHnumbersAS(SELECT*FROMUNNEST([1.0,2.0,NULL])ASx)SELECTSumPositive(x)ASsum,SumPositive(xIGNORENULLS)ASsum_ignore_nulls,SumPositive(xRESPECTNULLS)ASsum_respect_nullsFROMnumbers;/*-----+------------------+-------------------* | sum | sum_ignore_nulls | sum_respect_nulls | +-----+------------------+-------------------+ | 3.0 | 3.0 | 1003.0 | *-----+------------------+-------------------*/
Delete a UDAF
This section describes the various ways that you can delete a persistent ortemporary UDAF after you created it in BigQuery.
Delete a persistent UDAF
To delete a persistent UDAF, use theDROP FUNCTION statement.You must include the dataset in the function path.
In the following example, the query deletes a persistent UDAF that'scalledWeightedAverage:
DROPFUNCTIONIFEXISTSmy_project.my_dataset.WeightedAverage;
Delete a temporary UDAF
To delete a temporary UDAF, use theDROP FUNCTION statement.
In the following example, the query deletes a temporary UDAF that'scalledWeightedAverage:
DROPFUNCTIONIFEXISTSWeightedAverage;
A temporary UDAF expires as soon as the query finishes. The UDAFdoesn't need to be deleted unless you want to remove it early from amulti-statement query orprocedure.
List UDAFs
UDAFs are a type of routine. To list all of the routines in a dataset,seeList routines.
Performance tips
If you want to improve the performance of your queries, consider the following:
Prefilter your input. Processing data in JavaScript is more expensive thanin SQL, so it is best to filter the input as much as possible in SQL first.
The following query is less efficient because it filters the input by using
x > 0in the UDAF call:SELECTJsFunc(x)FROMt;The following query is more efficient because it prefilters the inputby using
WHERE x > 0before the UDAF is called:SELECTJsFunc(x)FROMtWHEREx >0;Use built-in aggregate functions instead of JavaScript when possible.Re-implementing a built-in aggregate function in JavaScript is slower thancalling a built-in aggregate function that does the same thing.
The following query is less efficient because it implements a UDAF:
SELECTSumSquare(x)FROMt;The following query is more efficient because it implements a built-infunction that produces the same results as the previous query:
SELECTSUM(x*x)FROMt;JavaScript UDAFs are appropriate for more complex aggregation operations,which can't be expressed through built-in functions.
Use memory efficiently. The JavaScript processing environment has limitedmemory available for each query. JavaScript UDAF queries that accumulate toomuch local state might fail due to memory exhaustion. Be especially mindfulabout minimizing the size of aggregation state objects and avoid aggregationstates which accumulate large numbers of rows.
The following query is not efficient because the
aggregatefunctionuses an unbounded amount of memory when the number of rows processedgets large.exportfunctioninitialState(){return{rows:[]};}exportfunctionaggregate(state,x){state.rows.push(x);}...Use partitioned tables when possible. JavaScript UDAFs typically run moreefficiently when querying against a partitioned table compared to anon-partitioned table, because a partitioned table stores data in manysmaller files compared to a non-partitioned table, thus enabling higherparallelism.
Limitations
UDAFs have the same limitations that apply to UDFs. For details, seeUDF limitations.
Only literals, query parameters, and script variables can be passed in asnon-aggregate arguments for a UDAF.
The use of the
ORDER BYclause in a JavaScript UDAF function call isunsupported.SELECTMyUdaf(xORDERBYy)FROMt;-- Error: ORDER BY is unsupported.
Pricing
UDAFs are billed using the standardBigQuery pricing model.
Quotas and limits
UDAFs have the same quotas and limits that apply to UDFs. Forinformation about UDF quotas, seeQuotas and limits.
Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-12-15 UTC.