HyperLogLog++ functions

TheHyperLogLog++ algorithm (HLL++) estimatescardinality from sketches.

HLL++ functions are approximate aggregate functions.Approximate aggregation typically requires lessmemory than exact aggregation functions,likeCOUNT(DISTINCT), but also introduces statistical error.This makes HLL++ functions appropriate for large data streams forwhich linear memory usage is impractical, as well as for data that isalready approximate.

A data sketch is a compact summary of a data aggregation. If you don't needmaterialized sketches, you can alternatively use anapproximate aggregate function with system-defined precision,such asAPPROX_COUNT_DISTINCT. However,APPROX_COUNT_DISTINCT doesn't allow partial aggregations, re-aggregations,and custom precision.

GoogleSQL for BigQuery supports the following HLL++ functions.

Function list

NameSummary
HLL_COUNT.EXTRACT Extracts a cardinality estimate of an HLL++ sketch.
HLL_COUNT.INIT Aggregates values of the same underlying type into a new HLL++ sketch.
HLL_COUNT.MERGE Merges HLL++ sketches of the same underlying type into a new sketch, and then gets the cardinality of the new sketch.
HLL_COUNT.MERGE_PARTIAL Merges HLL++ sketches of the same underlying type into a new sketch.

HLL_COUNT.EXTRACT

HLL_COUNT.EXTRACT(sketch)

Description

A scalar function that extracts a cardinality estimate of a singleHLL++ sketch.

Ifsketch isNULL, this function returns a cardinality estimate of0.

Supported input types

BYTES

Return type

INT64

Example

The following query returns the number of distinct users for each country whohave at least one invoice.

SELECTcountry,HLL_COUNT.EXTRACT(HLL_sketch)ASdistinct_customers_with_open_invoiceFROM(SELECTcountry,HLL_COUNT.INIT(customer_id)AShll_sketchFROMUNNEST(ARRAY<STRUCT<countrySTRING,customer_idSTRING,invoice_idSTRING>>[('UA','customer_id_1','invoice_id_11'),('BR','customer_id_3','invoice_id_31'),('CZ','customer_id_2','invoice_id_22'),('CZ','customer_id_2','invoice_id_23'),('BR','customer_id_3','invoice_id_31'),('UA','customer_id_2','invoice_id_24')])GROUPBYcountry);/*---------+--------------------------------------+ | country | distinct_customers_with_open_invoice | +---------+--------------------------------------+ | UA      |                                    2 | | BR      |                                    1 | | CZ      |                                    1 | +---------+--------------------------------------*/

HLL_COUNT.INIT

HLL_COUNT.INIT(input[,precision])

Description

An aggregate function that takes one or moreinput values and aggregates theminto aHLL++ sketch. Each sketchis represented using theBYTES data type. You can then merge sketches usingHLL_COUNT.MERGE orHLL_COUNT.MERGE_PARTIAL. If no merging is needed,you can extract the final count of distinct values from the sketch usingHLL_COUNT.EXTRACT.

This function supports an optional parameter,precision. This parameterdefines the accuracy of the estimate at the cost of additional memory requiredto process the sketches or store them on disk. The range for this value is10 to24. The default value is15. For more information about precision,seePrecision for sketches.

If the input isNULL, this function returnsNULL.

For more information, seeHyperLogLog in Practice: Algorithmic Engineering ofa State of The Art Cardinality Estimation Algorithm.

Supported input types

  • INT64
  • NUMERIC
  • BIGNUMERIC
  • STRING
  • BYTES

Return type

BYTES

Example

The following query creates HLL++ sketches that count the number of distinctusers with at least one invoice per country.

SELECTcountry,HLL_COUNT.INIT(customer_id,10)AShll_sketchFROMUNNEST(ARRAY<STRUCT<countrySTRING,customer_idSTRING,invoice_idSTRING>>[('UA','customer_id_1','invoice_id_11'),('CZ','customer_id_2','invoice_id_22'),('CZ','customer_id_2','invoice_id_23'),('BR','customer_id_3','invoice_id_31'),('UA','customer_id_2','invoice_id_24')])GROUPBYcountry;/*---------+------------------------------------------------------------------------------------+ | country | hll_sketch                                                                         | +---------+------------------------------------------------------------------------------------+ | UA      | "\010p\020\002\030\002 \013\202\007\r\020\002\030\n \0172\005\371\344\001\315\010" | | CZ      | "\010p\020\002\030\002 \013\202\007\013\020\001\030\n \0172\003\371\344\001"       | | BR      | "\010p\020\001\030\002 \013\202\007\013\020\001\030\n \0172\003\202\341\001"       | +---------+------------------------------------------------------------------------------------*/

HLL_COUNT.MERGE

HLL_COUNT.MERGE(sketch)

Description

An aggregate function that returns the cardinality of severalHLL++ sketches by computing their union.

Eachsketch must be initialized on the same type. Attempts to merge sketchesfor different types results in an error. For example, you can't merge a sketchinitialized fromINT64 data with one initialized fromSTRING data.

If the merged sketches were initialized with different precisions, the precisionwill be downgraded to the lowest precision involved in the merge.

This function ignoresNULL values when merging sketches. If the merge happensover zero rows or only overNULL values, the function returns0.

Supported input types

BYTES

Return type

INT64

Example

The following query counts the number of distinct users across all countries who have at least one invoice.

SELECTHLL_COUNT.MERGE(hll_sketch)ASdistinct_customers_with_open_invoiceFROM(SELECTcountry,HLL_COUNT.INIT(customer_id)AShll_sketchFROMUNNEST(ARRAY<STRUCT<countrySTRING,customer_idSTRING,invoice_idSTRING>>[('UA','customer_id_1','invoice_id_11'),('BR','customer_id_3','invoice_id_31'),('CZ','customer_id_2','invoice_id_22'),('CZ','customer_id_2','invoice_id_23'),('BR','customer_id_3','invoice_id_31'),('UA','customer_id_2','invoice_id_24')])GROUPBYcountry);/*--------------------------------------+ | distinct_customers_with_open_invoice | +--------------------------------------+ |                                    3 | +--------------------------------------*/

HLL_COUNT.MERGE_PARTIAL

HLL_COUNT.MERGE_PARTIAL(sketch)

Description

An aggregate function that takes one or moreHLL++sketchinputs and merges them into a new sketch.

Eachsketch must be initialized on the same type. Attempts to merge sketchesfor different types results in an error. For example, you can't merge a sketchinitialized fromINT64 data with one initialized fromSTRING data.

If the merged sketches were initialized with different precisions, the precisionwill be downgraded to the lowest precision involved in the merge. For example,ifMERGE_PARTIAL encounters sketches of precision 14 and 15, the returned newsketch will have precision 14.

This function returnsNULL if there is no input or all inputs areNULL.

Supported input types

BYTES

Return type

BYTES

Example

The following query returns an HLL++ sketch that counts the number of distinctusers who have at least one invoice across all countries.

SELECTHLL_COUNT.MERGE_PARTIAL(HLL_sketch)ASdistinct_customers_with_open_invoiceFROM(SELECTcountry,HLL_COUNT.INIT(customer_id)AShll_sketchFROMUNNEST(ARRAY<STRUCT<countrySTRING,customer_idSTRING,invoice_idSTRING>>[('UA','customer_id_1','invoice_id_11'),('BR','customer_id_3','invoice_id_31'),('CZ','customer_id_2','invoice_id_22'),('CZ','customer_id_2','invoice_id_23'),('BR','customer_id_3','invoice_id_31'),('UA','customer_id_2','invoice_id_24')])GROUPBYcountry);/*----------------------------------------------------------------------------------------------+ | distinct_customers_with_open_invoice                                                         | +----------------------------------------------------------------------------------------------+ | "\010p\020\006\030\002 \013\202\007\020\020\003\030\017 \0242\010\320\2408\352}\244\223\002" | +----------------------------------------------------------------------------------------------*/

Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-15 UTC.