The ML.HASH_BUCKETIZE function
This document describes the ML.HASH_BUCKETIZE function, which lets you convert a string expression to a deterministic hash and then bucketize it by the modulo value of that hash.
You can use this function with models that support manual feature preprocessing. For more information, see the following documents:
Syntax
ML.HASH_BUCKETIZE(string_expression, hash_bucket_size)
Arguments
ML.HASH_BUCKETIZE takes the following arguments:
- string_expression: the STRING expression to bucketize.
- hash_bucket_size: an INT64 value that specifies the number of buckets to create. This value must be greater than or equal to 0. If hash_bucket_size equals 0, the function only hashes the string without bucketizing the hashed value.
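To see the effect of the hash_bucket_size argument, the following sketch runs the function twice on the same inputs: once with 0, which per the description above returns the hash without bucketizing it, and once with 3. The column aliases hashed and bucket are illustrative names, and no output is shown because the values depend on the underlying hash.

```sql
-- hash_bucket_size = 0 returns the hash only;
-- hash_bucket_size = 3 assigns each string to one of three buckets.
SELECT
  f,
  ML.HASH_BUCKETIZE(f, 0) AS hashed,
  ML.HASH_BUCKETIZE(f, 3) AS bucket
FROM
  UNNEST(['a', 'b', 'c', 'd']) AS f;
```

Because the hash is deterministic, repeated runs should return the same values for the same input strings.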
Output
ML.HASH_BUCKETIZE returns an INT64 value that identifies the bucket.
Example
The following example bucketizes string expressions into three buckets:
SELECT
  f,
  ML.HASH_BUCKETIZE(f, 3) AS bucket
FROM
  UNNEST(['a', 'b', 'c', 'd']) AS f;
The output looks similar to the following:
+---+--------+
| f | bucket |
+---+--------+
| a | 0      |
+---+--------+
| b | 1      |
+---+--------+
| c | 1      |
+---+--------+
| d | 2      |
+---+--------+
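Manual feature preprocessing in BigQuery ML is typically expressed in the TRANSFORM clause of a CREATE MODEL statement. The following is a minimal sketch of that pattern, not a recommended configuration. The dataset, model, table, and column names (mydataset, churn_model, customers, city, churned) are hypothetical, and the bucket count of 100 is an arbitrary choice.

```sql
-- Hypothetical example: hash a high-cardinality string column into
-- 100 buckets as part of model training.
CREATE OR REPLACE MODEL `mydataset.churn_model`
  TRANSFORM(
    -- Replace the raw city string with its bucket ID.
    ML.HASH_BUCKETIZE(city, 100) AS city_bucket,
    -- Pass the label column through unchanged.
    churned
  )
  OPTIONS(
    model_type = 'logistic_reg',
    input_label_cols = ['churned']
  )
AS
SELECT
  city,
  churned
FROM
  `mydataset.customers`;
```

A smaller bucket count makes it more likely that distinct strings share a bucket, so the value is a trade-off between feature dimensionality and collisions.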
What's next
- For information about feature preprocessing, see Feature preprocessing overview.
Last updated 2025-12-15 UTC.