The ML.HASH_BUCKETIZE function

This document describes the ML.HASH_BUCKETIZE function, which lets you convert a string expression to a deterministic hash and then bucketize it by the modulo value of that hash.

You can use this function with models that support manual feature preprocessing. For more information, see the following documents:

Syntax

ML.HASH_BUCKETIZE(string_expression, hash_bucket_size)

Arguments

ML.HASH_BUCKETIZE takes the following arguments:

  • string_expression: the STRING expression to bucketize.
  • hash_bucket_size: an INT64 value that specifies the number of buckets to create. This value must be greater than or equal to 0. If hash_bucket_size equals 0, the function only hashes the string without bucketizing the hashed value.
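The hash-then-modulo behavior described by these arguments can be sketched in Python. This is a conceptual illustration only, not BigQuery's implementation: the helper name `hash_bucketize` is hypothetical, and SHA-256 is used as a stand-in because this document does not specify the hash function, so the values it produces will not match the function's actual output.

```python
import hashlib


def hash_bucketize(string_expression: str, hash_bucket_size: int) -> int:
    """Conceptual sketch of hash-then-modulo bucketizing (hypothetical helper)."""
    if hash_bucket_size < 0:
        raise ValueError("hash_bucket_size must be greater than or equal to 0")
    # Stand-in hash (assumption): the first 8 bytes of SHA-256, read as an
    # unsigned integer. The same input always yields the same value.
    digest = hashlib.sha256(string_expression.encode("utf-8")).digest()
    hashed = int.from_bytes(digest[:8], "big")
    if hash_bucket_size == 0:
        # A bucket size of 0 means "hash only": return the hash unbucketized.
        return hashed
    # Otherwise the bucket is the hash modulo the number of buckets.
    return hashed % hash_bucket_size


for value in ["a", "b", "c", "d"]:
    print(value, hash_bucketize(value, 3))
```

Because the mapping is deterministic, the same string lands in the same bucket on every call, which is what makes the result usable as a stable categorical feature.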

Output

ML.HASH_BUCKETIZE returns an INT64 value that identifies the bucket.

Example

The following example bucketizes string expressions into three buckets:

SELECT
  f,
  ML.HASH_BUCKETIZE(f, 3) AS bucket
FROM
  UNNEST(['a', 'b', 'c', 'd']) AS f;

The output looks similar to the following:

+---+--------+
| f | bucket |
+---+--------+
| a |   0    |
+---+--------+
| b |   1    |
+---+--------+
| c |   1    |
+---+--------+
| d |   2    |
+---+--------+
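Because four inputs are mapped into three buckets, at least two of them must share a bucket, as b and c do in the output above. The following Python sketch shows one way to inspect such collisions. The helpers `bucket_of` and `group_by_bucket` are hypothetical, and SHA-256 is again only a stand-in for the unspecified hash, so the grouping it prints will likely differ from the table.

```python
import hashlib
from collections import defaultdict


def bucket_of(value: str, num_buckets: int) -> int:
    """Stand-in bucket assignment (assumption: SHA-256, not BigQuery's hash)."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def group_by_bucket(values, num_buckets):
    """Collect the input strings that fall into each bucket."""
    groups = defaultdict(list)
    for value in values:
        groups[bucket_of(value, num_buckets)].append(value)
    return dict(groups)


groups = group_by_bucket(["a", "b", "c", "d"], 3)
for bucket, members in sorted(groups.items()):
    print(bucket, members)
```

Choosing a larger bucket count reduces how often distinct strings collide, at the cost of a wider feature space.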



Last updated 2025-12-15 UTC.