Feature preprocessing overview

Feature preprocessing is one of the most important steps in the machine learning lifecycle. It consists of creating features and cleaning the training data. Creating features is also referred to as feature engineering.

BigQuery ML provides the following feature preprocessing techniques:

  • Automatic preprocessing. BigQuery ML performs automatic preprocessing during training. For more information, see Automatic feature preprocessing.

  • Manual preprocessing. You can use the TRANSFORM clause in the CREATE MODEL statement to define custom preprocessing using manual preprocessing functions. You can also use these functions outside of the TRANSFORM clause to process training data before creating the model.
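
As a minimal sketch of manual preprocessing, the following statement scales one numeric column and bucketizes another inside the TRANSFORM clause. The dataset, table, model, and column names (mydataset, mytable, mymodel, f1, f2, label) are hypothetical placeholders, and the choice of ML.STANDARD_SCALER, ML.QUANTILE_BUCKETIZE, and a linear regression model is only an illustration, not a recommendation.

```sql
-- Hypothetical names throughout: mydataset, mytable, mymodel, f1, f2, label.
CREATE OR REPLACE MODEL `mydataset.mymodel`
  TRANSFORM(
    -- Standardize f1 using statistics computed over the training data.
    ML.STANDARD_SCALER(f1) OVER () AS f1_scaled,
    -- Split f2 into 4 buckets based on quantiles of the training data.
    ML.QUANTILE_BUCKETIZE(f2, 4) OVER () AS f2_bucketized,
    -- Pass the label column through unchanged.
    label
  )
  OPTIONS (
    model_type = 'linear_reg',
    input_label_cols = ['label']
  )
AS
SELECT f1, f2, label
FROM `mydataset.mytable`;
```

The same functions can also be called in an ordinary query, for example `SELECT ML.STANDARD_SCALER(f1) OVER () AS f1_scaled FROM mydataset.mytable`, to prepare a training table before the model is created. In that case the preprocessing is not stored with the model, so the same steps likely need to be repeated on the data passed to the inference functions.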

Get feature information

You can use the ML.FEATURE_INFO function to retrieve the statistics of all input feature columns.
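
A query along these lines could be used to inspect a trained model. The model name mydataset.mymodel is a hypothetical placeholder for an existing model.

```sql
-- Hypothetical model name; replace with an existing BigQuery ML model.
-- Returns statistics for the model's input feature columns.
SELECT *
FROM ML.FEATURE_INFO(MODEL `mydataset.mymodel`);
```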

Recommended knowledge

By using the default settings in the CREATE MODEL statements and the inference functions, you can create and use BigQuery ML models even without much ML knowledge. However, having basic knowledge about the ML development lifecycle, such as feature engineering and model training, helps you optimize both your data and your model to deliver better results. We recommend using the following resources to develop familiarity with ML techniques and processes:

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-19 UTC.