Feature preprocessing overview
Feature preprocessing is one of the most important steps in the machine learning lifecycle. It consists of creating features and cleaning the training data. Creating features is also referred to as feature engineering.
BigQuery ML provides the following feature preprocessing techniques:
- Automatic preprocessing. BigQuery ML performs automatic preprocessing during training. For more information, see Automatic feature preprocessing.
- Manual preprocessing. You can use the TRANSFORM clause in the CREATE MODEL statement to define custom preprocessing using manual preprocessing functions. You can also use these functions outside of the TRANSFORM clause to process training data before creating the model.
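To make the manual option concrete, the following Python sketch assembles a CREATE MODEL statement whose TRANSFORM clause applies two manual preprocessing functions, ML.STANDARD_SCALER and ML.QUANTILE_BUCKETIZE. The helper name `build_create_model_sql`, the model and table identifiers, the column names (`trip_distance`, `passenger_count`, `fare`), and the choice of a linear regression model are all hypothetical placeholders. The sketch only builds the SQL text; submitting it to BigQuery requires a client, which is omitted here.

```python
def build_create_model_sql(model_id: str, table_id: str, label: str) -> str:
    """Return a CREATE MODEL statement with a TRANSFORM clause.

    All identifiers and column names are illustrative placeholders,
    not names taken from the BigQuery ML documentation.
    """
    return f"""
CREATE OR REPLACE MODEL `{model_id}`
TRANSFORM(
  -- Standardize a numeric column (assumed column name).
  ML.STANDARD_SCALER(trip_distance) OVER () AS trip_distance_scaled,
  -- Split a numeric column into 4 quantile-based buckets.
  ML.QUANTILE_BUCKETIZE(passenger_count, 4) OVER () AS passenger_bucket,
  -- Pass the label column through unchanged.
  {label}
)
OPTIONS(model_type = 'linear_reg', input_label_cols = ['{label}'])
AS
SELECT trip_distance, passenger_count, {label}
FROM `{table_id}`
""".strip()


# Hypothetical dataset, model, and table names.
sql = build_create_model_sql("mydataset.fare_model", "mydataset.trips", "fare")
print(sql)
```

One reason to prefer the TRANSFORM clause over preprocessing the table beforehand is that the transformations are saved with the model, so the same preprocessing is applied to input rows at prediction time.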
Get feature information
You can use the ML.FEATURE_INFO function to retrieve the statistics of all input feature columns.
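As a minimal sketch, the query below calls ML.FEATURE_INFO on a trained model. The helper name `build_feature_info_sql` and the model identifier `mydataset.fare_model` are hypothetical, and the snippet only constructs the query string rather than running it against BigQuery.

```python
def build_feature_info_sql(model_id: str) -> str:
    """Return a query that reads feature statistics for a trained model.

    `model_id` is a placeholder such as "mydataset.fare_model".
    """
    return f"SELECT * FROM ML.FEATURE_INFO(MODEL `{model_id}`)"


# Hypothetical model name; substitute your own dataset and model.
query = build_feature_info_sql("mydataset.fare_model")
print(query)
```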
Recommended knowledge
By using the default settings in the CREATE MODEL statements and the inference functions, you can create and use BigQuery ML models even without much ML knowledge. However, basic knowledge of the ML development lifecycle, such as feature engineering and model training, helps you optimize both your data and your model to deliver better results. We recommend using the following resources to develop familiarity with ML techniques and processes:
- Machine Learning Crash Course
- Intro to Machine Learning
- Data Cleaning
- Feature Engineering
- Intermediate Machine Learning
What's next
- Learn about feature serving in BigQuery ML.
For more information about supported SQL statements and functions for models that support feature preprocessing, see the following documents:
Last updated 2026-02-19 UTC.