Feature engineering
Preview
This feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA features are available "as is" and might have limited support. For more information, see the launch stage descriptions.
This document describes how Feature Transform Engine performs feature engineering. Feature Transform Engine performs feature selection and feature transformations. If feature selection is enabled, Feature Transform Engine creates a ranked set of important features. If feature transformations are enabled, Feature Transform Engine processes the features to ensure that the input for model training and model serving is consistent. Feature Transform Engine can be used on its own or together with any of the tabular training workflows. It supports both TensorFlow and non-TensorFlow frameworks.
Inputs
Provide the following inputs to Feature Transform Engine:
- Raw data (BigQuery or CSV dataset).
- Data split configuration.
- Feature selection configuration.
- Feature transformation configuration.
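For orientation, the following sketch shows one way these four inputs could be grouped before they are handed to a tabular training workflow. The field names and values are illustrative assumptions made for this example, not the actual Feature Transform Engine parameters.

```python
# Hypothetical grouping of Feature Transform Engine inputs.
# All field names here are assumptions for illustration, not the real API.
feature_transform_engine_inputs = {
    "raw_data": {
        # BigQuery table or CSV files (assumed URI format).
        "bigquery_source": "bq://my-project.my_dataset.my_table",
    },
    "data_split": {
        "training_fraction": 0.8,
        "validation_fraction": 0.1,
        "test_fraction": 0.1,
    },
    "feature_selection": {
        "enabled": True,
        "algorithm": "AMI",            # one of: AMI, CMIM, JMIM, MRMR
        "max_selected_features": 100,
    },
    "feature_transformations": {
        # "auto" stands in for resolving transformation parameters from dataset statistics.
        "mode": "auto",
    },
}
```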
Outputs
Feature Transform Engine generates the following outputs:
- dataset_stats: Statistics that describe the raw dataset. For example, dataset_stats gives the number of rows in the dataset.
- feature_importance: The importance score of the features. This output is generated if feature selection is enabled.
- materialized_data: The transformed version of a data split group containing the training split, the evaluation split, and the test split.
- training_schema: Training data schema in OpenAPI specification, which describes the data types of the training data.
- instance_schema: Instance schema in OpenAPI specification, which describes the data types of the inference data.
- transform_output: Metadata of the transformation. If you use TensorFlow for transformation, the metadata includes the TensorFlow graph.
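As a small illustration of how the feature_importance output might be consumed downstream, the sketch below assumes the scores were exported as a JSON object mapping feature names to scores; the file name and format are assumptions, not the documented output layout.

```python
# Assumes feature_importance was exported as {"feature_name": score, ...};
# the path and format are illustrative assumptions.
import json

with open("feature_importance.json") as f:
    importance = json.load(f)

# Keep the 50 highest-scoring features for a reduced training set.
top_features = sorted(importance, key=importance.get, reverse=True)[:50]
print(top_features)
```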
Processing steps
Feature Transform Engine performs the following steps:
- Generate dataset splits for training, evaluation, and testing.
- Generate input dataset statistics dataset_stats that describe the raw dataset.
- Perform feature selection.
- Process the transform configuration using the dataset statistics, resolving automatic transformation parameters into manual transformation parameters.
- Transform raw features into engineered features. Different transformations are done for different types of features.
Feature selection
The main purpose of feature selection is to reduce the number of features usedin the model. The reduced feature set captures most of the label'sinformation in a more compact manner. Feature selection allows you to reduce thecost of training and serving models without significantly impacting model quality.
If you enable feature selection, Feature Transform Engine assigns an importancescore to each feature. You can choose to output the importance scores of thefull set of features or of a reduced subset of the most important features.
Vertex AI offers the following feature selection algorithms:
- Adjusted Mutual Information (AMI)
- Conditional Mutual Information Maximization (CMIM)
- Joint Mutual Information Maximization (JMIM)
- Maximum Relevance Minimum Redundancy (MRMR)
Note that no feature selection algorithm always works best on all datasets and for all purposes. If possible, run all the algorithms and combine the results.
Adjusted Mutual Information (AMI)
AMI is an adjustment of the Mutual Information (MI) score to account for chance. It accounts for the fact that the MI is generally higher for two clusterings with a larger number of clusters, regardless of whether there is actually more information shared.
AMI is good at detecting the relevance of features to the label, but it is insensitive to feature redundancy. Consider AMI if there are many features (for example, more than 2000) and not much feature redundancy. It is faster than the other algorithms described here, but it could pick up redundant features.
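As a rough illustration of the idea (not the Vertex AI implementation), the sketch below ranks features by AMI against a classification label using scikit-learn's adjusted_mutual_info_score. Continuous features are binned first because AMI is defined between two discrete labelings; the bin count and the synthetic data are arbitrary choices for the example.

```python
import numpy as np
from sklearn.metrics import adjusted_mutual_info_score

def rank_features_by_ami(X: np.ndarray, y: np.ndarray, n_bins: int = 16) -> list[tuple[int, float]]:
    """Return (feature_index, AMI score) pairs, most relevant first."""
    scores = []
    for j in range(X.shape[1]):
        # Discretize the feature into roughly equal-frequency bins.
        edges = np.quantile(X[:, j], np.linspace(0, 1, n_bins + 1)[1:-1])
        binned = np.digitize(X[:, j], edges)
        scores.append((j, adjusted_mutual_info_score(y, binned)))
    return sorted(scores, key=lambda s: s[1], reverse=True)

# Toy example: the label depends mostly on feature 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (X[:, 0] + 0.1 * rng.normal(size=1000) > 0).astype(int)
print(rank_features_by_ami(X, y)[:3])
```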
Conditional Mutual Information Maximization (CMIM)
CMIM is a greedy algorithm that chooses features iteratively based on conditional mutual information of candidate features with respect to selected features. In each iteration, it selects the feature that maximizes the minimum mutual information with the label that hasn't been captured by selected features yet.
CMIM is robust in dealing with feature redundancy, and it works well in typical cases.
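A minimal sketch of the greedy loop, assuming features have already been discretized (for example with the binning shown in the AMI example) so that entropies can be estimated from counts. This illustrates the CMIM criterion only; it is not the Feature Transform Engine code.

```python
import numpy as np

def _entropy(*cols: np.ndarray) -> float:
    """Empirical joint entropy (nats) of one or more discrete columns."""
    joint = np.stack(cols, axis=1)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum())

def _mi(x: np.ndarray, y: np.ndarray) -> float:
    """I(X; Y) = H(X) + H(Y) - H(X, Y)."""
    return _entropy(x) + _entropy(y) - _entropy(x, y)

def _cond_mi(x: np.ndarray, y: np.ndarray, z: np.ndarray) -> float:
    """I(X; Y | Z) = H(X, Z) + H(Y, Z) - H(X, Y, Z) - H(Z)."""
    return _entropy(x, z) + _entropy(y, z) - _entropy(x, y, z) - _entropy(z)

def cmim_select(X: np.ndarray, y: np.ndarray, k: int) -> list[int]:
    """Greedily pick k feature indices; each pick maximizes the minimum
    information it still adds about the label given any selected feature."""
    selected: list[int] = []
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < k:
        def score(j: int) -> float:
            if not selected:
                return _mi(X[:, j], y)
            return min(_cond_mi(X[:, j], y, X[:, s]) for s in selected)
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```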
Joint Mutual Information Maximization (JMIM)
JMIM is a greedy algorithm that is similar to CMIM. JMIM selects the feature that maximizes the joint mutual information of the candidate feature and the pre-selected features with the label, while CMIM takes redundancy more into account.
JMIM is a high-quality feature selection algorithm.
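The same greedy loop as in the CMIM sketch can express JMIM by swapping the score: a candidate is rated by the smallest joint mutual information that any (candidate, selected) pair shares with the label. The sketch below is self-contained, reuses the same count-based entropy estimate, and again assumes discretized features.

```python
import numpy as np

def _entropy(*cols: np.ndarray) -> float:
    """Empirical joint entropy (nats) of one or more discrete columns."""
    joint = np.stack(cols, axis=1)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum())

def jmim_score(X: np.ndarray, y: np.ndarray, candidate: int, selected: list[int]) -> float:
    """JMIM criterion: min over selected s of I((X_candidate, X_s); Y)."""
    if not selected:
        # With nothing selected yet, fall back to plain I(X_candidate; Y).
        return _entropy(X[:, candidate]) + _entropy(y) - _entropy(X[:, candidate], y)
    return min(
        _entropy(X[:, candidate], X[:, s]) + _entropy(y) - _entropy(X[:, candidate], X[:, s], y)
        for s in selected
    )
```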
Maximum Relevance Minimum Redundancy (MRMR)
MRMR is a greedy algorithm that works iteratively. It is similar to CMIM. Each iteration chooses the feature that maximizes relevance with respect to the label while minimizing pair-wise redundancy with respect to the selected features in previous iterations.
MRMR is a high-quality feature selection algorithm.
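A compact sketch of one common MRMR variant (score = relevance minus mean pairwise redundancy), using scikit-learn's mutual information estimators. It illustrates the criterion only; the real Feature Transform Engine implementation may differ, and the estimator choices here are assumptions.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression

def mrmr_select(X: np.ndarray, y: np.ndarray, k: int) -> list[int]:
    """Greedy MRMR: maximize I(feature; label) minus mean I(feature; selected)."""
    relevance = mutual_info_classif(X, y, random_state=0)
    selected: list[int] = []
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < k:
        def score(j: int) -> float:
            if not selected:
                return float(relevance[j])
            # Mean pairwise redundancy against already-selected features.
            redundancy = np.mean([
                mutual_info_regression(X[:, [s]], X[:, j], random_state=0)[0]
                for s in selected
            ])
            return float(relevance[j] - redundancy)
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```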
What's next
After performing feature engineering, you can train a model for classification or regression:
- Train a model with End-to-End AutoML.
- Train a model with TabNet.
- Train a model with Wide & Deep.