Overview of creating managed datasets on Vertex AI
You can use a managed dataset to provide the source data used to train AutoML and custom models on Vertex AI. A managed dataset is required for AutoML and is optional for custom training.
Permissions and access control
When you use data from a Cloud Storage bucket to create a dataset, Vertex AI requires permissions to access the data. Vertex AI uses a special Google-managed service account known as a Service Agent to securely access your data. For more information on the roles required and how the Service Agent works, see Access control with IAM.
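As a minimal sketch of what granting that access can look like, the following assumes the bucket needs an explicit grant (for example, because it lives in a different project) and that read access through `roles/storage.objectViewer` is sufficient; check Access control with IAM for the roles your setup actually requires. The helper names `vertex_ai_service_agent_member` and `grant_bucket_read_access` are hypothetical, and the second function assumes the `google-cloud-storage` client library is installed.

```python
def vertex_ai_service_agent_member(project_number) -> str:
    """Build the IAM member string for a project's Vertex AI Service Agent."""
    return (
        f"serviceAccount:service-{project_number}"
        "@gcp-sa-aiplatform.iam.gserviceaccount.com"
    )


def grant_bucket_read_access(
    bucket_name: str,
    project_number,
    role: str = "roles/storage.objectViewer",  # assumption: read-only access is enough
) -> None:
    """Add an IAM binding on the bucket for the Vertex AI Service Agent."""
    # Imported lazily so the helper above works without the client library.
    from google.cloud import storage

    bucket = storage.Client().bucket(bucket_name)
    policy = bucket.get_iam_policy(requested_policy_version=3)
    policy.bindings.append(
        {"role": role, "members": {vertex_ai_service_agent_member(project_number)}}
    )
    bucket.set_iam_policy(policy)
```

The project number (not the project ID) identifies the Service Agent, so pass the number of the project where Vertex AI runs.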
Create a managed dataset for AutoML models
You can create managed datasets for training AutoML models by using the Google Cloud console or the Vertex AI API. The instructions vary slightly based on your data type and model objective. Start by preparing your training data.
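The following sketch shows one way to create a dataset programmatically, using the Vertex AI SDK for Python (`google-cloud-aiplatform`) rather than the console. It is an illustration under stated assumptions, not the documented procedure: the helper names, the default `us-central1` location, and the choice of a single-label classification schema for image data are all assumptions, and the import file is assumed to already exist in Cloud Storage.

```python
from typing import Any

# Maps the data types covered in this overview to SDK dataset class names.
_DATASET_CLASSES = {"image": "ImageDataset", "tabular": "TabularDataset"}


def dataset_class_name(data_type: str) -> str:
    """Return the SDK class name for a data type, or raise ValueError."""
    key = data_type.strip().lower()
    if key not in _DATASET_CLASSES:
        raise ValueError(f"unsupported data type: {data_type!r}")
    return _DATASET_CLASSES[key]


def create_managed_dataset(
    data_type: str,
    display_name: str,
    gcs_uri: str,
    project: str,
    location: str = "us-central1",  # assumption: pick the region you train in
) -> Any:
    """Create a managed dataset from an import file in Cloud Storage."""
    if not gcs_uri.startswith("gs://"):
        raise ValueError("gcs_uri must start with gs://")
    class_name = dataset_class_name(data_type)

    # Imported lazily so input validation works without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location)
    kwargs = {"display_name": display_name, "gcs_source": gcs_uri}
    if class_name == "ImageDataset":
        # Assumption: the model objective is single-label classification.
        kwargs["import_schema_uri"] = (
            aiplatform.schema.dataset.ioformat.image.single_label_classification
        )
    return getattr(aiplatform, class_name).create(**kwargs)
```

Because the import schema depends on the model objective, a different objective (such as object detection) would need a different `import_schema_uri` value.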
Image
Learn how to create a managed dataset for the following types of image AutoML models:
Tabular
Learn how to create a managed dataset for the following types of tabular AutoML models:
Create a managed dataset for custom trained models
The instructions on how to create a managed dataset for training custom models are the same, regardless of your data type or model objective.
For details, see Use managed datasets.
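On the consuming side, a custom training job that is given a managed dataset receives the data locations through environment variables. The sketch below reads them inside a training script; the `DatasetSplits` class and `read_dataset_splits` function are hypothetical names, and the code assumes the variables `AIP_DATA_FORMAT`, `AIP_TRAINING_DATA_URI`, `AIP_VALIDATION_DATA_URI`, and `AIP_TEST_DATA_URI` are set by the training service when a dataset is attached.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class DatasetSplits:
    """Locations of the dataset splits exported for a custom training job."""

    data_format: Optional[str]
    training_uri: Optional[str]
    validation_uri: Optional[str]
    test_uri: Optional[str]


def read_dataset_splits(env: Mapping[str, str] = os.environ) -> DatasetSplits:
    """Read split locations from the environment; missing values become None."""
    return DatasetSplits(
        data_format=env.get("AIP_DATA_FORMAT"),
        training_uri=env.get("AIP_TRAINING_DATA_URI"),
        validation_uri=env.get("AIP_VALIDATION_DATA_URI"),
        test_uri=env.get("AIP_TEST_DATA_URI"),
    )
```

Passing the environment as a mapping keeps the function easy to exercise locally, where these variables are normally unset and every field is `None`.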
View managed datasets using Dataplex Universal Catalog
Dataplex Universal Catalog is a fully managed, scalable metadata management service that provides a centralized location to search for datasets across projects and regions. It's integrated with Vertex AI and offers similar capabilities to the deprecated Data Catalog.
You can use Dataplex Universal Catalog to discover, understand, and enrich your data with aspects (which are similar to Data Catalog tags).
For details on managing metadata and aspects for your Vertex AI resources, see Manage aspects and enrich metadata in the Dataplex Universal Catalog.
Last updated 2026-02-18 UTC.