Train and use your own models

This page provides an overview of the workflow for training and using your own machine learning (ML) models on Vertex AI. Vertex AI offers a spectrum of training methods designed to meet your needs, from fully automated to fully custom.

  • AutoML: Build high-quality models with minimal technical effort by leveraging Google's automated ML capabilities.
  • Vertex AI serverless training: Run your custom training code in a fully managed, on-demand environment without worrying about infrastructure.
  • Vertex AI training clusters: Run large-scale, high-performance training jobs on a dedicated cluster of accelerators reserved for your exclusive use.
  • Ray on Vertex AI: Scale Python applications and ML workloads using the open-source Ray framework on a managed service.

For help deciding which of these methods to use, see Choose a training method.

AutoML

AutoML on Vertex AI lets you build a code-free ML model based on the training data that you provide. AutoML can automate tasks like data preparation, model selection, hyperparameter tuning, and deployment for various data types and prediction tasks, which can make ML more accessible for a wide range of users.

Types of models you can build using AutoML

The types of models you can build depend on the type of data that you have. Vertex AI offers AutoML solutions for the following data types and model objectives:

  • Image data: Classification, object detection.
  • Tabular data: Classification/regression, forecasting.

To learn more about AutoML, see AutoML training overview.
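As a rough illustration of how little code an AutoML run involves, the sketch below lays out the parameters you would typically hand to the `google-cloud-aiplatform` Python SDK. The project, model, and column names are illustrative assumptions, not values from this page, and the SDK calls are shown as comments because they require Google Cloud credentials to execute.

```python
# Parameters an AutoML tabular training job typically needs, expressed as
# plain data so the shape is easy to see. All concrete values here are
# hypothetical examples.
automl_params = {
    "display_name": "churn-model",                     # hypothetical name
    "optimization_prediction_type": "classification",  # or "regression"
}
run_params = {
    "target_column": "churned",          # hypothetical label column
    "budget_milli_node_hours": 1000,     # about one node-hour of training
}

# With the SDK installed and credentials configured, the calls would look
# roughly like this (check the AutoML docs for exact signatures):
#
#   from google.cloud import aiplatform
#   aiplatform.init(project="my-project", location="us-central1")
#   dataset = aiplatform.TabularDataset("projects/.../datasets/...")
#   job = aiplatform.AutoMLTabularTrainingJob(**automl_params)
#   model = job.run(dataset=dataset, **run_params)
```

Everything else — data preparation, model selection, and hyperparameter tuning — happens inside the managed service.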

Run custom training code on Vertex AI

If AutoML doesn't address your needs, you can provide your own training code and run it on Vertex AI's managed infrastructure. This gives you full control and flexibility over your model's architecture and training logic, letting you use any ML framework you choose.

Vertex AI provides two primary modes for running your custom training code: a serverless, on-demand environment, or a dedicated, reserved cluster.

Vertex AI serverless training

Serverless training is a fully managed service that lets you run your custom training application without provisioning or managing any infrastructure. You package your code in a container, define your machine specifications (including CPUs and GPUs), and submit it as a CustomJob.

Vertex AI handles the rest:

  • Provisioning the compute resources for the duration of your job.
  • Executing your training code.
  • Deleting the resources after the job completes.

This pay-per-use, on-demand model is ideal for experimentation, rapid prototyping, and for production jobs that don't require assured, instantaneous capacity.
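The container-plus-machine-spec workflow above can be sketched as a `worker_pool_specs` structure, which is the shape a CustomJob expects. The image URI, machine type, and arguments below are illustrative assumptions; the submission calls are shown as comments because they require credentials and a real project.

```python
# A minimal sketch of a CustomJob's worker_pool_specs: your container image
# plus the machine shape to run it on. All concrete values are hypothetical.
worker_pool_specs = [
    {
        "machine_spec": {
            "machine_type": "n1-standard-8",
            "accelerator_type": "NVIDIA_TESLA_T4",  # optional GPU
            "accelerator_count": 1,
        },
        "replica_count": 1,
        "container_spec": {
            "image_uri": "gcr.io/my-project/trainer:latest",  # your image
            "args": ["--epochs", "10"],
        },
    }
]

# Submitting it would look roughly like this:
#
#   from google.cloud import aiplatform
#   aiplatform.init(project="my-project", location="us-central1")
#   job = aiplatform.CustomJob(
#       display_name="my-training-job",
#       worker_pool_specs=worker_pool_specs,
#   )
#   job.run()  # Vertex AI provisions, runs, and tears down the resources
```

Adding more entries to `worker_pool_specs` (or raising `replica_count`) is how distributed training topologies are described.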

To learn more, see Create a serverless training custom job.

Vertex AI training clusters

For large-scale, high-performance, and mission-critical training, you can reserve a dedicated cluster of accelerators. This provides assured capacity and eliminates queues, ensuring your jobs start immediately.

While you have exclusive use of these resources, Vertex AI still handles the operational overhead of managing the cluster, including hardware maintenance and OS patching. This "managed serverful" approach gives you the power of a dedicated cluster without the management complexity.

Ray on Vertex AI

Ray on Vertex AI is a service that lets you use the open-source Ray framework for scaling AI and Python applications directly within the Vertex AI platform. Ray is designed to provide the infrastructure for distributed computing and parallel processing for your ML workflow.

Ray on Vertex AI provides a managed environment for running distributed applications using the Ray framework, offering scalability and integration with Google Cloud services.

To learn more about Ray on Vertex AI, see Ray on Vertex AI overview.
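To make the workflow concrete, the sketch below lays out the node shapes you might request for a managed Ray cluster as plain data. The machine types and counts are illustrative assumptions, and the SDK and Ray calls are shown as comments because they require credentials and a running cluster; exact names should be checked against the Ray on Vertex AI docs.

```python
# Illustrative node shapes for a small Ray cluster. All values hypothetical.
head_node = {"machine_type": "n1-standard-16", "node_count": 1}
worker_nodes = {"machine_type": "n1-standard-8", "node_count": 4}

# With google-cloud-aiplatform[ray] installed, creating and connecting to a
# cluster looks approximately like this:
#
#   from google.cloud import aiplatform
#   from google.cloud.aiplatform import vertex_ray
#   import ray
#
#   aiplatform.init(project="my-project", location="us-central1")
#   cluster = vertex_ray.create_ray_cluster(...)  # pass node shapes here
#   ray.init(f"vertex_ray://{cluster}")           # connect from a client
#
# Once connected, standard Ray code (@ray.remote tasks and actors) runs on
# the managed cluster unchanged.
```

The appeal is that the same Ray program works locally and on the managed cluster; only the `ray.init` target changes.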

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-15 UTC.