Hello custom training: Train a custom image classification model

This page shows you how to run a TensorFlow Keras training application on Vertex AI. This particular application trains an image classification model that can classify flowers by type.

This tutorial has several pages:

  1. Setting up your project and environment.

  2. Training a custom image classification model.

  3. Serving predictions from a custom image classification model.

  4. Cleaning up your project.

Each page assumes that you have already performed the instructions from the previous pages of the tutorial.

The rest of this document assumes that you are using the same Cloud Shell environment that you created when following the first page of this tutorial. If your original Cloud Shell session is no longer open, you can return to the environment by doing the following:

  1. In the Google Cloud console, activate Cloud Shell.

    Activate Cloud Shell

  2. In the Cloud Shell session, run the following command:

    cd hello-custom-sample

Run a custom training pipeline

This section describes using the training package that you uploaded to Cloud Storage to run a Vertex AI custom training pipeline.

  1. In the Google Cloud console, in the Vertex AI section, go to the Training pipelines page.

    Go to Training pipelines

  2. Click Create to open the Train new model pane.

  3. On the Choose training method step, do the following:

    1. In the Dataset drop-down list, select No managed dataset. This particular training application loads data from the TensorFlow Datasets library rather than a managed Vertex AI dataset.

    2. Ensure that Custom training (advanced) is selected.

    Click Continue.

  4. On the Model details step, in the Name field, enter hello_custom. Click Continue.

  5. On the Training container step, provide Vertex AI with information it needs to use the training package that you uploaded to Cloud Storage:

    1. Select Prebuilt container.

    2. In the Model framework drop-down list, select TensorFlow.

    3. In the Model framework version drop-down list, select 2.3.

    4. In the Package location field, enter cloud-samples-data/ai-platform/hello-custom/hello-custom-sample-v1.tar.gz.

    5. In the Python module field, enter trainer.task. trainer is the name of the Python package in your tarball, and task.py contains your training code. Therefore, trainer.task is the name of the module that you want Vertex AI to run.

    6. In the Model output directory field, click Browse. Do the following in the Select folder pane:

      1. Navigate to your Cloud Storage bucket.

      2. Click Create new folder.

      3. Name the new folder output. Then click Create.

      4. Click Select.

      Confirm that the field has the value gs://BUCKET_NAME/output, where BUCKET_NAME is the name of your Cloud Storage bucket.

      This value gets passed to Vertex AI in the baseOutputDirectory API field, which sets several environment variables that your training application can access when it runs.

      For example, when you set this field to gs://BUCKET_NAME/output, Vertex AI sets the AIP_MODEL_DIR environment variable to gs://BUCKET_NAME/output/model. At the end of training, Vertex AI uses any artifacts in the AIP_MODEL_DIR directory to create a model resource.

      Learn more about the environment variables set by this field.

    Click Continue.

  6. On the optional Hyperparameters step, make sure that the Enable hyperparameter tuning checkbox is cleared. This tutorial does not use hyperparameter tuning. Click Continue.

  7. On the Compute and pricing step, allocate resources for the custom training job:

    1. In the Region drop-down list, select us-central1 (Iowa).

    2. In the Machine type drop-down list, select n1-standard-4 from the Standard section.

    Do not add any accelerators or worker pools for this tutorial. Click Continue.

  8. On the Prediction container step, provide Vertex AI with information it needs to serve predictions:

    1. Select Prebuilt container.

    2. In the Prebuilt container settings section, do the following:

      1. In the Model framework drop-down list, select TensorFlow.

      2. In the Model framework version drop-down list, select 2.3.

      3. In the Accelerator type drop-down list, select None.

      4. Confirm that the Model directory field has the value gs://BUCKET_NAME/output, where BUCKET_NAME is the name of your Cloud Storage bucket. This matches the Model output directory value that you provided in a previous step.

    3. Leave the fields in the Predict schemata section blank.

  9. Click Start training to start the custom training pipeline.
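
The preceding steps configure and start the pipeline in the Google Cloud console. For reference, a roughly equivalent pipeline can be created with the Vertex AI SDK for Python. The following is only a minimal sketch, not the tutorial's canonical method: PROJECT_ID and BUCKET_NAME are placeholders, and the prebuilt container image URIs are assumptions that correspond to the TensorFlow 2.3 training and prediction containers selected above.

    # Sketch: create a comparable custom training pipeline with the
    # Vertex AI SDK for Python. Replace PROJECT_ID and BUCKET_NAME.
    from google.cloud import aiplatform

    aiplatform.init(
        project="PROJECT_ID",
        location="us-central1",
        staging_bucket="gs://BUCKET_NAME",
    )

    job = aiplatform.CustomPythonPackageTrainingJob(
        display_name="hello_custom",
        python_package_gcs_uri=(
            "gs://cloud-samples-data/ai-platform/hello-custom/"
            "hello-custom-sample-v1.tar.gz"
        ),
        python_module_name="trainer.task",
        # Assumed prebuilt container image URIs for TensorFlow 2.3.
        container_uri="us-docker.pkg.dev/vertex-ai/training/tf-cpu.2-3:latest",
        model_serving_container_image_uri=(
            "us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-3:latest"
        ),
    )

    # Runs training on one n1-standard-4 machine and registers the resulting
    # model, mirroring the console settings above.
    model = job.run(
        model_display_name="hello_custom",
        base_output_dir="gs://BUCKET_NAME/output",
        machine_type="n1-standard-4",
        replica_count=1,
    )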

You can now view your new training pipeline, which is named hello_custom, on the Training page. (You might need to refresh the page.) The training pipeline does two main things:

  1. The training pipeline creates a custom job resource named hello_custom-custom-job. After a few moments, you can view this resource on the Custom jobs page of the Training section:

    Go to Custom jobs

    The custom job runs the training application using the computing resources that you specified in this section.

  2. After the custom job completes, the training pipeline finds the artifacts that your training application creates in the output/model/ directory of your Cloud Storage bucket. It uses these artifacts to create a model resource.
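
For context, the training application produces those artifacts by reading the AIP_MODEL_DIR environment variable that Vertex AI sets from the Model output directory and exporting a SavedModel there. The snippet below is an illustrative sketch of that pattern, not the actual code in the hello-custom-sample package; the model architecture and preprocessing shown here are assumptions.

    # Illustrative sketch of a trainer/task.py that exports to AIP_MODEL_DIR.
    import os

    import tensorflow as tf
    import tensorflow_datasets as tfds

    # Vertex AI sets AIP_MODEL_DIR from baseOutputDirectory, for example
    # gs://BUCKET_NAME/output/model.
    model_dir = os.environ["AIP_MODEL_DIR"]

    # Load the flowers dataset from TensorFlow Datasets (no managed dataset).
    train_ds, info = tfds.load(
        "tf_flowers", split="train", as_supervised=True, with_info=True
    )

    def preprocess(image, label):
        image = tf.image.resize(image, (128, 128)) / 255.0
        return image, label

    train_ds = train_ds.map(preprocess).shuffle(1000).batch(32)

    # A small example classifier; the tutorial's real architecture may differ.
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(128, 128, 3)),
        tf.keras.layers.Conv2D(16, 3, activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(
            info.features["label"].num_classes, activation="softmax"
        ),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    model.fit(train_ds, epochs=1)

    # Export a SavedModel to AIP_MODEL_DIR so the training pipeline can
    # create a model resource from it.
    model.save(model_dir)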

Monitor training

To view training logs, do the following:

  1. In the Google Cloud console, in the Vertex AI section, go to the Custom jobs page.

    Go to Custom jobs

  2. To view details for the CustomJob that you just created, click hello_custom-custom-job in the list.

  3. On the job details page, click View logs.
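
If you prefer to check the job from a script instead of the console, the Vertex AI SDK for Python can list custom jobs and report their state. This is only a sketch; PROJECT_ID is a placeholder.

    # Sketch: check the state of the custom job with the Vertex AI SDK.
    from google.cloud import aiplatform

    aiplatform.init(project="PROJECT_ID", location="us-central1")

    for job in aiplatform.CustomJob.list(
        filter='display_name="hello_custom-custom-job"'
    ):
        print(job.display_name, job.state)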

View your trained model

When the custom training pipeline completes, you can find the trained model in the Google Cloud console, in the Vertex AI section, on the Models page.

Go to Models

The model has the name hello_custom.
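
You can also confirm that the model resource exists programmatically. The following sketch uses the Vertex AI SDK for Python and assumes the same project and region used earlier; PROJECT_ID is a placeholder.

    # Sketch: look up the uploaded model by display name.
    from google.cloud import aiplatform

    aiplatform.init(project="PROJECT_ID", location="us-central1")

    for model in aiplatform.Model.list(filter='display_name="hello_custom"'):
        print(model.display_name, model.resource_name)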

What's next

Follow the next page of this tutorial to serve predictions from your trained ML model.
