Hello custom training: Serve predictions from a custom image classification model

This page walks through serving predictions from your image classification model and viewing these predictions in a web app.

This tutorial has several pages:

  1. Setting up your project and environment.

  2. Training a custom image classification model.

  3. Serving predictions from a custom image classification model.

  4. Cleaning up your project.

Each page assumes that you have already performed the instructions from the previous pages of the tutorial.

The rest of this document assumes that you are using the same Cloud Shell environment that you created when following the first page of this tutorial. If your original Cloud Shell session is no longer open, you can return to the environment by doing the following:

  1. In the Google Cloud console, activate Cloud Shell.


  2. In the Cloud Shell session, run the following command:

    cd hello-custom-sample

Create an endpoint

To get online predictions from the ML model that you trained when following the previous page of this tutorial, create a Vertex AI endpoint. Endpoints serve online predictions from one or more models.

  1. In the Google Cloud console, in the Vertex AI section, go to the Models page.


  2. Find the row of the model that you trained in the previous step of this tutorial, hello_custom, and click the model's name to open the model detail page.

  3. On the Deploy & test tab, click Deploy to endpoint to open the Deploy to endpoint pane.

  4. On the Define your endpoint step, add some basic information for your endpoint:

    1. Select Create new endpoint.

    2. In the Endpoint name field, enter hello_custom.

    3. In the Model settings section, ensure that you see the name of your model, which is also called hello_custom. Specify the following model settings:

      1. In the Traffic split field, enter 100. Vertex AI supports splitting traffic for an endpoint to multiple models, but this tutorial doesn't use that feature.

      2. In the Minimum number of compute nodes field, enter 1.

      3. In the Machine type drop-down list, select n1-standard-2 from the Standard section.

      4. Click Done.

    4. In the Logging section, ensure that both types of prediction logging are enabled.

    Click Continue.

  5. On the Endpoint details step, confirm that your endpoint will be deployed to us-central1 (Iowa).

    Do not select the Use a customer-managed encryption key (CMEK) checkbox. This tutorial does not use CMEK.

  6. Click Deploy to create the endpoint and deploy your model to the endpoint.

After a few minutes, the endpoint is created and your model is deployed to it.
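
If you prefer to script this step instead of using the Google Cloud console, the following is a minimal sketch using the Vertex AI SDK for Python. It is not part of the tutorial's sample code; the project ID placeholder and the lookup of the model by its hello_custom display name are assumptions you would adapt to your environment.

    from google.cloud import aiplatform

    # Assumed values: replace PROJECT_ID with your own project ID.
    aiplatform.init(project="PROJECT_ID", location="us-central1")

    # Look up the model you trained earlier by its display name (hello_custom).
    model = aiplatform.Model.list(filter='display_name="hello_custom"')[0]

    # Create an endpoint and deploy the model with the same settings as the
    # console steps above: one n1-standard-2 node and 100% of the traffic.
    endpoint = aiplatform.Endpoint.create(display_name="hello_custom")
    model.deploy(
        endpoint=endpoint,
        machine_type="n1-standard-2",
        min_replica_count=1,
        traffic_percentage=100,
    )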

Deploy a Cloud Run function

You can get predictions from the Vertex AI endpoint that you just created by sending requests to the Vertex AI API's REST interface. However, only principals with the aiplatform.endpoints.predict permission can send online prediction requests. You cannot make the endpoint public for anybody to send requests to, for example via a web app.

In this section, deploy code to Cloud Run functions to handle unauthenticated requests. The sample code that you downloaded when you read the first page of this tutorial contains code for this Cloud Run function in the function/ directory. Optionally, run the following command to explore the Cloud Run function code:

less function/main.py

Deploying the function serves the following purposes:

  • You can configure a Cloud Run function to receive unauthenticated requests. Additionally, functions run using a service account with the Editor role by default, which includes the aiplatform.endpoints.predict permission necessary to get predictions from your Vertex AI endpoint.

  • This function also performs useful preprocessing on requests. The Vertex AI endpoint expects prediction requests in the format of the trained TensorFlow Keras graph's first layer: a tensor of normalized floats with fixed dimensions. The function takes the URL of an image as input and preprocesses the image into this format before requesting a prediction from the Vertex AI endpoint. A rough sketch of this kind of preprocessing follows this list.
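
As a rough illustration of this preprocessing, the sketch below downloads an image, scales it to a fixed size, normalizes the pixel values, and sends the result to the endpoint with the Vertex AI SDK for Python. It is not the code in function/main.py; the 128x128 input size, the helper name, and the PROJECT_ID, ENDPOINT_ID, and IMAGE_URL placeholders are assumptions, so check the sample code for the exact shape your model expects.

    from io import BytesIO

    import numpy as np
    import requests
    from PIL import Image
    from google.cloud import aiplatform

    # Assumed input size; your model's first layer defines the real dimensions.
    IMAGE_SIZE = (128, 128)

    def image_url_to_instance(image_url):
        """Download an image and convert it to a normalized float tensor."""
        response = requests.get(image_url)
        image = Image.open(BytesIO(response.content)).convert("RGB")
        image = image.resize(IMAGE_SIZE)
        # Scale pixel values to [0, 1] floats, matching the trained graph's input.
        pixels = np.asarray(image, dtype=np.float32) / 255.0
        return pixels.tolist()

    aiplatform.init(project="PROJECT_ID", location="us-central1")
    endpoint = aiplatform.Endpoint("ENDPOINT_ID")  # numeric ID from the console
    prediction = endpoint.predict(instances=[image_url_to_instance("IMAGE_URL")])
    print(prediction.predictions)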

To deploy the Cloud Run function, do the following:

  1. In the Google Cloud console, in the Vertex AI section, go to the Endpoints page.


  2. Find the row of the endpoint that you created in the previous section, named hello_custom. In this row, click Sample request to open the Sample request pane.

  3. In the Sample request pane, find the line of shell code that matches the following pattern:

    ENDPOINT_ID="ENDPOINT_ID"

    ENDPOINT_ID is a number that identifies this particular endpoint.

    Copy this line of code, and run it in your Cloud Shell session to set the ENDPOINT_ID variable. (If you prefer to look up the ID programmatically, see the sketch after these steps.)

  4. Run the following command in your Cloud Shell session to deploy the Cloud Run function:

    gcloud functions deploy classify_flower \
      --region=us-central1 \
      --source=function \
      --runtime=python37 \
      --memory=2048MB \
      --trigger-http \
      --allow-unauthenticated \
      --set-env-vars=ENDPOINT_ID=${ENDPOINT_ID}
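
If you would rather retrieve the endpoint ID programmatically than copy it from the Sample request pane, the following sketch with the Vertex AI SDK for Python filters your endpoints by the hello_custom display name. The PROJECT_ID placeholder is an assumption you would replace with your own project ID.

    from google.cloud import aiplatform

    aiplatform.init(project="PROJECT_ID", location="us-central1")

    # Filter the project's endpoints by the display name used in this tutorial.
    endpoints = aiplatform.Endpoint.list(filter='display_name="hello_custom"')

    # endpoint.name holds the numeric ID that the Cloud Run function needs.
    print(endpoints[0].name)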

Deploy a web app to send prediction requests

Finally, host a static web app on Cloud Storage to get predictions from your trained ML model. The web app sends requests to your Cloud Run function, which preprocesses them and gets predictions from the Vertex AI endpoint.
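
Before deploying the web app, you can optionally exercise the Cloud Run function directly from Cloud Shell. The sketch below is only a guess at the request contract: the image_url field name and the IMAGE_URL placeholder are hypothetical, so inspect function/main.py for the JSON shape the sample function actually parses.

    import requests

    # Trigger URL printed by gcloud functions deploy in the previous section.
    FUNCTION_URL = "https://us-central1-PROJECT_ID.cloudfunctions.net/classify_flower"

    # "image_url" is a hypothetical field name; check function/main.py for the
    # field the sample function really reads from the request body.
    response = requests.post(FUNCTION_URL, json={"image_url": "IMAGE_URL"})
    print(response.status_code, response.text)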

The webapp directory of the sample code that you downloaded contains a sample web app. In your Cloud Shell session, run the following commands to prepare and deploy the web app:

  1. Set a couple of shell variables for the commands in the following steps to use:

    PROJECT_ID=PROJECT_ID
    BUCKET_NAME=BUCKET_NAME

    Replace the following:

      • PROJECT_ID: the ID of your Google Cloud project.

      • BUCKET_NAME: the name of the Cloud Storage bucket that you created when setting up your project.

  2. Edit the app to provide it with the trigger URL of your Cloud Run function:

    echo"export const CLOUD_FUNCTION_URL = 'https://us-central1-${PROJECT_ID}.cloudfunctions.net/classify_flower';"\  >webapp/function-url.js
  3. Upload the webapp directory to your Cloud Storage bucket:

    gcloud storage cp webapp gs://${BUCKET_NAME}/ --recursive
  4. Make the web app files that you just uploaded publicly readable (a programmatic alternative to steps 3 and 4 is sketched after this list):

    gcloud storage objects update gs://${BUCKET_NAME}/webapp/** --add-acl-grant=entity=allUsers,role=READER
    Note: Shells (like bash, zsh) sometimes attempt to expand wildcards in ways that can be surprising. For more details, see URI wildcards.
  5. You can now navigate to the following URL to open the web app and get predictions:

    https://storage.googleapis.com/BUCKET_NAME/webapp/index.html

    Open the web app and click an image of a flower to see your ML model's classification of the flower type. The web app presents the prediction as a list of flower types and the probability that the image contains each type of flower.

    Note: This web app gets predictions for images that were also included in the training dataset for the model. Therefore the model might appear more accurate than it actually is due to overfitting.
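
If you prefer to script the upload and ACL changes from steps 3 and 4 instead of using gcloud, the following is a minimal sketch using the google-cloud-storage client library; the local webapp/ path and the BUCKET_NAME placeholder come from the earlier steps and would need to match your environment.

    import pathlib

    from google.cloud import storage

    BUCKET_NAME = "BUCKET_NAME"  # same bucket as in the earlier steps

    client = storage.Client()
    bucket = client.bucket(BUCKET_NAME)

    # Upload every file under the local webapp/ directory and make it publicly
    # readable, mirroring the gcloud storage commands above.
    for path in pathlib.Path("webapp").rglob("*"):
        if path.is_file():
            blob = bucket.blob(path.as_posix())  # keeps the webapp/ prefix
            blob.upload_from_filename(str(path))
            blob.make_public()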

In the following screenshot, the web app has already gotten one prediction and is in the process of sending another prediction request.

Screenshot: Web app with four labeled images of flowers. One has probabilities of predicted labels underneath it; another has a loading bar underneath it.

What's next

Follow the last page of the tutorial to clean up resources that you have created.
