Use online inference logging
For AutoML tabular models, AutoML image models, and custom-trained models, you can enable or disable inference logs during model deployment or endpoint creation. This page explains the different types of inference logs available, and how to enable or disable them.
Types of inference logs
There are several types of inference logs that you can use to get information from your inference nodes:
- Container logging, which logs the stdout and stderr streams from your inference nodes to Cloud Logging. These logs are required for debugging.
  - On the v1 service endpoint, container logging is enabled by default. You can disable it when you deploy a model, and you can disable or enable logging when you mutate the deployed model.
  - On the v1beta1 service endpoint, container logging is disabled by default. You can enable it when you deploy a model, and you can disable or enable logging when you mutate the deployed model.
  - Output that your container writes to stderr appears at the ERROR level in Cloud Logging. If you'd like container logs to appear at the INFO level, configure your container to send output to stdout, as shown in the sketch after this list. For more information, see the Python Logging handlers tutorials and the Python Logging Cookbook.
- Access logging, which logs information such as the timestamp and latency of each request to Cloud Logging. On both the v1 and v1beta1 service endpoints, access logging is disabled by default. You can enable access logging when you deploy a model to an endpoint.
- Request-response logging, which logs a sample of online inference requests and responses to a BigQuery table. You can enable request-response logging by creating or patching the inference endpoint.
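The stdout approach mentioned above can be set up with the Python standard library; the following is a minimal sketch (the logger name and message are arbitrary examples, not anything Vertex AI requires):

```python
import logging
import sys

# Route application logs to stdout so that container logging surfaces
# them at the INFO level in Cloud Logging rather than at ERROR.
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))

logger = logging.getLogger("inference_server")  # arbitrary logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)

logger.info("model loaded; ready to serve inference requests")
```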
You can enable or disable each type of log independently.
Inference log settings
You can enable or disable online inference logs when you create an endpoint, deploy a model to the endpoint, or mutate a deployed model.
To update the settings for access logs, you must undeploy your model and then redeploy it with your new settings. You can update the settings for container logs without redeploying your model.
Online inference at a high rate of queries per second (QPS) can produce a substantial number of logs, which are subject to Cloud Logging pricing. To estimate the pricing for your online inference logs, see Estimating your bills for logging. To reduce this cost, you can disable inference logging.
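As a rough back-of-the-envelope sketch of why QPS matters here: with access logging, each request produces a log entry, so volume scales linearly with traffic. The per-entry size below is a hypothetical placeholder, not a published figure; measure your own entries and check the Cloud Logging pricing page before relying on it.

```python
# Rough estimate of monthly access-log volume for one endpoint.
QPS = 100                 # sustained queries per second
AVG_ENTRY_BYTES = 1_000   # assumed average log entry size (placeholder)
SECONDS_PER_MONTH = 60 * 60 * 24 * 30

entries_per_month = QPS * SECONDS_PER_MONTH
gib_per_month = entries_per_month * AVG_ENTRY_BYTES / 2**30

print(f"{entries_per_month:,} entries, about {gib_per_month:.1f} GiB/month")
```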
Enable and disable inference logs
The following examples highlight where to modify the default log settings:
Console
When you deploy a model to an endpoint or create a new endpoint in the Google Cloud console, you can specify which types of inference logs to enable in the Logging step. Select the checkboxes to enable Access logging or Container logging, or clear the checkboxes to disable these logs.
Use the REST API to update the settings for container logs.
Use the REST API to enable request-response logging. TheGoogle Cloud console and gcloud CLI don't support request-responselogging configuration.
To see more context about how to deploy models, read Deploy a model using the Google Cloud console.
gcloud
To change the default behavior for which logs are enabled in deployed models, add flags to your gcloud command:
v1 service endpoint
Run gcloud ai endpoints deploy-model:
```sh
gcloud ai endpoints deploy-model ENDPOINT_ID \
  --region=LOCATION \
  --model=MODEL_ID \
  --display-name=DEPLOYED_MODEL_NAME \
  --machine-type=MACHINE_TYPE \
  --accelerator=count=2,type=nvidia-tesla-t4 \
  --disable-container-logging \
  --enable-access-logging
```

v1beta1 service endpoint
Run gcloud beta ai endpoints deploy-model:
```sh
gcloud beta ai endpoints deploy-model ENDPOINT_ID \
  --region=LOCATION \
  --model=MODEL_ID \
  --display-name=DEPLOYED_MODEL_NAME \
  --machine-type=MACHINE_TYPE \
  --accelerator=count=2,type=nvidia-tesla-t4 \
  --enable-access-logging \
  --enable-container-logging
```

Use the REST API to update the settings for container logs.
Use the REST API to enable request-response logging. TheGoogle Cloud console and gcloud CLI don't support request-responselogging configuration.
To see more context about how to deploy models, read Deploy a model using the Vertex AI API.
REST
To change the default behavior for which logs are enabled in deployed models, set the relevant fields to True:
v1 service endpoint
- To disable container logging, set the disableContainerLogging field to True when you call either projects.locations.endpoints.deployModel or projects.locations.endpoints.mutateDeployedModel.
- To enable access logging, set enableAccessLogging to True when deploying your model with projects.locations.endpoints.deployModel.
v1beta1 service endpoint
- To enable container logging, set the enableContainerLogging field to True when you call either projects.locations.endpoints.deployModel or projects.locations.endpoints.mutateDeployedModel.
- To enable access logging, set enableAccessLogging to True when deploying your model with projects.locations.endpoints.deployModel.
To see more context about how to deploy models, read Deploy a model using the Vertex AI API.
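If you deploy with the Vertex AI SDK for Python instead of calling the REST API directly, recent SDK versions expose matching arguments on Model.deploy. The following is a minimal sketch, assuming your installed google-cloud-aiplatform version supports the enable_access_logging and disable_container_logging parameters; the project, region, and resource IDs are placeholders:

```python
from google.cloud import aiplatform

# Placeholders; substitute your own project, region, and resource IDs.
aiplatform.init(project="PROJECT_ID", location="us-central1")

model = aiplatform.Model("MODEL_ID")
endpoint = aiplatform.Endpoint("ENDPOINT_ID")

# Mirrors the v1 REST fields disableContainerLogging and enableAccessLogging.
model.deploy(
    endpoint=endpoint,
    deployed_model_display_name="DEPLOYED_MODEL_NAME",
    machine_type="n1-standard-4",
    disable_container_logging=True,  # container logging is on by default on v1
    enable_access_logging=True,      # access logging is off by default
)
```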
Request-response logging
You can only enable request-response logging when you create an endpoint using projects.locations.endpoints.create or patch an existing endpoint using projects.locations.endpoints.patch.

Request-response logging is done at the endpoint level, so requests sent to any deployed models under the same endpoint are logged.
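As an alternative to calling the REST API directly, newer versions of the Vertex AI SDK for Python expose request-response logging on Endpoint.create. This is a sketch under the assumption that your SDK version supports the enable_request_response_logging, request_response_logging_sampling_rate, and request_response_logging_bq_destination_table parameters; all IDs are placeholders:

```python
from google.cloud import aiplatform

aiplatform.init(project="PROJECT_ID", location="us-central1")

# Create an endpoint that samples 10% of requests and responses
# into the given BigQuery table.
endpoint = aiplatform.Endpoint.create(
    display_name="ENDPOINT_NAME",
    enable_request_response_logging=True,
    request_response_logging_sampling_rate=0.1,
    request_response_logging_bq_destination_table=(
        "bq://PROJECT_ID.DATASET_NAME.TABLE_NAME"
    ),
)
```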
When you create or patch an endpoint, populate the predictRequestResponseLoggingConfig field of the Endpoint resource with the following entries:
- enabled: set to True to enable request-response logging.
- samplingRate: a number between 0 and 1 defining the fraction of requests to log. For example, set this value to 1 to log all requests or to 0.1 to log 10% of requests.
- bigqueryDestination: the BigQuery table to be used for logging. If you only specify a project name, a new dataset is created with the name logging_ENDPOINT_DISPLAY_NAME_ENDPOINT_ID, where ENDPOINT_DISPLAY_NAME follows the BigQuery naming rules. If you don't specify a table name, a new table is created with the name request_response_logging.

The schema for the BigQuery table looks like the following:
| Field name | Type | Mode |
| --- | --- | --- |
| endpoint | STRING | NULLABLE |
| deployed_model_id | STRING | NULLABLE |
| logging_time | TIMESTAMP | NULLABLE |
| request_id | NUMERIC | NULLABLE |
| request_payload | STRING | REPEATED |
| response_payload | STRING | REPEATED |
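Once requests are flowing, you can inspect the sampled payloads directly in BigQuery. Here is a minimal sketch using the google-cloud-bigquery client, assuming the default table name and a dataset named per the convention described above (both hypothetical placeholders):

```python
from google.cloud import bigquery

client = bigquery.Client(project="PROJECT_ID")  # placeholder project ID

# Placeholder dataset/table names following the defaults described above.
query = """
    SELECT logging_time, deployed_model_id, request_payload, response_payload
    FROM `PROJECT_ID.logging_ENDPOINT_DISPLAY_NAME_ENDPOINT_ID.request_response_logging`
    ORDER BY logging_time DESC
    LIMIT 10
"""

for row in client.query(query).result():
    # request_payload and response_payload are REPEATED STRING columns,
    # so each row carries lists of serialized payloads.
    print(row.logging_time, row.deployed_model_id, row.request_payload)
```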
The following is an example configuration:
{ "predict_request_response_logging_config": { "enabled": true, "sampling_rate": 0.5, "bigquery_destination": { "output_uri": "bq://PROJECT_ID.DATASET_NAME.TABLE_NAME" } }}Inference request-response logging for dedicated endpoints and Private Service Connect endpoints
For dedicated endpoints and Private Service Connect endpoints, you can use request-response logging to record request and response payloads under 10 MB (larger payloads are skipped automatically) for TensorFlow, PyTorch, sklearn, and XGBoost models.
Request-response logging is available only for the predict and rawPredict methods.
To enable request-response logging, populate the predictRequestResponseLoggingConfig field of the Endpoint resource with the following entries:
- enabled: set to True to enable request-response logging.
- samplingRate: the fraction of requests and responses to log. Set to a number that is greater than 0 and less than or equal to 1. For example, set this value to 1 to log all requests or to 0.1 to log 10% of requests.
- bigqueryDestination: the BigQuery location for the output content, as a URI to a project or table.
The following is an example configuration for creating a dedicated endpoint withrequest-response logging enabled:
```sh
curl -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer `gcloud auth print-access-token`" \
  https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION_ID/endpoints \
  -d '{
    "displayName": "ENDPOINT_NAME",
    "dedicatedEndpointEnabled": true,
    "predictRequestResponseLoggingConfig": {
      "enabled": true,
      "samplingRate": 1.0,
      "bigqueryDestination": {
        "outputUri": "bq://PROJECT_ID"
      }
    }
  }'
```

Replace the following:
- LOCATION_ID: The region where you are using Vertex AI.
- PROJECT_NUMBER: The project number for your Google Cloudproject.
- ENDPOINT_NAME: The display name for the endpoint.
- PROJECT_ID: The project ID for your Google Cloud project.
The following is an example configuration for creating a Private Service Connect endpoint with request-response logging enabled:
```sh
curl -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer `gcloud auth print-access-token`" \
  https://LOCATION_ID-aiplatform.googleapis.com/v1/projects/PROJECT_NUMBER/locations/LOCATION_ID/endpoints \
  -d '{
    "displayName": "ENDPOINT_NAME",
    "privateServiceConnectConfig": {
      "enablePrivateServiceConnect": true,
      "projectAllowlist": ["ALLOWED_PROJECTS"]
    },
    "predictRequestResponseLoggingConfig": {
      "enabled": true,
      "samplingRate": 1.0,
      "bigqueryDestination": {
        "outputUri": "bq://PROJECT_ID"
      }
    }
  }'
```

Replace the following:
- ALLOWED_PROJECTS: a comma-separated list of Google Cloud project IDs, each enclosed in quotation marks. For example, ["PROJECTID1", "PROJECTID2"]. If a project isn't contained in this list, you won't be able to send inference requests to the Vertex AI endpoint from it. Make sure to include VERTEX_AI_PROJECT_ID in this list so that you can call the endpoint from the same project it's in.
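After the endpoint is created and a model is deployed to it, sampled calls to predict and rawPredict land in the configured BigQuery table. The following is a minimal sketch with the Vertex AI SDK for Python; the endpoint ID is a placeholder and the instance shape is a hypothetical example that depends entirely on your model's input schema:

```python
from google.cloud import aiplatform

aiplatform.init(project="PROJECT_ID", location="us-central1")
endpoint = aiplatform.Endpoint("ENDPOINT_ID")  # placeholder endpoint ID

# With samplingRate=1.0, every request/response pair under 10 MB is
# written to the configured BigQuery table.
response = endpoint.predict(instances=[{"feature_a": 1.0, "feature_b": 2.0}])
print(response.predictions)
```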
Request-response logging and Model Monitoring v1
Request-response logging and Model Monitoring v1 use the same BigQuery table on the backend to log incoming requests. To prevent unexpected changes to this BigQuery table, the following limitations are enforced when you use both features at the same time:
- If an endpoint has Model Monitoring enabled, you can't enable request-response logging for the same endpoint.
- If you enable request-response logging and then Model Monitoring on the same endpoint, you can't change the request-response logging configuration.
What's next
- Estimate pricing for online inference logging.
- Deploy a model using the Google Cloud console or using the Vertex AI API.
- Learn how to create a BigQuery table.