gcloud container ai profiles benchmarks list
- NAME
- gcloud container ai profiles benchmarks list - list benchmarks for a given model and model server
- SYNOPSIS
gcloud container ai profiles benchmarks list --model=MODEL --model-server=MODEL_SERVER [--format=FORMAT] [--instance-type=INSTANCE_TYPE] [--model-server-version=MODEL_SERVER_VERSION] [--pricing-model=PRICING_MODEL] [--serving-stack=SERVING_STACK] [--serving-stack-version=SERVING_STACK_VERSION] [--use-case=USE_CASE] [--filter=EXPRESSION] [--limit=LIMIT] [--page-size=PAGE_SIZE] [--sort-by=[FIELD,…]] [--uri] [GCLOUD_WIDE_FLAG …]
- DESCRIPTION
- This command lists all benchmarking data for a given model and model server. By default, the benchmarks are displayed in a CSV format.
For examples of visualizing the benchmarking data, see the accompanying Colab notebook: https://colab.research.google.com/github/GoogleCloudPlatform/kubernetes-engine-samples/blob/main/ai-ml/notebooks/giq_visualizations.ipynb
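As a minimal sketch of the simplest invocation, the shell function below wraps the command with only its two required flags. The function name `list_benchmarks` is a hypothetical helper and not part of gcloud; the model and model server values are supplied by the caller.

```shell
# Hypothetical helper (not part of gcloud): lists benchmarks for one
# model / model server pair using only the two required flags.
#   $1 = model, $2 = model server
list_benchmarks() {
  if [ "$#" -ne 2 ]; then
    echo "usage: list_benchmarks MODEL MODEL_SERVER" >&2
    return 2
  fi
  gcloud container ai profiles benchmarks list \
    --model="$1" \
    --model-server="$2"
}
```

Calling `list_benchmarks MODEL MODEL_SERVER > benchmarks.csv` would save the default CSV output to a file; `benchmarks.csv` is an arbitrary file name chosen for illustration.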
- REQUIRED FLAGS
--model=MODEL- The model.
--model-server=MODEL_SERVER- The model server.
- FLAGS
--format=FORMAT- The format to print the output in. Default is csvprofile, which displays the profile information in a CSV format, including cost conversions.
--instance-type=INSTANCE_TYPE- The instance type. If not specified, this defaults to any instance type.
--model-server-version=MODEL_SERVER_VERSION- The model server version. Default is latest. Other options include the model server version of a profile, or all, which returns all versions.
--pricing-model=PRICING_MODEL- The pricing model to use to calculate token cost. Currently, this supports on-demand, spot, 3-years-cud, and 1-year-cud.
--serving-stack=SERVING_STACK- The serving stack to filter benchmarking data by. If not provided, benchmarking data for all serving stacks that support the given model and model server will be returned.
--serving-stack-version=SERVING_STACK_VERSION- The serving stack version to filter benchmarking data by. If not provided, benchmarking data for all versions that support the given model and model server will be returned.
--use-case=USE_CASE- If specified, results will only show profiles that match the provided use case. Options are: Advanced Customer Support, Code Completion, Text Summarization, Chatbot (ShareGPT), Code Generation, Deep Research.
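The optional flags above can be combined with the required ones. The sketch below narrows results to one use case and selects a pricing model for the token cost calculation. The function name `list_benchmarks_for_use_case` is a hypothetical helper, and all four values are passed in by the caller.

```shell
# Hypothetical helper (not part of gcloud): benchmarks for a model/server
# pair, narrowed to one use case and priced with one pricing model.
#   $1 = model, $2 = model server, $3 = use case, $4 = pricing model
list_benchmarks_for_use_case() {
  gcloud container ai profiles benchmarks list \
    --model="$1" \
    --model-server="$2" \
    --use-case="$3" \
    --pricing-model="$4"
}
```

An example call is `list_benchmarks_for_use_case MODEL MODEL_SERVER "Chatbot (ShareGPT)" spot`. The use case is quoted because it contains a space and parentheses, which the shell would otherwise interpret.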
- LIST COMMAND FLAGS
--filter=EXPRESSION- Apply a Boolean filter EXPRESSION to each resource item to be listed. If the expression evaluates True, then that item is listed. For more details and examples of filter expressions, run $ gcloud topic filters. This flag interacts with other flags that are applied in this order: --flatten, --sort-by, --filter, --limit.
--limit=LIMIT- Maximum number of resources to list. The default is unlimited. This flag interacts with other flags that are applied in this order: --flatten, --sort-by, --filter, --limit.
--page-size=PAGE_SIZE- Some services group resource list output into pages. This flag specifies the maximum number of resources per page. The default is determined by the service if it supports paging, otherwise it is unlimited (no paging). Paging may be applied before or after --filter and --limit depending on the service.
--sort-by=[FIELD,…]- Comma-separated list of resource field key names to sort by. The default order is ascending. Prefix a field with "~" for descending order on that field. This flag interacts with other flags that are applied in this order: --flatten, --sort-by, --filter, --limit.
--uri- Print a list of resource URIs instead of the default output, and change the command output to a list of URIs. If this flag is used with --format, the formatting is applied on this URI list. To display URIs alongside other keys instead, use the uri() transform.
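Because --sort-by is applied before --limit, the two flags together can return the first few rows of a sorted listing. The sketch below sorts in descending order on a caller-supplied field and keeps the first N results. The function name `top_benchmarks` is a hypothetical helper, and this page does not list the available field key names, so the field is left as a parameter.

```shell
# Hypothetical helper (not part of gcloud): sort descending on one field,
# then keep only the first N results.
#   $1 = model, $2 = model server, $3 = field key name, $4 = max results
top_benchmarks() {
  gcloud container ai profiles benchmarks list \
    --model="$1" \
    --model-server="$2" \
    --sort-by="~$3" \
    --limit="$4"
}
```

The "~" prefix is kept inside double quotes so that the shell passes it through literally.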
- GCLOUD WIDE FLAGS
- These flags are available to all commands:
--access-token-file, --account, --billing-project, --configuration, --flags-file, --flatten, --format, --help, --impersonate-service-account, --log-http, --project, --quiet, --trace-token, --user-output-enabled, --verbosity. Run $ gcloud help for details.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies.
Last updated 2025-12-09 UTC.