gcloud container ai profiles

NAME
gcloud container ai profiles - quickstart engine for GKE AI workloads
SYNOPSIS
gcloud container ai profiles GROUP | COMMAND [GCLOUD_WIDE_FLAG]
DESCRIPTION
The GKE Inference Quickstart helps simplify deploying AI inference on Google Kubernetes Engine (GKE). It provides tailored profiles based on Google's internal benchmarks. Provide inputs like your preferred open-source model (e.g. Llama, Gemma, or Mistral) and your application's performance target. Based on these inputs, the quickstart generates accelerator choices with performance metrics, and detailed, ready-to-deploy profiles for compute, load balancing, and autoscaling. These profiles are provided as standard Kubernetes YAML manifests, which you can deploy or modify.

To visualize the benchmarking data that support these estimates, see the accompanying Colab notebook: https://colab.research.google.com/github/GoogleCloudPlatform/kubernetes-engine-samples/blob/main/ai-ml/notebooks/giq_visualizations.ipynb

GCLOUD WIDE FLAGS
These flags are available to all commands: --help.

Run $ gcloud help for details.

GROUPS
GROUP is one of the following:
benchmarks
Manage benchmarks for GKE Inference Quickstart.
manifests
Generate optimized Kubernetes manifests.
model-server-versions
Manage supported model server versions for GKE Inference Quickstart.
model-servers
Manage supported model servers for GKE Inference Quickstart.
models
Manage supported models for GKE Inference Quickstart.
serving-stack-versions
List supported serving stack versions for GKE Inference Quickstart.
serving-stacks
List supported serving stacks for GKE Inference Quickstart.
use-case
List supported use cases for GKE Inference Quickstart.
COMMANDS
COMMAND is one of the following:
list
List compatible accelerator profiles.
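EXAMPLES
The following is a minimal sketch, not an official example, of how you might explore this command group. It uses only the command and group names listed on this page, copied as they appear here, together with the --help flag. The helper function help_cmd and the DRY_RUN variable are hypothetical names introduced for illustration. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: show help for the one listed command and each listed subgroup.
# Only names that appear on this page and the --help flag are used.
BASE="gcloud container ai profiles"
GROUP_NAMES="benchmarks manifests model-server-versions model-servers models serving-stack-versions serving-stacks use-case"

# Hypothetical helper: build the help command line for a command or group name.
help_cmd() {
  printf '%s %s --help\n' "$BASE" "$1"
}

# Hypothetical switch: DRY_RUN=1 (default) prints the commands;
# DRY_RUN=0 runs them, which assumes gcloud is installed and on PATH.
DRY_RUN="${DRY_RUN:-1}"

for name in list $GROUP_NAMES; do
  cmd=$(help_cmd "$name")
  if [ "$DRY_RUN" = "1" ]; then
    echo "$cmd"
  else
    # Intentionally unquoted so the string splits into command and arguments.
    $cmd
  fi
done
```

Printing the commands first makes it easy to check each name before running anything. To run them for real, set DRY_RUN=0 in the environment.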
NOTES
This variant is also available:
gcloud alpha container ai profiles

Last updated 2025-12-09 UTC.