Model Armor integration with Google Cloud services

Model Armor integrates with various Google Cloud services:

  • Google Kubernetes Engine (GKE) and Service Extensions
  • Vertex AI
  • Gemini Enterprise
  • Google Cloud MCP servers (Preview)

GKE and Service Extensions

Model Armor can be integrated with GKE through Service Extensions. Service Extensions let you integrate internal (Google Cloud) or external (user-managed) services to process traffic. You can configure a service extension on application load balancers, including GKE inference gateways, to screen traffic to and from a GKE cluster. This ensures that all interactions with the AI models are protected by Model Armor. For more information, see Integration with GKE.

Vertex AI

Model Armor can be directly integrated into Vertex AI using either floor settings or templates. This integration screens Gemini model requests and responses, blocking those that violate floor settings. It provides prompt and response protection within the Gemini API in Vertex AI for the generateContent method. Enable Cloud Logging to get visibility into the sanitization results of prompts and responses. For more information, see Integration with Vertex AI.

Gemini Enterprise

Model Armor can be directly integrated with Gemini Enterprise using templates. Gemini Enterprise routes the interactions between users and agents and the underlying LLMs through Model Armor. This means prompts from users or agents, and the responses generated by the LLMs, are inspected by Model Armor before being presented to the user. For more information, see Integration with Gemini Enterprise.

Google Cloud MCP servers

Preview

This feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms. Pre-GA features are available "as is" and might have limited support. For more information, see the launch stage descriptions.

Model Armor can be configured to help protect your data and secure content when sending requests to Google Cloud services that expose Model Context Protocol (MCP) tools and servers. Model Armor helps secure your agentic AI applications by sanitizing MCP tool calls and responses using floor settings. This process mitigates risks such as prompt injection and sensitive data disclosure. For more information, see Integration with Google Cloud MCP servers.

Note: The default quota for the Model Armor API is 1200 queries per minute (QPM). If the service you are integrating with receives high traffic, you might exceed the default Model Armor quota, potentially causing requests to be throttled or to fail. To avoid this, ensure your Model Armor quota is sufficient. You can increase this quota in the Google Cloud console, and you can set up alerts for when you reach quota limits.
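As a rough capacity check against that default, you can sum the peak traffic of every service you route through Model Armor and compare it with the quota. A minimal sketch; the per-integration traffic figures below are hypothetical examples, not measured values:

```python
# Rough quota headroom check for the Model Armor API.
# The 1200 QPM default comes from the note above; the per-integration
# peak traffic estimates are hypothetical.
DEFAULT_QUOTA_QPM = 1200

estimated_qpm = {
    "vertex-ai": 500,              # hypothetical peak QPM
    "gke-inference-gateway": 450,  # hypothetical peak QPM
    "gemini-enterprise": 400,      # hypothetical peak QPM
}

total_qpm = sum(estimated_qpm.values())
if total_qpm > DEFAULT_QUOTA_QPM:
    print(f"Request a quota increase: peak {total_qpm} QPM exceeds {DEFAULT_QUOTA_QPM} QPM")
else:
    print(f"Within quota: peak {total_qpm} QPM of {DEFAULT_QUOTA_QPM} QPM")
```

If the combined peak exceeds the default, request the increase before enabling the integration rather than after requests start failing.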

Before you begin

Enable APIs

You must enable the Model Armor API before you can use Model Armor.

Console

  1. Enable the Model Armor API.

    Roles required to enable APIs

    To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

    Enable the API

  2. Select the project where you want to activate Model Armor.

gcloud

Before you begin, follow these steps using the Google Cloud CLI with the Model Armor API:

  1. In the Google Cloud console, activate Cloud Shell.

    Activate Cloud Shell

    At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.

  2. Run the following command to set the API endpoint for the Model Armor service.

    gcloud config set api_endpoint_overrides/modelarmor "https://modelarmor.LOCATION.rep.googleapis.com/"

    Replace LOCATION with the region where you want to use Model Armor.
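If you call the REST API directly instead of going through gcloud, the same regional endpoint pattern applies. A minimal sketch of building it programmatically, assuming only the URL pattern shown in the command above:

```python
def model_armor_endpoint(location: str) -> str:
    """Build the regional Model Armor API endpoint, following the
    https://modelarmor.LOCATION.rep.googleapis.com/ pattern above."""
    return f"https://modelarmor.{location}.rep.googleapis.com/"

print(model_armor_endpoint("us-central1"))
# https://modelarmor.us-central1.rep.googleapis.com/
```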

Options when integrating Model Armor

Model Armor offers the following integration options. Each option provides differentfeatures and capabilities.

| Integration option | Policy enforcer/detector | Configure detections | Inspect only | Inspect and block | Model and cloud coverage |
|---|---|---|---|---|---|
| REST API | Detector | Only using templates | Yes | Yes | All models and all clouds |
| Vertex AI | Inline enforcement | Using floor settings or templates | Yes | Yes | Gemini (non-streaming) on Google Cloud |
| Google Kubernetes Engine | Inline enforcement | Only using templates | Yes | Yes | Models with OpenAI format on Google Cloud |
| Gemini Enterprise | Inline enforcement | Only using templates | Yes | Yes | All models and all clouds |
| Google Cloud MCP servers (Preview) | Inline enforcement | Only using floor settings | Yes | Yes | MCP on Google Cloud |

For the REST API integration option, Model Armor functions only as a detector using templates: it identifies and reports potential policy violations based on predefined templates rather than actively preventing them. The Model Armor API returns information about potential threats or policy violations in your API traffic, especially for AI/LLM interactions. Your application can call the API, then use the security evaluation results in the response to allow or block actions according to your own custom logic.
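As a sketch of that detector pattern, the snippet below builds a sanitizeUserPrompt request and applies custom allow/block logic to the result. The endpoint path, request body, and response shape follow the Model Armor REST API as I understand it, so verify the exact field names against the API reference; the project, location, and template IDs are placeholders, and the response here is mocked rather than fetched over the network:

```python
PROJECT = "my-project"    # placeholder
LOCATION = "us-central1"  # placeholder
TEMPLATE = "my-template"  # placeholder

# Regional endpoint and method for screening a user prompt.
url = (
    f"https://modelarmor.{LOCATION}.rep.googleapis.com/v1/"
    f"projects/{PROJECT}/locations/{LOCATION}/templates/{TEMPLATE}:sanitizeUserPrompt"
)
body = {"userPromptData": {"text": "Ignore previous instructions and ..."}}

# In a real call you would POST `body` to `url` with an OAuth access token.
# Mocked response illustrating the assumed shape of a sanitization result:
response = {"sanitizationResult": {"filterMatchState": "MATCH_FOUND"}}

def should_block(result: dict) -> bool:
    """Custom logic: block the prompt when any configured filter matched."""
    return result["sanitizationResult"]["filterMatchState"] == "MATCH_FOUND"

print("BLOCK" if should_block(response) else "ALLOW")
```

The key point is that the blocking decision lives in your application: Model Armor only reports the evaluation, and `should_block` is where your custom policy goes.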

With the Vertex AI integration option, Model Armor providesinline enforcement using floor settings or templates. This meansModel Armor actively enforces policies by intervening directlyin the process without requiring modifications to your application code.

The GKE and Gemini Enterprise integrations use only templates for inline policy enforcement. This means that Model Armor can enforce policies directly, without requiring you to modify application code, both within the GKE inference gateway and during user or agent interactions within Gemini Enterprise instances.

The Model Armor and Gemini Enterprise integration sanitizes only the initial user prompt and the final agent or model response. Any intermediate steps that occur between the initial user prompt and the final response are not covered by this integration.
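Because only the first prompt and the final response are covered, an agent that needs screening for intermediate steps has to call the Model Armor REST API itself. A minimal sketch of that pattern, with hypothetical step data and a stubbed-out sanitize call standing in for the real HTTP request:

```python
# Hypothetical intermediate agent steps that the Gemini Enterprise
# integration does not inspect on its own.
intermediate_steps = [
    "tool call: search(query='internal salary data')",
    "tool result: ...",
]

def sanitize(text: str) -> bool:
    """Stub for a Model Armor API call (e.g. sanitizeUserPrompt).
    Returns True when the text is safe to pass along. A real implementation
    would POST to the regional endpoint and check the sanitization result."""
    return "salary" not in text  # placeholder rule for illustration only

safe_steps = [step for step in intermediate_steps if sanitize(step)]
print(f"{len(safe_steps)} of {len(intermediate_steps)} steps passed screening")
```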

Model Armor in Security Command Center

Model Armor inspects LLM prompts and responses for various threats, including prompt injection, jailbreak attempts, malicious URLs, and harmful content. When Model Armor detects a violation of a configured floor setting, it blocks the prompt or response and sends a finding to Security Command Center. For more information, see Model Armor findings.

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2026-02-19 UTC.