Deployments and endpoints

Google and Partner models and generative AI features on Vertex AI are exposed as specific regional endpoints and a global endpoint. Global endpoints cover the entire world and provide higher availability and reliability than single regions.

Important: Endpoints don't guarantee data residency or in-region ML processing. For information about data residency, see Data residency.

Global endpoint

Selecting a global endpoint for your requests can improve overall availability while reducing resource exhausted (429) errors. Don't use the global endpoint if you have requirements on where ML processing takes place, because you can't control or know which region your ML processing requests are sent to when a request is made.
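The guidance above can be sketched as a small helper. This is a minimal illustration, not part of any SDK: the function name `choose_location` and the default region `us-central1` are assumptions made for the example.

```python
def choose_location(needs_regional_processing: bool,
                    region: str = "us-central1") -> str:
    """Pick the Vertex AI location for a request.

    Hypothetical helper: returns the caller's region when ML processing
    must stay in a known region, otherwise the global endpoint location.
    """
    if needs_regional_processing:
        # The global endpoint gives no control over the processing region.
        return region
    return "global"
```

The returned string is the value you would pass as `location` when creating a client or initializing the SDK.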

Supported models

Usage of the global endpoint is supported for the following Google models in specified regions. For details about which regions support the global endpoint, see the Global tab in the Google model endpoint locations table.

For information about global endpoint availability for partner models, see the Global tab in the Google Cloud partner model endpoint locations table.

Use the global endpoint

To use the global endpoint, exclude the location from the endpoint name and configure the location of the resource to global. For example, the following is a global endpoint URL:

https://aiplatform.googleapis.com/v1/projects/test-project/locations/global/publishers/google/models/gemini-2.0-flash-001:generateContent
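As a rough sketch of how the URL above is put together, the following helper builds a `generateContent` URL for either the global or a regional location. The function name is hypothetical, and the region-prefixed host form (`REGION-aiplatform.googleapis.com`) is an assumption that this page does not state explicitly.

```python
def build_generate_content_url(project: str, location: str, model: str) -> str:
    """Build a generateContent URL for a Google publisher model.

    Hypothetical helper for illustration only.
    """
    if location == "global":
        # Global endpoint: no location prefix in the host name.
        host = "aiplatform.googleapis.com"
    else:
        # Assumed regional form: the region is prefixed to the host name.
        host = f"{location}-aiplatform.googleapis.com"
    return (
        f"https://{host}/v1/projects/{project}/locations/{location}"
        f"/publishers/google/models/{model}:generateContent"
    )
```

Calling it with `test-project`, `global`, and `gemini-2.0-flash-001` reproduces the example URL shown above.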

For the Google Gen AI SDK, create a client that uses the global location:

from google import genai
client = genai.Client(vertexai=True, project='PROJECT_ID', location='global')

For the Vertex AI SDK for Python, initialize the SDK using the global location:

import vertexai
from vertexai.generative_models import GenerativeModel
vertexai.init(project='PROJECT_ID', location='global')

Limitations

The following capabilities are not available when using the global endpoint:

  • Tuning
  • Batch prediction for Anthropic and OpenMaaS models
  • Retrieval-augmented generation (RAG) corpus (RAG requests are supported)

Usage of the global endpoint with Provisioned Throughput is available only for the following models:

Model                             Latest supported model version
Gemini 3 Flash (preview)          gemini-3-flash-preview
Gemini 3 Pro (preview)            gemini-3-pro-preview
Gemini 3 Pro Image (preview)      gemini-3-pro-image-preview
Gemini 2.5 Pro                    gemini-2.5-pro
Gemini 2.5 Flash (preview)        gemini-2.5-flash-preview-09-2025
Gemini 2.5 Flash-Lite (preview)   gemini-2.5-flash-lite-preview-09-2025
Gemini 2.5 Flash Image            gemini-2.5-flash-image
Gemini 2.5 Flash                  gemini-2.5-flash
Gemini 2.5 Flash-Lite             gemini-2.5-flash-lite
Gemini 2.0 Flash                  gemini-2.0-flash-001
Gemini 2.0 Flash-Lite             gemini-2.0-flash-lite-001
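
One way to guard against sending Provisioned Throughput traffic for an unsupported model to the global endpoint is a simple lookup. The sketch below copies the model versions from the table above into a set. The constant and function names are hypothetical, and the set would need to be kept in sync with the table by hand.

```python
# Model versions listed in the table above (as of this page's last update).
GLOBAL_PT_MODELS = frozenset({
    "gemini-3-flash-preview",
    "gemini-3-pro-preview",
    "gemini-3-pro-image-preview",
    "gemini-2.5-pro",
    "gemini-2.5-flash-preview-09-2025",
    "gemini-2.5-flash-lite-preview-09-2025",
    "gemini-2.5-flash-image",
    "gemini-2.5-flash",
    "gemini-2.5-flash-lite",
    "gemini-2.0-flash-001",
    "gemini-2.0-flash-lite-001",
})


def supports_global_provisioned_throughput(model_id: str) -> bool:
    """Return True if the model version appears in the table above.

    Hypothetical helper: a local check only, it does not query Vertex AI.
    """
    return model_id in GLOBAL_PT_MODELS
```

A caller could run this check before routing a Provisioned Throughput request to the `global` location and fall back to a regional location otherwise.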

Google model endpoint locations

Google model endpoints for Generative AI on Vertex AI are available in the following regions.

United States

Columbus, Ohio (us-east5); Dallas, Texas (us-south1); Iowa (us-central1); Las Vegas, Nevada (us-west4); Moncks Corner, South Carolina (us-east1); Northern Virginia (us-east4); Oregon (us-west1)
Gemini 3 Pro
(gemini-3-pro-preview)
Gemini 3 Pro Image
(gemini-3-pro-image-preview)
Gemini 2.5 Flash with Gemini Live API native audio
(gemini-live-2.5-flash)
Gemini 2.5 Flash with Gemini Live API native audio
(gemini-live-2.5-flash-native-audio)
Gemini 2.5 Flash Image
(gemini-2.5-flash-image)
Gemini 2.5 Flash
(gemini-2.5-flash)
Gemini 2.5 Pro
(gemini-2.5-pro)
Gemini 2.5 Flash-Lite
(gemini-2.5-flash-lite)
Gemini 2.0 Flash
(gemini-2.0-flash-001)
Gemini 2.0 Flash-Lite
(gemini-2.0-flash-lite-001)
Gemini Embeddings
(gemini-embedding-001)
Embeddings for Text
Embeddings for Multimodal
Imagen
(imagegeneration@002)
Imagen 2
(imagegeneration@005)
Imagen 2
(imagegeneration@006)
Imagen 3
(imagen-3.0-generate-001)
Imagen 3 Fast
(imagen-3.0-fast-generate-001)
Imagen 3 Editing and Customization
(imagen-3.0-capability-001)
Imagen 3
(imagen-3.0-generate-002)
Imagen 4
(imagen-4.0-generate-001)
Imagen 4
(imagen-4.0-fast-generate-001)
Imagen 4 Ultra Generate experimental
(imagen-4.0-ultra-generate-001)
Veo 2
(veo-2.0-generate-001)
Veo 3
(veo-3.0-generate-001)
Veo 3 Fast
(veo-3.0-fast-generate-001)
Veo 3 (Preview)
(veo-3.0-generate-preview)
Veo 3 Fast (Preview)
(veo-3.0-fast-generate-preview)
Veo 3.1
(veo-3.1-generate-001)
Veo 3.1 Fast
(veo-3.1-fast-generate-001)
Veo 3.1 (Preview)
(veo-3.1-generate-preview)
Veo 3.1 Fast (Preview)
(veo-3.1-fast-generate-preview)
Chirp 3: Transcription (chirp_3)
Chirp 2: Transcription (chirp_2)
Gemini 2.5 Flash TTS
(gemini-2.5-flash-tts)
Gemini 2.5 Flash Lite Preview TTS
(gemini-2.5-flash-lite-preview-tts)
Gemini 2.5 Pro TTS
(gemini-2.5-pro-tts)
Chirp 3: HD Voices
Chirp 3: Instant Custom Voice

Canada

Montréal (northamerica-northeast1)
Gemini 3 Pro
(gemini-3-pro-preview)
Gemini 3 Pro Image
(gemini-3-pro-image-preview)
Gemini 2.5 Flash with Gemini Live API native audio
(gemini-live-2.5-flash)
Gemini 2.5 Flash with Gemini Live API native audio
(gemini-live-2.5-flash-native-audio)
Gemini 2.5 Flash Image
(gemini-2.5-flash-image)
Gemini 2.5 Flash
(gemini-2.5-flash)
Gemini 2.5 Pro
(gemini-2.5-pro)
Gemini 2.5 Flash-Lite
(gemini-2.5-flash-lite)
Gemini 2.0 Flash
(gemini-2.0-flash-001)
Gemini 2.0 Flash-Lite
(gemini-2.0-flash-lite-001)
Gemini Embeddings
(gemini-embedding-001)
Embeddings for Text
Embeddings for Multimodal
Imagen
(imagegeneration@002)
Imagen 2
(imagegeneration@005)
Imagen 2
(imagegeneration@006)
Imagen 3
(imagen-3.0-generate-001)
Imagen 3 Fast
(imagen-3.0-fast-generate-001)
Imagen 3 Editing and Customization
(imagen-3.0-capability-001)
Imagen 3
(imagen-3.0-generate-002)
Imagen 4
(imagen-4.0-generate-001)
Imagen 4
(imagen-4.0-fast-generate-001)
Imagen 4 Ultra Generate experimental
(imagen-4.0-ultra-generate-001)
Chirp 3: Transcription (chirp_3)
Chirp 2: Transcription (chirp_2)
Gemini 2.5 Flash TTS
(gemini-2.5-flash-tts)
Gemini 2.5 Flash Lite Preview TTS
(gemini-2.5-flash-lite-preview-tts)
Gemini 2.5 Pro TTS
(gemini-2.5-pro-tts)
Chirp 3: HD Voices
Chirp 3: Instant Custom Voice

South America

São Paulo, Brazil (southamerica-east1)
Gemini 3 Pro
(gemini-3-pro-preview)
Gemini 3 Pro Image
(gemini-3-pro-image-preview)
Gemini 2.5 Flash with Gemini Live API native audio
(gemini-live-2.5-flash)
Gemini 2.5 Flash with Gemini Live API native audio
(gemini-live-2.5-flash-native-audio)
Gemini 2.5 Flash Image
(gemini-2.5-flash-image)
Gemini 2.5 Flash
(gemini-2.5-flash)
Gemini 2.5 Pro
(gemini-2.5-pro)
Gemini 2.5 Flash-Lite
(gemini-2.5-flash-lite)
Gemini 2.0 Flash
(gemini-2.0-flash-001)
Gemini 2.0 Flash-Lite
(gemini-2.0-flash-lite-001)
Gemini Embeddings
(gemini-embedding-001)
Embeddings for Text
Embeddings for Multimodal
Imagen
(imagegeneration@002)
Imagen 2
(imagegeneration@005)
Imagen 2
(imagegeneration@006)
Imagen 3
(imagen-3.0-generate-001)
Imagen 3 Fast
(imagen-3.0-fast-generate-001)
Imagen 3 Editing and Customization
(imagen-3.0-capability-001)
Imagen 3
(imagen-3.0-generate-002)
Imagen 4
(imagen-4.0-generate-001)
Imagen 4
(imagen-4.0-fast-generate-001)
Imagen 4 Ultra Generate experimental
(imagen-4.0-ultra-generate-001)
Chirp 3: Transcription (chirp_3)
Chirp 2: Transcription (chirp_2)
Gemini 2.5 Flash TTS
(gemini-2.5-flash-tts)
Gemini 2.5 Flash Lite Preview TTS
(gemini-2.5-flash-lite-preview-tts)
Gemini 2.5 Pro TTS
(gemini-2.5-pro-tts)
Chirp 3: HD Voices
Chirp 3: Instant Custom Voice

Europe

Netherlands (europe-west4); Paris, France (europe-west9); London, United Kingdom (europe-west2); Frankfurt, Germany (europe-west3); Belgium (europe-west1); Zürich, Switzerland (europe-west6); Madrid, Spain (europe-southwest1); Milan, Italy (europe-west8); Finland (europe-north1); Warsaw, Poland (europe-central2)
Gemini 3 Pro
(gemini-3-pro-preview)
Gemini 3 Pro Image
(gemini-3-pro-image-preview)
Gemini 2.5 Flash with Gemini Live API native audio
(gemini-live-2.5-flash)
Gemini 2.5 Flash with Gemini Live API native audio
(gemini-live-2.5-flash-native-audio)
Gemini 2.5 Flash Image
(gemini-2.5-flash-image)
Gemini 2.5 Flash
(gemini-2.5-flash)
Gemini 2.5 Pro
(gemini-2.5-pro)
Gemini 2.5 Flash-Lite
(gemini-2.5-flash-lite)
Gemini 2.0 Flash
(gemini-2.0-flash-001)
Gemini 2.0 Flash-Lite
(gemini-2.0-flash-lite-001)
Gemini Embeddings
(gemini-embedding-001)
Embeddings for Text
Embeddings for Multimodal
Imagen
(imagegeneration@002)
Imagen 2
(imagegeneration@005)
Imagen 2
(imagegeneration@006)
Imagen 3
(imagen-3.0-generate-001)
Imagen 3 Fast
(imagen-3.0-fast-generate-001)
Imagen 3 Editing and Customization
(imagen-3.0-capability-001)
Imagen 3
(imagen-3.0-generate-002)
Imagen 4
(imagen-4.0-generate-001)
Imagen 4
(imagen-4.0-fast-generate-001)
Imagen 4 Ultra Generate experimental
(imagen-4.0-ultra-generate-001)
Chirp 3: Transcription (chirp_3)
Chirp 2: Transcription (chirp_2)
Gemini 2.5 Flash TTS
(gemini-2.5-flash-tts)
Gemini 2.5 Flash Lite Preview TTS
(gemini-2.5-flash-lite-preview-tts)
Gemini 2.5 Pro TTS
(gemini-2.5-pro-tts)
Chirp 3: HD Voices
Chirp 3: Instant Custom Voice

Asia Pacific

Tokyo, Japan (asia-northeast1); Sydney, Australia (australia-southeast1); Singapore (asia-southeast1); Seoul, Korea (asia-northeast3); Taiwan (asia-east1); Hong Kong, China (asia-east2); Mumbai, India (asia-south1)
Gemini 3 Pro
(gemini-3-pro-preview)
Gemini 3 Pro Image
(gemini-3-pro-image-preview)
Gemini 2.5 Flash with Gemini Live API native audio
(gemini-live-2.5-flash)
Gemini 2.5 Flash with Gemini Live API native audio
(gemini-live-2.5-flash-native-audio)
Gemini 2.5 Flash Image
(gemini-2.5-flash-image)
Gemini 2.5 Flash
(gemini-2.5-flash)
Gemini 2.5 Pro
(gemini-2.5-pro)
Gemini 2.5 Flash-Lite
(gemini-2.5-flash-lite)
Gemini 2.0 Flash
(gemini-2.0-flash-001)
Gemini 2.0 Flash-Lite
(gemini-2.0-flash-lite-001)
Gemini Embeddings
(gemini-embedding-001)
Embeddings for Text
Embeddings for Multimodal
Imagen
(imagegeneration@002)
Imagen 2
(imagegeneration@005)
Imagen 2
(imagegeneration@006)
Imagen 3
(imagen-3.0-generate-001)
Imagen 3 Fast
(imagen-3.0-fast-generate-001)
Imagen 3 Editing and Customization
(imagen-3.0-capability-001)
Imagen 3
(imagen-3.0-generate-002)
Imagen 4
(imagen-4.0-generate-001)
Imagen 4
(imagen-4.0-fast-generate-001)
Imagen 4 Ultra Generate experimental
(imagen-4.0-ultra-generate-001)
Chirp 3: Transcription (chirp_3)
Chirp 2: Transcription (chirp_2)
Gemini 2.5 Flash TTS
(gemini-2.5-flash-tts)
Gemini 2.5 Flash Lite Preview TTS
(gemini-2.5-flash-lite-preview-tts)
Gemini 2.5 Pro TTS
(gemini-2.5-pro-tts)
Chirp 3: HD Voices
Chirp 3: Instant Custom Voice

Middle East

Dammam, Saudi Arabia (me-central2); Doha, Qatar (me-central1); Tel Aviv, Israel (me-west1)
Gemini 3 Pro
(gemini-3-pro-preview)
Gemini 3 Pro Image
(gemini-3-pro-image-preview)
Gemini 2.5 Flash with Gemini Live API native audio
(gemini-live-2.5-flash)
Gemini 2.5 Flash with Gemini Live API native audio
(gemini-live-2.5-flash-native-audio)
Gemini 2.5 Flash Image
(gemini-2.5-flash-image)
Gemini 2.5 Flash
(gemini-2.5-flash)
Gemini 2.5 Pro
(gemini-2.5-pro)
Gemini 2.5 Flash-Lite
(gemini-2.5-flash-lite)
Gemini 2.0 Flash
(gemini-2.0-flash-001)
Gemini 2.0 Flash-Lite
(gemini-2.0-flash-lite-001)
Gemini Embeddings
(gemini-embedding-001)
Embeddings for Text
Embeddings for Multimodal
Imagen
(imagegeneration@002)
Imagen 2
(imagegeneration@005)
Imagen 2
(imagegeneration@006)
Imagen 3
(imagen-3.0-generate-001)
Imagen 3 Fast
(imagen-3.0-fast-generate-001)
Imagen 3 Editing and Customization
(imagen-3.0-capability-001)
Imagen 3
(imagen-3.0-generate-002)
Imagen 4
(imagen-4.0-generate-001)
Imagen 4
(imagen-4.0-fast-generate-001)
Imagen 4 Ultra Generate experimental
(imagen-4.0-ultra-generate-001)
Chirp 3: Transcription (chirp_3)
Chirp 2: Transcription (chirp_2)
Gemini 2.5 Flash TTS
(gemini-2.5-flash-tts)
Gemini 2.5 Flash Lite Preview TTS
(gemini-2.5-flash-lite-preview-tts)
Gemini 2.5 Pro TTS
(gemini-2.5-pro-tts)
Chirp 3: HD Voices
Chirp 3: Instant Custom Voice

Global

Global (global)
Gemini 3 Pro
(gemini-3-pro-preview)
Gemini 3 Pro Image
(gemini-3-pro-image-preview)
Gemini 2.5 Flash with Gemini Live API native audio
(gemini-live-2.5-flash)
Gemini 2.5 Flash with Gemini Live API native audio
(gemini-live-2.5-flash-native-audio)
Gemini 2.5 Flash Image
(gemini-2.5-flash-image)
Gemini 2.5 Flash
(gemini-2.5-flash)
Gemini 2.5 Pro
(gemini-2.5-pro)
Gemini 2.5 Flash-Lite
(gemini-2.5-flash-lite)
Gemini 2.0 Flash
(gemini-2.0-flash-001)
Gemini 2.0 Flash-Lite
(gemini-2.0-flash-lite-001)
Gemini Embeddings
(gemini-embedding-001)
Embeddings for Text
Embeddings for Multimodal
Imagen
(imagegeneration@002)
Imagen 2
(imagegeneration@005)
Imagen 2
(imagegeneration@006)
Imagen 3
(imagen-3.0-generate-001)
Imagen 3 Fast
(imagen-3.0-fast-generate-001)
Imagen 3 Editing and Customization
(imagen-3.0-capability-001)
Imagen 3
(imagen-3.0-generate-002)
Imagen 4
(imagen-4.0-generate-001)
Imagen 4
(imagen-4.0-fast-generate-001)
Imagen 4 Ultra Generate experimental
(imagen-4.0-ultra-generate-001)
Chirp 3: Transcription (chirp_3)
Chirp 2: Transcription (chirp_2)
Gemini 2.5 Flash TTS
(gemini-2.5-flash-tts)
Gemini 2.5 Flash Lite Preview TTS
(gemini-2.5-flash-lite-preview-tts)
Gemini 2.5 Pro TTS
(gemini-2.5-pro-tts)
Chirp 3: HD Voices
Chirp 3: Instant Custom Voice

Google Cloud partner model endpoint locations

Google serves requests from the region that you specified. For some models, Google also offers a global endpoint to improve overall availability and reduce error rates. The global endpoint can have a separate set of quotas from the regional endpoint and doesn't support data residency requirements. For more information, see the "Regional and global endpoint" section in Vertex AI partner models for MaaS.

Partner model endpoints for Generative AI on Vertex AI are available in the following regions:

United States

Columbus, Ohio (us-east5); Dallas, Texas (us-south1); Iowa (us-central1); Las Vegas, Nevada (us-west4); Moncks Corner, South Carolina (us-east1); Northern Virginia (us-east4); Oregon (us-west1)
Anthropic's Claude Opus 4.5
Anthropic's Claude Sonnet 4.5
Anthropic's Claude Opus 4.1
Anthropic's Claude Haiku 4.5
Anthropic's Claude Opus 4
Anthropic's Claude Sonnet 4
Anthropic's Claude 3.7 Sonnet (deprecated)
Anthropic's Claude 3.5 Haiku
Anthropic's Claude 3 Haiku
Mistral Medium 3
Mistral OCR (25.05)
Mistral Small 3.1 (25.03)
Mistral Large (24.07)
Codestral 2
Codestral (24.05)

Europe

Netherlands (europe-west4); Belgium (europe-west1)
Anthropic's Claude Opus 4.5
Anthropic's Claude Sonnet 4.5
Anthropic's Claude Opus 4.1
Anthropic's Claude Haiku 4.5
Anthropic's Claude Opus 4
Anthropic's Claude Sonnet 4
Anthropic's Claude 3.7 Sonnet (deprecated)
Anthropic's Claude 3.5 Haiku
Anthropic's Claude 3 Haiku
Mistral Medium 3
Mistral OCR (25.05)
Mistral Small 3.1 (25.03)
Mistral Large (24.07)
Codestral 2
Codestral (24.05)

Asia Pacific

Singapore (asia-southeast1); Taiwan (asia-east1)
Anthropic's Claude Opus 4.5
Anthropic's Claude Sonnet 4.5
Anthropic's Claude Opus 4.1
Anthropic's Claude Haiku 4.5
Anthropic's Claude Opus 4
Anthropic's Claude Sonnet 4
Anthropic's Claude 3.7 Sonnet (deprecated)
Anthropic's Claude 3.5 Haiku
Anthropic's Claude 3 Haiku
Mistral Medium 3
Mistral OCR (25.05)
Mistral Small 3.1 (25.03)
Mistral Large (24.07)
Codestral 2
Codestral (24.05)

Global

Global (global)
Anthropic's Claude Opus 4.5
Anthropic's Claude Sonnet 4.5
Anthropic's Claude Opus 4.1
Anthropic's Claude Haiku 4.5
Anthropic's Claude Opus 4
Anthropic's Claude Sonnet 4
Anthropic's Claude 3.7 Sonnet (deprecated)
Anthropic's Claude 3.5 Haiku
Anthropic's Claude 3 Haiku
Mistral Medium 3
Mistral OCR (25.05)
Mistral Small 3.1 (25.03)
Mistral Large (24.07)
Codestral 2
Codestral (24.05)

Google Cloud open model endpoint locations

Google serves requests from the region that you specified. For some models, Google also offers a global endpoint to improve overall availability and reduce error rates. The global endpoint can have a separate set of quotas from the regional endpoint and doesn't support data residency requirements. For more information, see the "Regional and global endpoint" section in Vertex AI open models for MaaS.

Open model endpoints for Generative AI on Vertex AI are available in the following regions:

United States

Columbus, Ohio (us-east5); Dallas, Texas (us-south1); Iowa (us-central1); Las Vegas, Nevada (us-west4); Moncks Corner, South Carolina (us-east1); Northern Virginia (us-east4); Oregon (us-west1)
DeepSeek R1 (0528)
DeepSeek-OCR
DeepSeek-V3.1
gpt-oss 120B
gpt-oss 20B
Kimi K2 Thinking
Llama 3.1 8B (Preview)
Llama 3.1 70B (Preview)
Llama 3.1 405B
Llama 3.2 90B (Preview)
Llama 3.3 70B (Preview)
Llama 4 Maverick 17B-128E (Preview)
Llama 4 Scout 17B-16E (Preview)
MiniMax M2
Multilingual E5 Large
Multilingual E5 Small
Qwen3 235B
Qwen3 Coder
Qwen3-Next-80B Instruct
Qwen3-Next-80B Thinking

Europe

Netherlands (europe-west4); Belgium (europe-west1)
DeepSeek R1 (0528)
DeepSeek-OCR
DeepSeek-V3.1
gpt-oss 120B
gpt-oss 20B
Kimi K2 Thinking
Llama 3.1 8B (Preview)
Llama 3.1 70B (Preview)
Llama 3.1 405B
Llama 3.2 90B (Preview)
Llama 3.3 70B (Preview)
Llama 4 Maverick 17B-128E (Preview)
Llama 4 Scout 17B-16E (Preview)
MiniMax M2
Multilingual E5 Large
Multilingual E5 Small
Qwen3 235B
Qwen3 Coder
Qwen3-Next-80B Instruct
Qwen3-Next-80B Thinking

Asia Pacific

Singapore (asia-southeast1); Taiwan (asia-east1)
DeepSeek R1 (0528)
DeepSeek-OCR
DeepSeek-V3.1
gpt-oss 120B
gpt-oss 20B
Kimi K2 Thinking
Llama 3.1 8B (Preview)
Llama 3.1 70B (Preview)
Llama 3.1 405B
Llama 3.2 90B (Preview)
Llama 3.3 70B (Preview)
Llama 4 Maverick 17B-128E (Preview)
Llama 4 Scout 17B-16E (Preview)
MiniMax M2
Multilingual E5 Large
Multilingual E5 Small
Qwen3 235B
Qwen3 Coder
Qwen3-Next-80B Instruct
Qwen3-Next-80B Thinking

Global

Global (global)
DeepSeek R1 (0528)
DeepSeek-OCR
DeepSeek-V3.1
gpt-oss 120B
gpt-oss 20B
Kimi K2 Thinking
Llama 3.1 8B (Preview)
Llama 3.1 70B (Preview)
Llama 3.1 405B
Llama 3.2 90B (Preview)
Llama 3.3 70B (Preview)
Llama 4 Maverick 17B-128E (Preview)
Llama 4 Scout 17B-16E (Preview)
MiniMax M2
Multilingual E5 Large
Multilingual E5 Small
Qwen3 235B
Qwen3 Coder
Qwen3-Next-80B Instruct
Qwen3-Next-80B Thinking

Last updated 2025-12-17 UTC.