Create a context cache

You must create a context cache before you can use it. The context cache you create contains a large amount of data that you can use in multiple requests to a Gemini model. The cached content is stored in the region where you make the request to create the cache.

Cached content can be any of the MIME types supported by Gemini multimodal models. For example, you can cache a large amount of text, audio, or video. You can specify more than one file to cache. For more information, see the media requirements for each supported modality.

You specify the content to cache using a blob, text, or a path to a file that's stored in a Cloud Storage bucket. If the size of the content you're caching is greater than 10 MB, then you must specify it using the URI of a file that's stored in a Cloud Storage bucket. For instructions on how to create a Cloud Storage bucket to host your file, see Create buckets.
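For example, the following minimal sketch (using the Google Gen AI SDK for Python shown in full later on this page; the bucket and file names are placeholders) caches a short text part together with a larger file referenced by its Cloud Storage URI:

from google import genai
from google.genai.types import Content, CreateCachedContentConfig, Part

# Assumes the Vertex AI environment variables from the examples below are set.
client = genai.Client()

contents = [
    Content(
        role="user",
        parts=[
            # Smaller content can be passed inline as text.
            Part.from_text(text="A short reference note to cache alongside the file."),
            # Content larger than 10 MB must be referenced by a Cloud Storage URI.
            Part.from_uri(
                file_uri="gs://your-bucket/large-reference.pdf",  # placeholder
                mime_type="application/pdf",
            ),
        ],
    )
]

content_cache = client.caches.create(
    model="gemini-2.5-flash",
    config=CreateCachedContentConfig(
        contents=contents,
        display_name="mixed-content-cache",
    ),
)
print(content_cache.name)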

Cached content has a finite lifespan. The default expiration time of a context cache is 60 minutes after it's created. If you want a different expiration time, you can specify one using the ttl or the expire_time property when you create a context cache. You can also update the expiration time for an unexpired context cache. For information about how to specify ttl and expire_time, see Update the expiration time.
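For example, the following minimal sketch (assuming the Python Gen AI SDK used in the examples below; the values and the contents variable are illustrative) shows the two ways to set the lifetime when you create a cache:

from datetime import datetime, timedelta, timezone

from google.genai.types import CreateCachedContentConfig

# Option 1: a relative lifetime, expressed as a duration in seconds.
config_with_ttl = CreateCachedContentConfig(
    contents=contents,        # cache contents defined as in the examples below
    display_name="ttl-cache",
    ttl="3600s",              # expires one hour after creation
)

# Option 2: an absolute expiration timestamp.
config_with_expire_time = CreateCachedContentConfig(
    contents=contents,
    display_name="expire-time-cache",
    expire_time=datetime.now(timezone.utc) + timedelta(hours=12),
)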

After a context cache expires, it's no longer available. If you want to reference the content in an expired context cache in future prompt requests, then you need to recreate the context cache.

Location support

Context caching isn't supported in the Sydney, Australia (australia-southeast1) region.

Context caching supports the global endpoint.

Encryption key support

Context caching supports Customer-Managed Encryption Keys (CMEKs), allowing you to control the encryption of your cached data and protect your sensitive information with encryption keys that you manage and own. This provides an additional layer of security and compliance.

Refer to the example for more details.

CMEK is not supported when using the global endpoint.

Access Transparency support

Context caching supports Access Transparency.

Create context cache example

The following examples show how to create a context cache.

Python

Install

pip install --upgrade google-genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True

from google import genai
from google.genai.types import Content, CreateCachedContentConfig, HttpOptions, Part

client = genai.Client(http_options=HttpOptions(api_version="v1"))

system_instruction = """
You are an expert researcher. You always stick to the facts in the sources provided, and never make up new facts.
Now look at these research papers, and answer the following questions.
"""

contents = [
    Content(
        role="user",
        parts=[
            Part.from_uri(
                file_uri="gs://cloud-samples-data/generative-ai/pdf/2312.11805v3.pdf",
                mime_type="application/pdf",
            ),
            Part.from_uri(
                file_uri="gs://cloud-samples-data/generative-ai/pdf/2403.05530.pdf",
                mime_type="application/pdf",
            ),
        ],
    )
]

content_cache = client.caches.create(
    model="gemini-2.5-flash",
    config=CreateCachedContentConfig(
        contents=contents,
        system_instruction=system_instruction,
        # (Optional) For enhanced security, the content cache can be encrypted using a Cloud KMS key
        # kms_key_name = "projects/.../locations/.../keyRings/.../cryptoKeys/..."
        display_name="example-cache",
        ttl="86400s",
    ),
)
print(content_cache.name)
print(content_cache.usage_metadata)
# Example response:
#   projects/111111111111/locations/.../cachedContents/1111111111111111111
#   CachedContentUsageMetadata(audio_duration_seconds=None, image_count=167,
#       text_count=153, total_token_count=43130, video_duration_seconds=None)
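To use the cache afterwards, later requests reference it by its resource name. The following sketch (an illustrative continuation of the example above, not part of the original sample; the prompt text is a placeholder) shows how the cache could be referenced in a generate_content call:

from google.genai.types import GenerateContentConfig

response = client.models.generate_content(
    model="gemini-2.5-flash",  # must match the model the cache was created with
    contents="Summarize the key findings of the two papers.",
    config=GenerateContentConfig(cached_content=content_cache.name),
)
print(response.text)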

Go

Learn how to install or update the Gen AI SDK for Go.

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True

import("context""encoding/json""fmt""io""time"genai"google.golang.org/genai")//createContentCacheshowshowtocreateacontentcachewithanexpirationparameter.funccreateContentCache(wio.Writer)(string,error){ctx:=context.Background()client,err:=genai.NewClient(ctx, &genai.ClientConfig{HTTPOptions:genai.HTTPOptions{APIVersion:"v1"},})iferr!=nil{return"",fmt.Errorf("failed to create genai client: %w",err)}modelName:="gemini-2.5-flash"systemInstruction:="You are an expert researcher. You always stick to the facts "+"in the sources provided, and never make up new facts. "+"Now look at these research papers, and answer the following questions."cacheContents:=[]*genai.Content{{Parts:[]*genai.Part{{FileData: &genai.FileData{FileURI:"gs://cloud-samples-data/generative-ai/pdf/2312.11805v3.pdf",MIMEType:"application/pdf",}},{FileData: &genai.FileData{FileURI:"gs://cloud-samples-data/generative-ai/pdf/2403.05530.pdf",MIMEType:"application/pdf",}},},Role:"user",},}config:= &genai.CreateCachedContentConfig{Contents:cacheContents,SystemInstruction: &genai.Content{Parts:[]*genai.Part{{Text:systemInstruction},},},DisplayName:"example-cache",TTL:time.Duration(time.Duration.Seconds(86400)),}res,err:=client.Caches.Create(ctx,modelName,config)iferr!=nil{return"",fmt.Errorf("failed to create content cache: %w",err)}cachedContent,err:=json.MarshalIndent(res,"","  ")iferr!=nil{return"",fmt.Errorf("failed to marshal cache info: %w",err)}//Seethedocumentation:https://pkg.go.dev/google.golang.org/genai#CachedContentfmt.Fprintln(w,string(cachedContent))//Exampleresponse://{//"name":"projects/111111111111/locations/us-central1/cachedContents/1111111111111111111",//"displayName":"example-cache",//"model":"projects/111111111111/locations/us-central1/publishers/google/models/gemini-2.5-flash",//"createTime":"2025-02-18T15:05:08.29468Z",//"updateTime":"2025-02-18T15:05:08.29468Z",//"expireTime":"2025-02-19T15:05:08.280828Z",//"usageMetadata":{//"imageCount":167,//"textCount":153,//"totalTokenCount":43125//}//}returnres.Name,nil}

Java

Learn how to install or update the Gen AI SDK for Java.

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True

import com.google.genai.Client;
import com.google.genai.types.CachedContent;
import com.google.genai.types.Content;
import com.google.genai.types.CreateCachedContentConfig;
import com.google.genai.types.HttpOptions;
import com.google.genai.types.Part;
import java.time.Duration;
import java.util.Optional;

public class ContentCacheCreateWithTextGcsPdf {

  public static void main(String[] args) {
    // TODO(developer): Replace these variables before running the sample.
    String modelId = "gemini-2.5-flash";
    contentCacheCreateWithTextGcsPdf(modelId);
  }

  // Creates a cached content using text and GCS PDF files
  public static Optional<String> contentCacheCreateWithTextGcsPdf(String modelId) {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (Client client =
        Client.builder()
            .location("global")
            .vertexAI(true)
            .httpOptions(HttpOptions.builder().apiVersion("v1").build())
            .build()) {

      // Set the system instruction
      Content systemInstruction =
          Content.fromParts(
              Part.fromText(
                  "You are an expert researcher. You always stick to the facts"
                      + " in the sources provided, and never make up new facts.\n"
                      + "Now look at these research papers, and answer the following questions."));

      // Set PDF files
      Content contents =
          Content.fromParts(
              Part.fromUri(
                  "gs://cloud-samples-data/generative-ai/pdf/2312.11805v3.pdf",
                  "application/pdf"),
              Part.fromUri(
                  "gs://cloud-samples-data/generative-ai/pdf/2403.05530.pdf",
                  "application/pdf"));

      // Configuration for cached content using PDF files and text
      CreateCachedContentConfig config =
          CreateCachedContentConfig.builder()
              .systemInstruction(systemInstruction)
              .contents(contents)
              .displayName("example-cache")
              .ttl(Duration.ofSeconds(86400))
              .build();

      CachedContent cachedContent = client.caches.create(modelId, config);

      cachedContent.name().ifPresent(System.out::println);
      cachedContent.usageMetadata().ifPresent(System.out::println);
      // Example response:
      //   projects/111111111111/locations/global/cachedContents/1111111111111111111
      //   CachedContentUsageMetadata{audioDurationSeconds=Optional.empty, imageCount=Optional[167],
      //     textCount=Optional[153], totalTokenCount=Optional[43125],
      //     videoDurationSeconds=Optional.empty}
      return cachedContent.name();
    }
  }
}

Node.js

Install

npm install @google/genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True

const {GoogleGenAI} = require('@google/genai');

const GOOGLE_CLOUD_PROJECT = process.env.GOOGLE_CLOUD_PROJECT;
const GOOGLE_CLOUD_LOCATION = process.env.GOOGLE_CLOUD_LOCATION || 'global';

async function generateContentCache(
  projectId = GOOGLE_CLOUD_PROJECT,
  location = GOOGLE_CLOUD_LOCATION
) {
  const client = new GoogleGenAI({
    vertexai: true,
    project: projectId,
    location: location,
    httpOptions: {
      apiVersion: 'v1',
    },
  });

  const systemInstruction = `
  You are an expert researcher. You always stick to the facts in the sources provided, and never make up new facts.
  Now look at these research papers, and answer the following questions.
  `;

  const contents = [
    {
      role: 'user',
      parts: [
        {
          fileData: {
            fileUri: 'gs://cloud-samples-data/generative-ai/pdf/2312.11805v3.pdf',
            mimeType: 'application/pdf',
          },
        },
        {
          fileData: {
            fileUri: 'gs://cloud-samples-data/generative-ai/pdf/2403.05530.pdf',
            mimeType: 'application/pdf',
          },
        },
      ],
    },
  ];

  const contentCache = await client.caches.create({
    model: 'gemini-2.5-flash',
    config: {
      contents: contents,
      systemInstruction: systemInstruction,
      displayName: 'example-cache',
      ttl: '86400s',
    },
  });

  console.log(contentCache);
  console.log(contentCache.name);
  // Example response:
  //   projects/111111111111/locations/us-central1/cachedContents/1111111111111111111
  //   CachedContentUsageMetadata(audio_duration_seconds=None, image_count=167,
  //     text_count=153, total_token_count=43130, video_duration_seconds=None)
  return contentCache.name;
}

REST

You can use REST to create a context cache by using the Vertex AI API to send a POST request to the publisher model endpoint. The following example shows how to create a context cache using a file stored in a Cloud Storage bucket.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request and where the cached content is stored. For a list of supported regions, see Available regions.
  • CACHE_DISPLAY_NAME: A meaningful display name to describe and to help you identify each context cache.
  • MIME_TYPE: The MIME type of the content to cache.
  • CONTENT_TO_CACHE_URI: The Cloud Storage URI of the content to cache.
  • MODEL_ID: The model to use for caching.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents

Request JSON body:

{  "model": "projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID",  "displayName": "CACHE_DISPLAY_NAME",  "contents": [{    "role": "user",      "parts": [{        "fileData": {          "mimeType": "MIME_TYPE",          "fileUri": "CONTENT_TO_CACHE_URI"        }      }]  },  {    "role": "model",      "parts": [{        "text": "This is sample text to demonstrate explicit caching."      }]  }]}

To send your request, choose one of these options:

curl

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents"

PowerShell

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

Response

{  "name": "projects/PROJECT_NUMBER/locations/us-central1/cachedContents/CACHE_ID",  "model": "projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-2.0-flash-001",  "createTime": "2024-06-04T01:11:50.808236Z",  "updateTime": "2024-06-04T01:11:50.808236Z",  "expireTime": "2024-06-04T02:11:50.794542Z"}

Example curl command

LOCATION="us-central1"MODEL_ID="gemini-2.0-flash-001"PROJECT_ID="test-project"MIME_TYPE="video/mp4"CACHED_CONTENT_URI="gs://path-to-bucket/video-file-name.mp4"curl-XPOST\-H"Authorization: Bearer$(gcloudauthprint-access-token)"\-H"Content-Type: application/json"\https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/cachedContents-d\'{  "model":"projects/${PROJECT_ID}/locations/${LOCATION}/publishers/google/models/${MODEL_ID}",  "contents": [    {      "role": "user",      "parts": [        {          "fileData": {            "mimeType": "${MIME_TYPE}",            "fileUri": "${CACHED_CONTENT_URI}"          }        }      ]    }  ]}'

Create a context cache with CMEK

To implement context caching with CMEKs, create a CMEK by following the instructions and make sure the Vertex AI per-product, per-project service account (P4SA) has the necessary Cloud KMS CryptoKey Encrypter/Decrypter permissions on the key. This lets you securely create and manage cached content, as well as make other calls like {List,Update,Delete,Get}CachedContent(s), without repeatedly specifying a KMS key.

REST

You can use REST to create a context cache by using the Vertex AI API to send a POST request to the publisher model endpoint. The following example shows how to create a context cache using a file stored in a Cloud Storage bucket.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: Your project ID.
  • LOCATION: The region to process the request and where the cached content is stored. For a list of supported regions, see Available regions.
  • MODEL_ID: gemini-2.0-flash-001.
  • CACHE_DISPLAY_NAME: A meaningful display name to describe and to help you identify each context cache.
  • MIME_TYPE: The MIME type of the content to cache.
  • CACHED_CONTENT_URI: The Cloud Storage URI of the content to cache.
  • KMS_KEY_NAME: The Cloud KMS key name.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents

Request JSON body:

{  "model": "projects/PROJECT_ID/locations/LOCATION/publishers/google/models/gemini-2.0-flash-001",  "displayName": "CACHE_DISPLAY_NAME",  "contents": [{    "role": "user",      "parts": [{        "fileData": {          "mimeType": "MIME_TYPE",          "fileUri": "CONTENT_TO_CACHE_URI"        }      }]}],    "encryptionSpec": {      "kmsKeyName": "KMS_KEY_NAME"    }}

To send your request, choose one of these options:

curl

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents"

PowerShell

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command:

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents" | Select-Object -Expand Content

You should receive a JSON response similar to the following:

Response

{  "name": "projects/PROJECT_NUMBER/locations/us-central1/cachedContents/CACHE_ID",  "model": "projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-2.0-flash-001",  "createTime": "2024-06-04T01:11:50.808236Z",  "updateTime": "2024-06-04T01:11:50.808236Z",  "expireTime": "2024-06-04T02:11:50.794542Z"}

Example curl command

LOCATION="us-central1"MODEL_ID="gemini-2.0-flash-001"PROJECT_ID="test-project"MIME_TYPE="video/mp4"CACHED_CONTENT_URI="gs://path-to-bucket/video-file-name.mp4"KMS_KEY_NAME="projects/${PROJECT_ID}/locations/{LOCATION}/keyRings/your-key-ring/cryptoKeys/your-key"curl-XPOST\-H"Authorization: Bearer$(gcloudauthprint-access-token)"\-H"Content-Type: application/json"\https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/cachedContents-d\'{"model": "projects/{PROJECT_ID}}/locations/{LOCATION}/publishers/google/models/{MODEL_ID}",  "contents" : [    {      "role": "user",      "parts": [        {          "file_data": {            "mime_type":"{MIME_TYPE}",            "file_uri":"{CACHED_CONTENT_URI}"          }        }      ]    }  ],  "encryption_spec" :  {    "kms_key_name":"{KMS_KEY_NAME}"  }}'

GenAI SDK for Python

Install

pip install --upgrade google-genai

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True
import os

from google import genai
from google.genai.types import Content, CreateCachedContentConfig, HttpOptions, Part

os.environ['GOOGLE_CLOUD_PROJECT'] = 'vertexsdk'
os.environ['GOOGLE_CLOUD_LOCATION'] = 'us-central1'
os.environ['GOOGLE_GENAI_USE_VERTEXAI'] = 'True'

client = genai.Client(http_options=HttpOptions(api_version="v1"))

system_instruction = """
You are an expert researcher. You always stick to the facts in the sources provided, and never make up new facts.
Now look at these research papers, and answer the following questions.
"""

contents = [
    Content(
        role="user",
        parts=[
            Part.from_uri(
                file_uri="gs://cloud-samples-data/generative-ai/pdf/2312.11805v3.pdf",
                mime_type="application/pdf",
            ),
            Part.from_uri(
                file_uri="gs://cloud-samples-data/generative-ai/pdf/2403.05530.pdf",
                mime_type="application/pdf",
            ),
        ],
    )
]

content_cache = client.caches.create(
    model="gemini-2.0-flash-001",
    config=CreateCachedContentConfig(
        contents=contents,
        system_instruction=system_instruction,
        display_name="example-cache",
        kms_key_name="projects/vertexsdk/locations/us-central1/keyRings/your-project/cryptoKeys/your-key",
        ttl="86400s",
    ),
)
print(content_cache.name)
print(content_cache.usage_metadata)

GenAI SDK for Go

Learn how to install or update the Gen AI SDK for Go.

To learn more, see the SDK reference documentation.

Set environment variables to use the Gen AI SDK with Vertex AI:

import("context""encoding/json""fmt""io"genai"google.golang.org/genai")// createContentCache shows how to create a content cache with an expiration parameter.funccreateContentCache(wio.Writer)(string,error){ctx:=context.Background()client,err:=genai.NewClient(ctx,&genai.ClientConfig{HTTPOptions:genai.HTTPOptions{APIVersion:"v1beta1"},})iferr!=nil{return"",fmt.Errorf("failed to create genai client: %w",err)}modelName:="gemini-2.0-flash-001"systemInstruction:="You are an expert researcher. You always stick to the facts "+"in the sources provided, and never make up new facts. "+"Now look at these research papers, and answer the following questions."cacheContents:=[]*genai.Content{{Parts:[]*genai.Part{{FileData:&genai.FileData{FileURI:"gs://cloud-samples-data/generative-ai/pdf/2312.11805v3.pdf",MIMEType:"application/pdf",}},{FileData:&genai.FileData{FileURI:"gs://cloud-samples-data/generative-ai/pdf/2403.05530.pdf",MIMEType:"application/pdf",}},},Role:"user",},}config:=&genai.CreateCachedContentConfig{Contents:cacheContents,SystemInstruction:&genai.Content{Parts:[]*genai.Part{{Text:systemInstruction},},},DisplayName:"example-cache",KmsKeyName:"projects/vertexsdk/locations/us-central1/keyRings/your-project/cryptoKeys/your-key",TTL:"86400s",}res,err:=client.Caches.Create(ctx,modelName,config)iferr!=nil{return"",fmt.Errorf("failed to create content cache: %w",err)}cachedContent,err:=json.MarshalIndent(res,"","  ")iferr!=nil{return"",fmt.Errorf("failed to marshal cache info: %w",err)}// See the documentation: https://pkg.go.dev/google.golang.org/genai#CachedContentfmt.Fprintln(w,string(cachedContent))returnres.Name,nil}
