Create a context cache
You must create a context cache before you can use it. The context cache you create contains a large amount of data that you can use in multiple requests to a Gemini model. The cached content is stored in the region where you make the request to create the cache.
Cached content can be any of the MIME types supported by Gemini multimodal models. For example, you can cache a large amount of text, audio, or video. You can specify more than one file to cache. For more information, see the media requirements.
You specify the content to cache using a blob, text, or a path to a file that's stored in a Cloud Storage bucket. If the size of the content you're caching is greater than 10 MB, then you must specify it using the URI of a file that's stored in a Cloud Storage bucket. For instructions on how to create a Cloud Storage bucket to host your file, see Create buckets.
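The size rule above can be sketched as a small pre-flight check. This is only an illustration: the helper names are hypothetical, and reading "10 MB" as 10 * 1024 * 1024 bytes is an assumption, because the text does not say whether decimal or binary megabytes are meant.

```python
from typing import Optional

# Assumption: "10 MB" is interpreted as 10 * 1024 * 1024 bytes.
INLINE_LIMIT_BYTES = 10 * 1024 * 1024


def must_use_cloud_storage(size_bytes: int) -> bool:
    """Return True when the content is larger than the inline limit."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    return size_bytes > INLINE_LIMIT_BYTES


def choose_source(size_bytes: int, gcs_uri: Optional[str] = None) -> str:
    """Pick how to reference the content: 'cloud_storage' or 'inline'.

    Hypothetical helper: it only encodes the size rule, it does not
    upload anything or call the API.
    """
    if gcs_uri is not None:
        if not gcs_uri.startswith("gs://"):
            raise ValueError("Cloud Storage URIs must start with gs://")
        return "cloud_storage"
    if must_use_cloud_storage(size_bytes):
        raise ValueError(
            "content is over the inline limit; provide a Cloud Storage URI"
        )
    return "inline"


if __name__ == "__main__":
    print(choose_source(1024))
    print(choose_source(20 * 1024 * 1024, "gs://path-to-bucket/video-file-name.mp4"))
```

Running the check before building the request avoids sending a large inline payload that would be rejected.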
Cached content has a finite lifespan. The default expiration time of a context cache is 60 minutes after it's created. If you want a different expiration time, you can specify it using the ttl or the expire_time property when you create a context cache. You can also update the expiration time for an unexpired context cache. For information about how to specify ttl and expire_time, see Update the expiration time.
After a context cache expires, it's no longer available. If you want to reference the content in an expired context cache in future prompt requests, then you need to recreate the context cache.
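As a rough illustration of how ttl and expire_time relate, the following sketch formats a duration in the "86400s" style used by the samples on this page and derives the corresponding expiration timestamp. The function names are hypothetical, and the assumption that the service computes the expiration as creation time plus ttl is only approximate.

```python
from datetime import datetime, timedelta, timezone

# Default lifetime stated above: 60 minutes after creation.
DEFAULT_TTL = timedelta(minutes=60)


def ttl_to_string(ttl: timedelta) -> str:
    """Format a duration as whole seconds with an 's' suffix, e.g. '86400s'."""
    seconds = int(ttl.total_seconds())
    if seconds <= 0:
        raise ValueError("ttl must be positive")
    return f"{seconds}s"


def expire_time_for(created: datetime, ttl: timedelta = DEFAULT_TTL) -> datetime:
    """Assumed relationship: the cache expires ttl after it is created."""
    return created + ttl


def is_expired(expires: datetime, now: datetime) -> bool:
    """An expired cache is no longer available and must be recreated."""
    return now >= expires


if __name__ == "__main__":
    created = datetime(2024, 6, 4, 1, 11, 50, tzinfo=timezone.utc)
    print(ttl_to_string(timedelta(days=1)))      # 86400s
    print(expire_time_for(created).isoformat())  # 2024-06-04T02:11:50+00:00
```

Specify either a relative ttl or an absolute expire_time, whichever is easier to reason about for your workload.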
Location support
Context caching isn't supported in the Sydney, Australia (australia-southeast1) region.
Context caching supports the global endpoint.
Encryption key support
Context caching supports Customer-Managed Encryption Keys (CMEKs), allowing you to control the encryption of your cached data and protect your sensitive information with encryption keys that you manage and own. This provides an additional layer of security and compliance.
Refer to the example for more details.
CMEK is not supported when using the global endpoint.
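The location and encryption constraints above can be checked before a request is sent. The following is a minimal sketch with a hypothetical helper name; it encodes only the two rules stated on this page and does not reflect the full list of supported regions.

```python
from typing import Optional

# Region named on this page as not supporting context caching.
UNSUPPORTED_LOCATIONS = {"australia-southeast1"}
GLOBAL_LOCATION = "global"


def check_cache_request(location: str, kms_key_name: Optional[str] = None) -> None:
    """Raise ValueError if the location/CMEK combination is not allowed.

    Hypothetical pre-flight check; the service performs its own validation.
    """
    if location in UNSUPPORTED_LOCATIONS:
        raise ValueError(f"context caching isn't supported in {location}")
    if location == GLOBAL_LOCATION and kms_key_name:
        raise ValueError("CMEK is not supported when using the global endpoint")


if __name__ == "__main__":
    check_cache_request("us-central1")  # passes silently
    check_cache_request("global")       # passes: no CMEK requested
    try:
        check_cache_request(
            "global",
            "projects/test-project/locations/global/keyRings/your-key-ring/cryptoKeys/your-key",
        )
    except ValueError as err:
        print(err)
```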
Access Transparency support
Context caching supports Access Transparency.
Create context cache example
The following examples show how to create a context cache.
Python
Install
pip install --upgrade google-genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True
from google import genai
from google.genai.types import Content, CreateCachedContentConfig, HttpOptions, Part

client = genai.Client(http_options=HttpOptions(api_version="v1"))

system_instruction = """You are an expert researcher. You always stick to the facts in the sources provided, and never make up new facts.
Now look at these research papers, and answer the following questions."""

contents = [
    Content(
        role="user",
        parts=[
            Part.from_uri(
                file_uri="gs://cloud-samples-data/generative-ai/pdf/2312.11805v3.pdf",
                mime_type="application/pdf",
            ),
            Part.from_uri(
                file_uri="gs://cloud-samples-data/generative-ai/pdf/2403.05530.pdf",
                mime_type="application/pdf",
            ),
        ],
    )
]

content_cache = client.caches.create(
    model="gemini-2.5-flash",
    config=CreateCachedContentConfig(
        contents=contents,
        system_instruction=system_instruction,
        # (Optional) For enhanced security, the content cache can be encrypted using a Cloud KMS key
        # kms_key_name = "projects/.../locations/.../keyRings/.../cryptoKeys/..."
        display_name="example-cache",
        ttl="86400s",
    ),
)

print(content_cache.name)
print(content_cache.usage_metadata)
# Example response:
#   projects/111111111111/locations/.../cachedContents/1111111111111111111
#   CachedContentUsageMetadata(audio_duration_seconds=None, image_count=167,
#       text_count=153, total_token_count=43130, video_duration_seconds=None)

Go
Learn how to install or update the Gen AI SDK for Go.
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True
import (
	"context"
	"encoding/json"
	"fmt"
	"io"
	"time"

	genai "google.golang.org/genai"
)

// createContentCache shows how to create a content cache with an expiration parameter.
func createContentCache(w io.Writer) (string, error) {
	ctx := context.Background()

	client, err := genai.NewClient(ctx, &genai.ClientConfig{
		HTTPOptions: genai.HTTPOptions{APIVersion: "v1"},
	})
	if err != nil {
		return "", fmt.Errorf("failed to create genai client: %w", err)
	}

	modelName := "gemini-2.5-flash"

	systemInstruction := "You are an expert researcher. You always stick to the facts " +
		"in the sources provided, and never make up new facts. " +
		"Now look at these research papers, and answer the following questions."

	cacheContents := []*genai.Content{
		{
			Parts: []*genai.Part{
				{FileData: &genai.FileData{
					FileURI:  "gs://cloud-samples-data/generative-ai/pdf/2312.11805v3.pdf",
					MIMEType: "application/pdf",
				}},
				{FileData: &genai.FileData{
					FileURI:  "gs://cloud-samples-data/generative-ai/pdf/2403.05530.pdf",
					MIMEType: "application/pdf",
				}},
			},
			Role: "user",
		},
	}

	config := &genai.CreateCachedContentConfig{
		Contents: cacheContents,
		SystemInstruction: &genai.Content{
			Parts: []*genai.Part{
				{Text: systemInstruction},
			},
		},
		DisplayName: "example-cache",
		// 86400 seconds (24 hours). TTL is a time.Duration, so multiply by time.Second.
		TTL: 86400 * time.Second,
	}

	res, err := client.Caches.Create(ctx, modelName, config)
	if err != nil {
		return "", fmt.Errorf("failed to create content cache: %w", err)
	}

	cachedContent, err := json.MarshalIndent(res, "", "  ")
	if err != nil {
		return "", fmt.Errorf("failed to marshal cache info: %w", err)
	}

	// See the documentation: https://pkg.go.dev/google.golang.org/genai#CachedContent
	fmt.Fprintln(w, string(cachedContent))

	// Example response:
	// {
	//   "name": "projects/111111111111/locations/us-central1/cachedContents/1111111111111111111",
	//   "displayName": "example-cache",
	//   "model": "projects/111111111111/locations/us-central1/publishers/google/models/gemini-2.5-flash",
	//   "createTime": "2025-02-18T15:05:08.29468Z",
	//   "updateTime": "2025-02-18T15:05:08.29468Z",
	//   "expireTime": "2025-02-19T15:05:08.280828Z",
	//   "usageMetadata": {
	//     "imageCount": 167,
	//     "textCount": 153,
	//     "totalTokenCount": 43125
	//   }
	// }

	return res.Name, nil
}

Java
Learn how to install or update the Gen AI SDK for Java.
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True
import com.google.genai.Client;
import com.google.genai.types.CachedContent;
import com.google.genai.types.Content;
import com.google.genai.types.CreateCachedContentConfig;
import com.google.genai.types.HttpOptions;
import com.google.genai.types.Part;
import java.time.Duration;
import java.util.Optional;

public class ContentCacheCreateWithTextGcsPdf {

  public static void main(String[] args) {
    // TODO(developer): Replace these variables before running the sample.
    String modelId = "gemini-2.5-flash";
    contentCacheCreateWithTextGcsPdf(modelId);
  }

  // Creates cached content from text and PDF files stored in Cloud Storage.
  public static Optional<String> contentCacheCreateWithTextGcsPdf(String modelId) {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (Client client =
        Client.builder()
            .location("global")
            .vertexAI(true)
            .httpOptions(HttpOptions.builder().apiVersion("v1").build())
            .build()) {

      // Set the system instruction
      Content systemInstruction =
          Content.fromParts(
              Part.fromText(
                  "You are an expert researcher. You always stick to the facts"
                      + " in the sources provided, and never make up new facts.\n"
                      + "Now look at these research papers, and answer the following questions."));

      // Set the PDF files
      Content contents =
          Content.fromParts(
              Part.fromUri(
                  "gs://cloud-samples-data/generative-ai/pdf/2312.11805v3.pdf", "application/pdf"),
              Part.fromUri(
                  "gs://cloud-samples-data/generative-ai/pdf/2403.05530.pdf", "application/pdf"));

      // Configuration for cached content using the PDF files and text
      CreateCachedContentConfig config =
          CreateCachedContentConfig.builder()
              .systemInstruction(systemInstruction)
              .contents(contents)
              .displayName("example-cache")
              .ttl(Duration.ofSeconds(86400))
              .build();

      CachedContent cachedContent = client.caches.create(modelId, config);
      cachedContent.name().ifPresent(System.out::println);
      cachedContent.usageMetadata().ifPresent(System.out::println);
      // Example response:
      // projects/111111111111/locations/global/cachedContents/1111111111111111111
      // CachedContentUsageMetadata{audioDurationSeconds=Optional.empty, imageCount=Optional[167],
      // textCount=Optional[153], totalTokenCount=Optional[43125],
      // videoDurationSeconds=Optional.empty}
      return cachedContent.name();
    }
  }
}

Node.js
Install
npm install @google/genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True
const {GoogleGenAI} = require('@google/genai');

const GOOGLE_CLOUD_PROJECT = process.env.GOOGLE_CLOUD_PROJECT;
const GOOGLE_CLOUD_LOCATION = process.env.GOOGLE_CLOUD_LOCATION || 'global';

async function generateContentCache(
  projectId = GOOGLE_CLOUD_PROJECT,
  location = GOOGLE_CLOUD_LOCATION
) {
  const client = new GoogleGenAI({
    vertexai: true,
    project: projectId,
    location: location,
    httpOptions: {
      apiVersion: 'v1',
    },
  });

  const systemInstruction = `
  You are an expert researcher. You always stick to the facts in the sources provided, and never make up new facts.
  Now look at these research papers, and answer the following questions.
  `;

  const contents = [
    {
      role: 'user',
      parts: [
        {
          fileData: {
            fileUri: 'gs://cloud-samples-data/generative-ai/pdf/2312.11805v3.pdf',
            mimeType: 'application/pdf',
          },
        },
        {
          fileData: {
            fileUri: 'gs://cloud-samples-data/generative-ai/pdf/2403.05530.pdf',
            mimeType: 'application/pdf',
          },
        },
      ],
    },
  ];

  const contentCache = await client.caches.create({
    model: 'gemini-2.5-flash',
    config: {
      contents: contents,
      systemInstruction: systemInstruction,
      displayName: 'example-cache',
      ttl: '86400s',
    },
  });

  console.log(contentCache);
  console.log(contentCache.name);

  // Example response:
  //  projects/111111111111/locations/us-central1/cachedContents/1111111111111111111
  //  CachedContentUsageMetadata(audio_duration_seconds=None, image_count=167,
  //  text_count=153, total_token_count=43130, video_duration_seconds=None)

  return contentCache.name;
}

REST
You can use REST to create a context cache by using the Vertex AI API to send a POST request to the publisher model endpoint. The following example shows how to create a context cache using a file stored in a Cloud Storage bucket.
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request and where the cached content is stored. For a list of supported regions, see Available regions.
- CACHE_DISPLAY_NAME: A meaningful display name to describe and to help you identify each context cache.
- MIME_TYPE: The MIME type of the content to cache.
- CONTENT_TO_CACHE_URI: The Cloud Storage URI of the content to cache.
- MODEL_ID: The model to use for caching.
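As an illustration of how the placeholders above fit together, the following sketch assembles the request URL and a JSON body with the Python standard library. The function names are hypothetical, the URL follows the regional LOCATION-aiplatform.googleapis.com template used in this section (it does not handle the global endpoint), and nothing is sent over the network.

```python
import json


def cache_url(project_id: str, location: str) -> str:
    """Build the cachedContents URL from the regional template on this page."""
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/cachedContents"
    )


def cache_request_body(
    project_id: str,
    location: str,
    model_id: str,
    display_name: str,
    mime_type: str,
    content_uri: str,
) -> dict:
    """Build a request body that caches one Cloud Storage file."""
    return {
        "model": (
            f"projects/{project_id}/locations/{location}"
            f"/publishers/google/models/{model_id}"
        ),
        "displayName": display_name,
        "contents": [
            {
                "role": "user",
                "parts": [
                    {"fileData": {"mimeType": mime_type, "fileUri": content_uri}}
                ],
            }
        ],
    }


if __name__ == "__main__":
    # Illustrative values only; replace them with your own.
    body = cache_request_body(
        "test-project",
        "us-central1",
        "gemini-2.0-flash-001",
        "example-cache",
        "video/mp4",
        "gs://path-to-bucket/video-file-name.mp4",
    )
    print(cache_url("test-project", "us-central1"))
    print(json.dumps(body, indent=2))
```

The printed JSON can be saved as request.json and sent with curl or PowerShell.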
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents
Request JSON body:
{
  "model": "projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID",
  "displayName": "CACHE_DISPLAY_NAME",
  "contents": [{
    "role": "user",
    "parts": [{
      "fileData": {
        "mimeType": "MIME_TYPE",
        "fileUri": "CONTENT_TO_CACHE_URI"
      }
    }]
  },
  {
    "role": "model",
    "parts": [{
      "text": "This is sample text to demonstrate explicit caching."
    }]
  }]
}

To send your request, choose one of these options:
curl
Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents"
PowerShell
Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
Response
{
  "name": "projects/PROJECT_NUMBER/locations/us-central1/cachedContents/CACHE_ID",
  "model": "projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-2.0-flash-001",
  "createTime": "2024-06-04T01:11:50.808236Z",
  "updateTime": "2024-06-04T01:11:50.808236Z",
  "expireTime": "2024-06-04T02:11:50.794542Z"
}

Example curl command
LOCATION="us-central1"
MODEL_ID="gemini-2.0-flash-001"
PROJECT_ID="test-project"
MIME_TYPE="video/mp4"
CACHED_CONTENT_URI="gs://path-to-bucket/video-file-name.mp4"

# The body is passed through an unquoted heredoc so that the shell expands
# the ${...} variables; inside single quotes they would be sent literally.
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/cachedContents" \
  -d @- <<EOF
{
  "model": "projects/${PROJECT_ID}/locations/${LOCATION}/publishers/google/models/${MODEL_ID}",
  "contents": [
    {
      "role": "user",
      "parts": [
        {
          "fileData": {
            "mimeType": "${MIME_TYPE}",
            "fileUri": "${CACHED_CONTENT_URI}"
          }
        }
      ]
    }
  ]
}
EOF

Create a context cache with CMEK
To implement context caching with CMEKs, create a CMEK by following the instructions and make sure the Vertex AI per-product, per-project service account (P4SA) has the necessary Cloud KMS CryptoKey Encrypter/Decrypter permissions on the key. This lets you securely create and manage cached content as well as make other calls like {List,Update,Delete,Get}CachedContent(s) without repeatedly specifying a KMS key.
REST
You can use REST to create a context cache by using the Vertex AI API to send a POST request to the publisher model endpoint. The following example shows how to create a context cache using a file stored in a Cloud Storage bucket.
Before using any of the request data, make the following replacements:
- PROJECT_ID: Your project ID.
- LOCATION: The region to process the request and where the cached content is stored. For a list of supported regions, see Available regions.
- MODEL_ID: gemini-2.0-flash-001.
- CACHE_DISPLAY_NAME: A meaningful display name to describe and to help you identify each context cache.
- MIME_TYPE: The MIME type of the content to cache.
- CONTENT_TO_CACHE_URI: The Cloud Storage URI of the content to cache.
- KMS_KEY_NAME: The Cloud KMS key name.
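The KMS_KEY_NAME value follows the projects/.../locations/.../keyRings/.../cryptoKeys/... shape used in the samples on this page. The sketch below, with hypothetical helper names, builds such a name, checks its shape with a regular expression, and attaches it to a request body as encryptionSpec. The pattern is an illustrative check only and may be stricter or looser than what the service accepts.

```python
import re

# Shape of a Cloud KMS key resource name as it appears in the samples on this page.
KMS_KEY_PATTERN = re.compile(
    r"^projects/[^/]+/locations/[^/]+/keyRings/[^/]+/cryptoKeys/[^/]+$"
)


def kms_key_name(project_id: str, location: str, key_ring: str, key: str) -> str:
    """Assemble a key resource name from its parts."""
    return (
        f"projects/{project_id}/locations/{location}"
        f"/keyRings/{key_ring}/cryptoKeys/{key}"
    )


def with_encryption_spec(body: dict, key_name: str) -> dict:
    """Return a copy of the request body with an encryptionSpec added."""
    if not KMS_KEY_PATTERN.match(key_name):
        raise ValueError(f"unexpected KMS key name format: {key_name!r}")
    updated = dict(body)  # shallow copy so the caller's dict is not modified
    updated["encryptionSpec"] = {"kmsKeyName": key_name}
    return updated


if __name__ == "__main__":
    name = kms_key_name("test-project", "us-central1", "your-key-ring", "your-key")
    print(with_encryption_spec({"displayName": "example-cache"}, name))
```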
HTTP method and URL:
POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents
Request JSON body:
{
  "model": "projects/PROJECT_ID/locations/LOCATION/publishers/google/models/gemini-2.0-flash-001",
  "displayName": "CACHE_DISPLAY_NAME",
  "contents": [{
    "role": "user",
    "parts": [{
      "fileData": {
        "mimeType": "MIME_TYPE",
        "fileUri": "CONTENT_TO_CACHE_URI"
      }
    }]
  }],
  "encryptionSpec": {
    "kmsKeyName": "KMS_KEY_NAME"
  }
}

To send your request, choose one of these options:
curl
Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.
Save the request body in a file named request.json, and execute the following command:
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents"
PowerShell
Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.
Save the request body in a file named request.json, and execute the following command:
$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }
Invoke-WebRequest `
-Method POST `
-Headers $headers `
-ContentType: "application/json; charset=utf-8" `
-InFile request.json `
-Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/cachedContents" | Select-Object -Expand Content
You should receive a JSON response similar to the following:
Response
{
  "name": "projects/PROJECT_NUMBER/locations/us-central1/cachedContents/CACHE_ID",
  "model": "projects/PROJECT_ID/locations/us-central1/publishers/google/models/gemini-2.0-flash-001",
  "createTime": "2024-06-04T01:11:50.808236Z",
  "updateTime": "2024-06-04T01:11:50.808236Z",
  "expireTime": "2024-06-04T02:11:50.794542Z"
}

Example curl command
LOCATION="us-central1"
MODEL_ID="gemini-2.0-flash-001"
PROJECT_ID="test-project"
MIME_TYPE="video/mp4"
CACHED_CONTENT_URI="gs://path-to-bucket/video-file-name.mp4"
KMS_KEY_NAME="projects/${PROJECT_ID}/locations/${LOCATION}/keyRings/your-key-ring/cryptoKeys/your-key"

# The body is passed through an unquoted heredoc so that the shell expands
# the ${...} variables; inside single quotes they would be sent literally.
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/cachedContents" \
  -d @- <<EOF
{
  "model": "projects/${PROJECT_ID}/locations/${LOCATION}/publishers/google/models/${MODEL_ID}",
  "contents": [
    {
      "role": "user",
      "parts": [
        {
          "file_data": {
            "mime_type": "${MIME_TYPE}",
            "file_uri": "${CACHED_CONTENT_URI}"
          }
        }
      ]
    }
  ],
  "encryption_spec": {
    "kms_key_name": "${KMS_KEY_NAME}"
  }
}
EOF

GenAI SDK for Python
Install
pip install --upgrade google-genai
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
# Replace the `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` values
# with appropriate values for your project.
export GOOGLE_CLOUD_PROJECT=GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=us-central1
export GOOGLE_GENAI_USE_VERTEXAI=True

import os

from google import genai
from google.genai.types import Content, CreateCachedContentConfig, HttpOptions, Part

os.environ['GOOGLE_CLOUD_PROJECT'] = 'vertexsdk'
os.environ['GOOGLE_CLOUD_LOCATION'] = 'us-central1'
os.environ['GOOGLE_GENAI_USE_VERTEXAI'] = 'True'

client = genai.Client(http_options=HttpOptions(api_version="v1"))

system_instruction = """You are an expert researcher. You always stick to the facts in the sources provided, and never make up new facts.
Now look at these research papers, and answer the following questions."""

contents = [
    Content(
        role="user",
        parts=[
            Part.from_uri(
                file_uri="gs://cloud-samples-data/generative-ai/pdf/2312.11805v3.pdf",
                mime_type="application/pdf",
            ),
            Part.from_uri(
                file_uri="gs://cloud-samples-data/generative-ai/pdf/2403.05530.pdf",
                mime_type="application/pdf",
            ),
        ],
    )
]

content_cache = client.caches.create(
    model="gemini-2.0-flash-001",
    config=CreateCachedContentConfig(
        contents=contents,
        system_instruction=system_instruction,
        display_name="example-cache",
        kms_key_name="projects/vertexsdk/locations/us-central1/keyRings/your-project/cryptoKeys/your-key",
        ttl="86400s",
    ),
)

print(content_cache.name)
print(content_cache.usage_metadata)

GenAI SDK for Go
Learn how to install or update the Gen AI SDK for Go.
To learn more, see the SDK reference documentation.
Set environment variables to use the Gen AI SDK with Vertex AI:
import (
	"context"
	"encoding/json"
	"fmt"
	"io"
	"time"

	genai "google.golang.org/genai"
)

// createContentCache shows how to create a content cache with an expiration parameter.
func createContentCache(w io.Writer) (string, error) {
	ctx := context.Background()

	client, err := genai.NewClient(ctx, &genai.ClientConfig{
		HTTPOptions: genai.HTTPOptions{APIVersion: "v1beta1"},
	})
	if err != nil {
		return "", fmt.Errorf("failed to create genai client: %w", err)
	}

	modelName := "gemini-2.0-flash-001"

	systemInstruction := "You are an expert researcher. You always stick to the facts " +
		"in the sources provided, and never make up new facts. " +
		"Now look at these research papers, and answer the following questions."

	cacheContents := []*genai.Content{
		{
			Parts: []*genai.Part{
				{FileData: &genai.FileData{
					FileURI:  "gs://cloud-samples-data/generative-ai/pdf/2312.11805v3.pdf",
					MIMEType: "application/pdf",
				}},
				{FileData: &genai.FileData{
					FileURI:  "gs://cloud-samples-data/generative-ai/pdf/2403.05530.pdf",
					MIMEType: "application/pdf",
				}},
			},
			Role: "user",
		},
	}

	config := &genai.CreateCachedContentConfig{
		Contents: cacheContents,
		SystemInstruction: &genai.Content{
			Parts: []*genai.Part{
				{Text: systemInstruction},
			},
		},
		DisplayName: "example-cache",
		KmsKeyName:  "projects/vertexsdk/locations/us-central1/keyRings/your-project/cryptoKeys/your-key",
		// 86400 seconds (24 hours). TTL is a time.Duration, not a string.
		TTL: 86400 * time.Second,
	}

	res, err := client.Caches.Create(ctx, modelName, config)
	if err != nil {
		return "", fmt.Errorf("failed to create content cache: %w", err)
	}

	cachedContent, err := json.MarshalIndent(res, "", "  ")
	if err != nil {
		return "", fmt.Errorf("failed to marshal cache info: %w", err)
	}

	// See the documentation: https://pkg.go.dev/google.golang.org/genai#CachedContent
	fmt.Fprintln(w, string(cachedContent))

	return res.Name, nil
}

What's next
- Learn how to use a context cache.
- Learn how to update the expiration time of a context cache.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-12-15 UTC.