SemanticCacheLookup policy

This pageapplies toApigee andApigee hybrid.

View Apigee Edge documentation.

The SemanticCacheLookup policy is an advanced caching policy designed to optimize the performance of AI workloads, particularly those involving Large Language Models (LLMs).

The policy uses the Vertex AI Text embeddings API to generate embeddings for text andVector Search to find similar prompts based on semantic similarity, rather than exact matches.

The SemanticCacheLookup policy reduces response times for repeated queries and optimizes costs by reducing call volume to LLMs.

This policy works in conjunction with theSemanticCachePopulate policy.

This policy is anExtensible policy and use of this policy might have cost or utilization implications, depending on your Apigee license. For information on policy types and usage implications, seePolicy types.

Before you begin

Before you use the SemanticCacheLookup policy, complete the following tasks:

  • Create a Vertex AI project.
  • Create a Vector Search index.
  • Create a Vertex AI endpoint for the index.
  • Create a SemanticCachePopulate policy.

For more information on completing these tasks, seeGet started with Semantic Caching policies.

Required roles

To get the permissions that you need to apply and use the SemanticCacheLookup policy, ask your administrator to grant you theAI Platform User (roles/aiplatform.user) IAM role on the service account you use to deploy Apigee proxies. For more information about granting roles, seeManage access to projects, folders, and organizations.

You might also be able to get the required permissions throughcustom roles or otherpredefined roles.

Enable APIs

Enable the Compute Engine, Vertex AI, and Cloud Storage APIs.

Roles required to enable APIs

To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains theserviceusage.services.enable permission.Learn how to grant roles.

Enable the APIs

<SemanticCacheLookup> element

Defines a SemanticCacheLookup policy.

Default ValueSeeDefault Policy tab, below
Required?Required
TypeComplex object
Parent Element N/A
Child Elements<DisplayName>
<IgnoreUnresolvedVariables>
<UserPromptSource>
<Embeddings>
<SimilaritySearch>

The<SemanticCacheLookup> element uses the following syntax:

Syntax

The<SemanticCacheLookup> element uses the following syntax:

<SemanticCacheLookupasync="false"continueOnError="false"enabled="true"name="SCL-lookup">  <DisplayName>SCL-lookup</DisplayName>  <IgnoreUnresolvedVariables>false</IgnoreUnresolvedVariables>  <UserPromptSource>{jsonPath('$.contents[-1].parts[-1].text',request.content,true)}</UserPromptSource>  <Embeddings>    <VertexAI>      <URL>https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/{LOCATION}/publishers/google/models/{MODEL_ID}:predict</URL>    </VertexAI>  </Embeddings>  <SimilaritySearch>    <VertexAI>      <URL>https://{PUBLIC_DOMAIN_NAME}/v1/projects/{PROJECT_ID}/locations/{LOCATION}/indexEndpoints/{INDEX_ENDPOINT_ID}:findNeighbors</URL>      <DeployedIndexID>{DEPLOYED_INDEX_ID}</DeployedIndexID>      <Threshold>0.95</Threshold>    </VertexAI>  </SimilaritySearch></SemanticCacheLookup>

Default Policy

The following example shows the default settings when you add a SemanticCacheLookup policy to your flow in the Apigee UI:

<SemanticCacheLookupasync="false"continueOnError="false"enabled="true"name="SCL-lookup">  <DisplayName>SCL-lookup</DisplayName>  <IgnoreUnresolvedVariables>false</IgnoreUnresolvedVariables>  <UserPromptSource>{jsonPath('$.contents[-1].parts[-1].text',request.content,true)}</UserPromptSource>  <Embeddings>    <VertexAI>      <URL>https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/{LOCATION}/publishers/google/models/{MODEL_ID}:predict      </URL>    </VertexAI>  </Embeddings>  <SimilaritySearch>    <VertexAI>      <URL>https://{PUBLIC_DOMAIN_NAME}/v1/projects/{PROJECT_ID}/locations/{LOCATION}/indexEndpoints/{INDEX_ENDPOINT_ID}:findNeighbors</URL>      <Threshold>0.9</Threshold>      <DeployedIndexID>{DEPLOYED_INDEX_ID}</DeployedIndexID>    </VertexAI>  </SimilaritySearch></SemanticCacheLookup>

When you insert a new SemanticCacheLookup policy in the Apigee UI, the template contains stubs for allpossible operations. See below for information on required elements.

This element has the following attributes that are common to all policies:

AttributeDefaultRequired?Description
nameN/ARequired

The internal name of the policy. The value of thename attribute can contain letters, numbers, spaces, hyphens, underscores, and periods. This value cannot exceed 255 characters.

Optionally, use the<DisplayName> element to label the policy in the management UI proxy editor with a different, natural-language name.

continueOnErrorfalseOptionalSet tofalse to return an error when a policy fails. This is expected behavior for most policies. Set totrue to have flow execution continue even after a policy fails. See also:
enabledtrueOptionalSet totrue to enforce the policy. Set tofalse toturn off the policy. The policy will not be enforced even if it remains attached to a flow.
async  falseDeprecatedThis attribute is deprecated.

The following table provides a high level description of the child elements of<SemanticCacheLookup>:

Child ElementRequired?Description
<DisplayName>OptionalThe name of the policy.

<IgnoreUnresolvedVariables>OptionalDetermines whether processing stops when a variable is unresolved. Set totrue to ignore unresolved variables and continue processing.
<UserPromptSource>OptionalThe location of the payload for the user prompt text to be extracted. Only string text values are supported.

This field supports Apigee message template syntax, including the use of variables orJSON Path functions.

For example:

{jsonPath('$.contents[-1].parts[-1].text',request.content,true)}

<Embeddings>RequiredElement containing the information required to generate embeddings.
<SimilaritySearch>RequiredElement containing the information required to perform similarity searches.

For more information, seeQuery public index to get nearest neighbors.

Child element reference

This section describes the child elements of<SemanticCacheLookup>.

<DisplayName>

Use in addition to thename attribute to label the policy in the management UI proxy editor with a different, more natural-sounding name.

The<DisplayName> element is common to all policies.

Default ValueN/A
Required?Optional. If you omit<DisplayName>, the value of the policy'sname attribute is used.
TypeString
Parent Element <PolicyElement>
Child Elements None

The<DisplayName> element uses the following syntax:

Syntax

<PolicyElement><DisplayName>POLICY_DISPLAY_NAME</DisplayName>  ...</PolicyElement>

Example

<PolicyElement><DisplayName>My Validation Policy</DisplayName></PolicyElement>

The<DisplayName> element has no attributes or child elements.

<IgnoreUnresolvedVariables>

Determines whether processing stops when a variable is unresolved. Set totrue to ignore unresolved variables and continue processing.

IgnoreUnresolvedVariables is not applicable when<DefaultValue>is provided.

Default ValueFalse
Required?Optional
TypeBoolean
Parent Element<SemanticCacheLookup>
Child Elements None

<UserPromptSource>

The location of the payload for the user prompt text to be extracted. Only string text values are supported.

This field supports Apigee message template syntax, including the use ofvariables orJSON Path functions.

For example:

{jsonPath('$.contents[-1].parts[-1].text',request.content,true)}
Default Value{jsonPath('$.contents[-1].parts[-1].text',request.content,true)}
Required?Optional
TypeString
Parent Element<SemanticCacheLookup>
Child Elements None

<Embeddings>

This element contains the information required to generate text embeddings.

Default ValueN/A
Required?Optional
TypeString
Parent Element<SemanticCacheLookup>
Child Elements<VertexAI>

The<Embeddings> element uses the following syntax:

<Embeddings>  <VertexAI>    <URL>https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/{LOCATION}/publishers/google/models/{MODEL_ID}:predict</URL>  </VertexAI></Embeddings>

<VertexAI> (child of<Embeddings>)

Contains the <URL> element for Vertex AI-specific attributes.

Default ValueN/A
Required?Required
TypeString
Parent Element<Embeddings>
Child Elements<URL>

TheVertexAI element uses the following syntax:

<VertexAI>  <URL>https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/{LOCATION}/publishers/google/models/{MODEL_ID}:predict</URL></VertexAI>

<URL> (child of<VertexAI>)

The URL used to generate text embeddings. See Supported models for a list of models that provide text embeddings for the SemanticCacheLookup policy.

Default ValueN/A
Required?Required
TypeString
Parent Element<VertexAI>
Child Elements None

TheURL element uses the following syntax:

<URL>https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/{LOCATION}/publishers/google/models/{MODEL_ID}:predict</URL>

TheURL element supports the use of URL templating. If you wish, provide a variable in this elementto hold the value of the URL, as shown in the following example:

<URL>https://{URL_VARIABLE}</URL>

<SimilaritySearch>

This element contains the information required to perform similarity searches.

For more information, seeQuery public index to get nearest neighbors.

Default ValueN/A
Required?Required
TypeString
Parent Element<SemanticCacheLookup>
Child Elements<VertexAI>

The<SimilaritySearch> element uses the following syntax:

<SimilaritySearch>  <VertexAI>    <URL>https://{PUBLIC_DOMAIN_NAME}/v1/projects/{PROJECT_ID}/locations/{LOCATION}/indexEndpoints/{INDEX_ENDPOINT_ID}:findNeighbors    </URL>    <Threshold>0.9</Threshold>    <DeployedIndexID>{DEPLOYED_INDEX_ID}</DeployedIndexID>  </VertexAI></SimilaritySearch>

<VertexAI> (child of<SimilaritySearch>)

Contains the <URL> element for Vertex AI-specific attributes.

Default ValueN/A
Required?Required
TypeString
Parent Element<SimilaritySearch>
Child Elements<URL>

TheVertexAI element uses the following syntax:

<VertexAI>  <URL>https://{PUBLIC_DOMAIN_NAME}/v1/projects/{PROJECT_ID}/locations/{LOCATION}/indexEndpoints/{INDEX_ENDPOINT_ID}:findNeighbors</URL>  <Threshold>0.9</Threshold>  <DeployedIndexID>{DEPLOYED_INDEX_ID}</DeployedIndexID></VertexAI>

The following table provides a high-level description of the child elements of<VertexAI>.

Child ElementRequired?Description
<URL>RequiredString

The URL used to perform similarity searches. The highest matching data point, based on the similarity threshold, is the only data point used.

TheURL element supports the use of URL templating. If you wish, provide a variable in this elementto hold the value of the URL, as shown in the following example:

<URL>https://{URL_VARIABLE}</URL>

<Threshold>OptionalString

Similarity score used to determine if two prompts are considered a match. A value between 0 and 1.

The default value is 0.9.

See

<DeployedIndexID>RequiredString

The ID of the index deployed on the index endpoint used for semantic caching.

Flow variables

Flow variables configure dynamic runtime behavior for policies and flows, basedon HTTP headers or message content, or the context available in the Flow. For more informationabout flow variables, seeFlow variables reference.

This policy provides the following set ofread-only flow variables during execution. You can use these flow variables with theDataCapture policy to create custom analytics reports. For more information, seeCollecting customer data with the Data Capture policy.

Variable nameDescription
request.contentContains the full content of the incoming API request.
request.urlContains the URL of the incoming API request.
semanticcache.lookup.policy_name.user_promptContains specific components extracted from the request prompt, which is used for generating embeddings or performing similarity searches.
semanticcache.lookup.policy_name.embeddings_requestContains the request payload sent to the Vertex AI Embeddings API to generate text embeddings for the input text.
semanticcache.lookup.policy_name.embeddings_responseContains the response from the Vertex AI Embeddings API, which includes the generated text embeddings.
semanticcache.lookup.policy_name.dense_embeddingsContains the actual numerical embedding values generated by the Vertex AI Embeddings API.
semanticcache.lookup.policy_name.is_nearest_neighbor_hitSpecifies whether a nearest neighbor was found in the vector database for the given request and datapoint meets similarity threshold.
semanticcache.lookup.policy_name.cache_hitSpecifies whether the response was found in the semantic cache.
semanticcache.lookup.policy_name.cached_llm_responseContains the response retrieved from the semantic cache (if a cache hit occurred).

Error reference

This section describes the fault codes and error messages that Apigee returns and the fault variables that Apigee sets, specific to the<SemanticCacheLookup> policy. This information is important to know if you are developing fault rules to handle faults. To learn more, seeWhat you need to know about policy errors andHandlingfaults.

Runtime errors

These errors occur when the policy executes.

Fault codeHTTP statusCause
steps.semanticcache.lookup.MessageTemplateExtractionFailed400Failed to extract data from the request using the JSON Path expression.
steps.semanticcache.lookup.FailedToExtractUserPrompt500Unable to extract the user prompt from the API request.
steps.semanticcache.lookup.EmbeddingsServiceUnavailable400The Vertex AI Embeddings service is currently unavailable.
steps.semanticcache.lookup.EmbeddingsAPIFailed400The Vertex AI Embeddings service failed.
steps.semanticcache.lookup.VectorSearchServiceUnavailable400The Vertex AI Vector Search service is currently unavailable.
steps.semanticcache.lookup.VectorSearchAPIFailed400The Vertex AI Vector Search service failed.
steps.semanticcache.lookup.AuthenticationFailure500The service account doesn't have required permissions.
steps.semanticcache.lookup.InternalError500An unexpected error occurred within the SemanticCacheLookup policy.
steps.semanticcache.lookup.CalloutError500The Vertex AI service call failed.

Deployment errors

These errors occur when you deploy a proxy containing this policy.

Error nameCause
The Embeddings/VertexAI element is required.Occurs if the <VertexAI> element in <Embeddings> is empty.
The SimilaritySearch/VertexAI element is required.Occurs if the <VertexAI> element in <SimilaritySearch> is empty.
The Embeddings/URL element is required.Occurs if the <URL> element in <Embeddings> is empty.
The SimilaritySearch/URL element is required.Occurs if the <URL> element in <SimilaritySearch> is empty.
Embeddings URL {url} is invalid.Occurs if the <URL> element in <Embeddings> is empty or invalid.
The SimilaritySearch URL {url} is invalid.Occurs if the <URL> element in <SimilaritySearch> is empty or invalid.
The scheme {http-scheme} of Embeddings URL {url} must be one of http, https.Occurs if the Embeddings <URL> element'shttp scheme is invalid.
The scheme {http-scheme} of SimilaritySearch URL {url} must be one of http, https.Occurs if the SimilaritySearch <URL> element'shttp scheme is invalid.
SimilaritySearch/Threshold element must be >= 0 and<= 1.If the attribute is not between 0 and 1, then the deployment of the API proxy fails.
SimilaritySearch/DeployedIndexID element is required.Occurs if the <DeployedIndexID> element in <SimilaritySearch> is empty.
SimilaritySearch/DeployedIndexID element must not contain spaces.Occurs if the <DeployedIndexID> element in <SimilaritySearch> contains spaces.

Fault variables

This policy sets these variables when it triggers an error at runtime. For more information, seeWhat you need to know about policy errors.

VariablesWhereExample
fault.name="FAULT_NAME"FAULT_NAME is the name of the fault, as listed in theRuntime errors table above. The fault name is the last part of the fault code.fault.name Matches "UnresolvedVariable"
semanticcachelookup.POLICY_NAME.failedPOLICY_NAME is the user-specified name of the policy that threw the fault.semanticcachelookup.SC-lookup.failed = true

Example error response

Note: For error handling, the best practice is to trap theerrorcode part of the error response. Do not rely on the text in thefaultstring, because it could change.
{"fault":{"faultstring":"SemanticCacheLookup[SC-lookup]: unable to resolve variable [variable_name]","detail":{"errorcode":"steps.semanticcachelookup.UnresolvedVariable"}}}

Example fault rule

<FaultRule name="SemanticCacheLookup Faults">    <Step>        <Name>SCL-CustomSetVariableErrorResponse</Name>        <Condition>(fault.name = "SetVariableFailed")</Condition>    </Step>    <Condition>(semanticcachelookup.failed = true)</Condition></FaultRule>

Schemas

Each policy type is defined by an XML schema (.xsd). For reference,policy schemas are available on GitHub.

Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-17 UTC.