
How to use a time-weighted vector store retriever

This retriever uses a combination of semantic similarity and a time decay.

The algorithm for scoring documents is:

semantic_similarity + (1.0 - decay_rate) ^ hours_passed

Notably, hours_passed refers to the hours passed since the object in the retriever was last accessed, not since it was created. This means that frequently accessed objects remain "fresh".
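
As a rough illustration of the formula above, the following sketch computes the combined score in plain Python. The function name `combined_score` and the sample values (a similarity of 0.5, a decay rate of 0.01) are made up for this example; it is not the retriever's internal code.

```python
def combined_score(
    semantic_similarity: float, decay_rate: float, hours_passed: float
) -> float:
    """Return semantic_similarity + (1.0 - decay_rate) ^ hours_passed."""
    return semantic_similarity + (1.0 - decay_rate) ** hours_passed


# A document accessed just now keeps the full recency bonus of 1.0.
fresh = combined_score(0.5, decay_rate=0.01, hours_passed=0)

# The same document, last accessed 48 hours ago, gets a smaller bonus.
stale = combined_score(0.5, decay_rate=0.01, hours_passed=48)

print(fresh, stale)
```

Because hours_passed counts from the last access, retrieving a document resets its recency term to 1.0.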

from datetime import datetime, timedelta

import faiss
from langchain.retrievers import TimeWeightedVectorStoreRetriever
from langchain_community.docstore import InMemoryDocstore
from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document
from langchain_openai import OpenAIEmbeddings

Low decay rate

A low decay rate (in this example, to be extreme, we will set it close to 0) means memories will be "remembered" for longer. A decay rate of 0 means memories are never forgotten, making this retriever equivalent to a plain vector lookup.

# Define your embedding model
embeddings_model = OpenAIEmbeddings()
# Initialize the vectorstore as empty
embedding_size = 1536
index = faiss.IndexFlatL2(embedding_size)
vectorstore = FAISS(embeddings_model, index, InMemoryDocstore({}), {})
retriever = TimeWeightedVectorStoreRetriever(
    vectorstore=vectorstore, decay_rate=0.0000000000000000000000001, k=1
)
yesterday = datetime.now() - timedelta(days=1)
retriever.add_documents(
    [Document(page_content="hello world", metadata={"last_accessed_at": yesterday})]
)
retriever.add_documents([Document(page_content="hello foo")])
['73679bc9-d425-49c2-9d74-de6356c73489']
# "hello world" is returned first because it is the most salient and, with a decay rate close to 0, it still counts as recent enough
retriever.invoke("hello world")
[Document(metadata={'last_accessed_at': datetime.datetime(2024, 10, 22, 16, 37, 40, 818583), 'created_at': datetime.datetime(2024, 10, 22, 16, 37, 37, 975074), 'buffer_idx': 0}, page_content='hello world')]

High decay rate

With a high decay rate (e.g., several 9's), the recency score quickly goes to 0! If you set this all the way to 1, recency is 0 for all objects, once again making this equivalent to a vector lookup.

# Define your embedding model
embeddings_model = OpenAIEmbeddings()
# Initialize the vectorstore as empty
embedding_size = 1536
index = faiss.IndexFlatL2(embedding_size)
vectorstore = FAISS(embeddings_model, index, InMemoryDocstore({}), {})
retriever = TimeWeightedVectorStoreRetriever(
    vectorstore=vectorstore, decay_rate=0.999, k=1
)
yesterday = datetime.now() - timedelta(days=1)
retriever.add_documents(
    [Document(page_content="hello world", metadata={"last_accessed_at": yesterday})]
)
retriever.add_documents([Document(page_content="hello foo")])
['379631f0-42c2-4773-8cc2-d36201e1e610']
# "hello foo" is returned first because "hello world" is mostly forgotten
retriever.invoke("hello world")
[Document(metadata={'last_accessed_at': datetime.datetime(2024, 10, 22, 16, 37, 46, 553633), 'created_at': datetime.datetime(2024, 10, 22, 16, 37, 43, 927429), 'buffer_idx': 1}, page_content='hello foo')]
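
To see why the two decay rates above produce different rankings, the sketch below plugs both into the scoring formula. The similarity values (1.0 for the exact match, 0.8 for "hello foo") are hypothetical placeholders, not numbers produced by the embedding model, and the helper names are invented for this illustration.

```python
def recency(decay_rate: float, hours_passed: float) -> float:
    """Recency term of the score: (1.0 - decay_rate) ^ hours_passed."""
    return (1.0 - decay_rate) ** hours_passed


# Hypothetical similarity of each document to the query "hello world".
similarity = {"hello world": 1.0, "hello foo": 0.8}
# "hello world" was last accessed a day ago; "hello foo" was just added.
hours_passed = {"hello world": 24.0, "hello foo": 0.0}


def top_document(decay_rate: float) -> str:
    """Return the document with the highest combined score."""
    scores = {
        text: similarity[text] + recency(decay_rate, hours_passed[text])
        for text in similarity
    }
    return max(scores, key=scores.get)


low = top_document(1e-25)   # recency stays at about 1.0 for both documents
high = top_document(0.999)  # recency of the day-old document collapses
print(low, "|", high)
```

With the near-zero decay rate both documents keep a recency term of about 1.0, so similarity decides the ranking. With a decay rate of 0.999 the day-old document's recency term is roughly 0.001 ** 24, which is effectively zero, so the freshly added document wins.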

Virtual time

Using some utils in LangChain, you can mock out the time component.

from langchain_core.utils import mock_now
# Notice that the last access time is now the mocked datetime

tomorrow = datetime.now() + timedelta(days=1)

with mock_now(tomorrow):
    print(retriever.invoke("hello world"))
[Document(metadata={'last_accessed_at': MockDateTime(2024, 10, 23, 16, 38, 19, 66711), 'created_at': datetime.datetime(2024, 10, 22, 16, 37, 43, 599877), 'buffer_idx': 0}, page_content='hello world')]
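
For intuition about what mocking the clock involves, here is a minimal standard-library sketch of a context manager that temporarily overrides `datetime.datetime.now()`. The name `fake_now` and the `FrozenDateTime` class are hypothetical, and this is not presented as how LangChain's `mock_now` is implemented; it only illustrates the general idea.

```python
import contextlib
import datetime


@contextlib.contextmanager
def fake_now(fixed: datetime.datetime):
    """Temporarily make datetime.datetime.now() return `fixed`."""

    class FrozenDateTime(datetime.datetime):
        @classmethod
        def now(cls, tz=None):
            return fixed

    real = datetime.datetime
    datetime.datetime = FrozenDateTime  # patch the module attribute
    try:
        yield
    finally:
        datetime.datetime = real  # always restore the real class


last_accessed = datetime.datetime(2024, 10, 22, 16, 0, 0)

# Pretend that one day has gone by since the last access.
with fake_now(last_accessed + datetime.timedelta(days=1)):
    elapsed = datetime.datetime.now() - last_accessed
    hours_passed = elapsed.total_seconds() / 3600

print(hours_passed)
```

One caveat of this approach: code that ran `from datetime import datetime` before the patch keeps its own reference to the real class and is not affected by the override.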
