Migrating from StuffDocumentsChain
StuffDocumentsChain combines documents by concatenating them into a single context window. It is a straightforward and effective strategy for combining documents for question-answering, summarization, and other purposes.
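The "stuff" strategy can be pictured with a plain-Python sketch (this is illustrative only, not LangChain's implementation): every document's text is concatenated into a single context string that is embedded in one prompt.

```python
# Conceptual sketch of the "stuff" strategy: concatenate all document
# texts into one context string and place it in a single prompt.
docs = ["Apples are red", "Blueberries are blue", "Bananas are yellow"]
context = "\n\n".join(docs)
prompt = f"Summarize this content: {context}"
```

The whole corpus must fit in the model's context window, which is why this strategy suits small document sets.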
create_stuff_documents_chain is the recommended alternative. It functions the same as StuffDocumentsChain, with better support for streaming and batch functionality. Because it is a simple combination of LCEL primitives, it is also easier to extend and incorporate into other LangChain applications.
Below we will go through both StuffDocumentsChain and create_stuff_documents_chain on a simple example for illustrative purposes.
Let's first load a chat model:
pip install -qU "langchain[google-genai]"
import getpass
import os
if not os.environ.get("GOOGLE_API_KEY"):
    os.environ["GOOGLE_API_KEY"] = getpass.getpass("Enter API key for Google Gemini: ")

from langchain.chat_models import init_chat_model

llm = init_chat_model("gemini-2.0-flash", model_provider="google_genai")
Example
Let's go through an example where we analyze a set of documents. We first generate some simple documents for illustrative purposes:
from langchain_core.documents import Document

documents = [
    Document(page_content="Apples are red", metadata={"title": "apple_book"}),
    Document(page_content="Blueberries are blue", metadata={"title": "blueberry_book"}),
    Document(page_content="Bananas are yellow", metadata={"title": "banana_book"}),
]
Legacy
Details
Below we show an implementation with StuffDocumentsChain. We define the prompt template for a summarization task and instantiate an LLMChain object for this purpose. We define how documents are formatted into the prompt and ensure consistency among the keys in the various prompts.
from langchain.chains import LLMChain, StuffDocumentsChain
from langchain_core.prompts import ChatPromptTemplate, PromptTemplate

# This controls how each document will be formatted. Specifically,
# it will be passed to `format_document` - see that function for more
# details.
document_prompt = PromptTemplate(
    input_variables=["page_content"], template="{page_content}"
)
document_variable_name = "context"
# The prompt here should take as an input variable the
# `document_variable_name`
prompt = ChatPromptTemplate.from_template("Summarize this content: {context}")

llm_chain = LLMChain(llm=llm, prompt=prompt)
chain = StuffDocumentsChain(
    llm_chain=llm_chain,
    document_prompt=document_prompt,
    document_variable_name=document_variable_name,
)
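The document_prompt above renders only page_content, but it can also surface metadata such as the title. As a rough, hypothetical stand-in for what LangChain's format_document helper does (the function below is for illustration only, not the library's implementation):

```python
# Hypothetical stand-in for format_document: fill a template string
# from a document's page_content and its metadata fields.
def format_doc(template: str, page_content: str, metadata: dict) -> str:
    return template.format(page_content=page_content, **metadata)

formatted = format_doc(
    "{title}: {page_content}",
    "Apples are red",
    {"title": "apple_book"},
)
# formatted == "apple_book: Apples are red"
```

A document_prompt with template "{title}: {page_content}" would prefix each document with its title in the stuffed context.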
We can now invoke our chain:
result = chain.invoke(documents)
result["output_text"]
'This content describes the colors of different fruits: apples are red, blueberries are blue, and bananas are yellow.'
for chunk in chain.stream(documents):
    print(chunk)
{'input_documents': [Document(metadata={'title': 'apple_book'}, page_content='Apples are red'), Document(metadata={'title': 'blueberry_book'}, page_content='Blueberries are blue'), Document(metadata={'title': 'banana_book'}, page_content='Bananas are yellow')], 'output_text': 'This content describes the colors of different fruits: apples are red, blueberries are blue, and bananas are yellow.'}
LCEL
Details
Below we show an implementation using create_stuff_documents_chain:
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template("Summarize this content: {context}")
chain = create_stuff_documents_chain(llm, prompt)
Invoking the chain, we obtain a similar result as before:
result = chain.invoke({"context": documents})
result
'This content describes the colors of different fruits: apples are red, blueberries are blue, and bananas are yellow.'
Note that this implementation supports streaming of output tokens:
for chunk in chain.stream({"context": documents}):
    print(chunk, end=" | ")
| This | content | describes | the | colors | of | different | fruits | : | apples | are | red | , | blue | berries | are | blue | , | and | bananas | are | yellow | . | |
Next steps
Check out the LCEL conceptual docs for more background information.
See these how-to guides for more on question-answering tasks with RAG.
See this tutorial for more LLM-based summarization strategies.