ChatXAI

class langchain_xai.chat_models.ChatXAI

Bases: BaseChatOpenAI

ChatXAI chat model.

Refer to xAI’s documentation for more nuanced details on the API’s behavior and supported parameters.

Setup:

Install langchain-xai and set the environment variable XAI_API_KEY.

    pip install -U langchain-xai
    export XAI_API_KEY="your-api-key"

Key init args — completion params:
model: str

Name of model to use.

temperature: float

Sampling temperature between 0 and 2. Higher values mean more random completions, while lower values (like 0.2) mean more focused and deterministic completions. (Default: 1.)

max_tokens: Optional[int]

Max number of tokens to generate. Refer to your model’s documentation for the maximum number of tokens it can generate.

logprobs: Optional[bool]

Whether to return logprobs.

Key init args — client params:
timeout: Union[float, Tuple[float, float], Any, None]

Timeout for requests.

max_retries: int

Max number of retries.

api_key: Optional[str]

xAI API key. If not passed in, it will be read from the env var XAI_API_KEY.

Instantiate:
    from langchain_xai import ChatXAI

    llm = ChatXAI(
        model="grok-4",
        temperature=0,
        max_tokens=None,
        timeout=None,
        max_retries=2,
        # api_key="...",
        # other params...
    )

Invoke:
    messages = [
        (
            "system",
            "You are a helpful translator. Translate the user sentence to French.",
        ),
        ("human", "I love programming."),
    ]
    llm.invoke(messages)

AIMessage(content="J'adore la programmation.",response_metadata={'token_usage':{'completion_tokens':9,'prompt_tokens':32,'total_tokens':41},'model_name':'grok-4','system_fingerprint':None,'finish_reason':'stop','logprobs':None},id='run-168dceca-3b8b-4283-94e3-4c739dbc1525-0',usage_metadata={'input_tokens':32,'output_tokens':9,'total_tokens':41})
Stream:
    for chunk in llm.stream(messages):
        print(chunk.text(), end="")

    content='J' id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
    content="'" id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
    content='ad' id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
    content='ore' id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
    content=' la' id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
    content=' programm' id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
    content='ation' id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
    content='.' id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'
    content='' response_metadata={'finish_reason':'stop','model_name':'grok-4'} id='run-1bc996b5-293f-4114-96a1-e0f755c05eb9'

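
The chunk contents in the streamed output above concatenate into the full reply. A minimal sketch of that accumulation, using plain strings copied from the output in place of real AIMessageChunk objects so that it runs without an API key (the variable names are illustrative only):

```python
# Chunk contents copied from the streamed output above. Real chunks are
# AIMessageChunk objects; plain strings stand in for them here.
chunk_contents = ["J", "'", "ad", "ore", " la", " programm", "ation", ".", ""]

# Joining the pieces in arrival order rebuilds the complete message text.
full_text = "".join(chunk_contents)
print(full_text)
```

The final empty chunk carries only response metadata, so it adds nothing to the text.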
Async:
    await llm.ainvoke(messages)

    # stream:
    # async for chunk in llm.astream(messages): ...

    # batch:
    # await llm.abatch([messages])

AIMessage(content="J'adore la programmation.",response_metadata={'token_usage':{'completion_tokens':9,'prompt_tokens':32,'total_tokens':41},'model_name':'grok-4','system_fingerprint':None,'finish_reason':'stop','logprobs':None},id='run-09371a11-7f72-4c53-8e7c-9de5c238b34c-0',usage_metadata={'input_tokens':32,'output_tokens':9,'total_tokens':41})
Reasoning:

Certain xAI models support reasoning, which allows the model to provide reasoning content along with the response.

If provided, reasoning content is returned under the additional_kwargs field of the AIMessage or AIMessageChunk.

If supported, reasoning effort can be specified in the model constructor’s extra_body argument, which controls the amount of reasoning the model does. The value can be one of 'low' or 'high'.

    model = ChatXAI(
        model="grok-3-mini",
        extra_body={"reasoning_effort": "high"},
    )

Note

As of 2025-07-10, reasoning_content is only returned in Grok 3 models, such as Grok 3 Mini.

Note

In Grok 4, as of 2025-07-10, reasoning is not exposed in reasoning_content (other than initial 'Thinking...' text), reasoning cannot be disabled, and the reasoning_effort cannot be specified.

Tool calling / function calling:
    from pydantic import BaseModel, Field

    llm = ChatXAI(model="grok-4")


    class GetWeather(BaseModel):
        '''Get the current weather in a given location'''

        location: str = Field(..., description="The city and state, e.g. San Francisco, CA")


    class GetPopulation(BaseModel):
        '''Get the current population in a given location'''

        location: str = Field(..., description="The city and state, e.g. San Francisco, CA")


    llm_with_tools = llm.bind_tools([GetWeather, GetPopulation])
    ai_msg = llm_with_tools.invoke("Which city is bigger: LA or NY?")
    ai_msg.tool_calls

[{'name':'GetPopulation','args':{'location':'NY'},'id':'call_m5tstyn2004pre9bfuxvom8x','type':'tool_call'},{'name':'GetPopulation','args':{'location':'LA'},'id':'call_0vjgq455gq1av5sp9eb1pw6a','type':'tool_call'}]

Note

With stream response, the tool / function call will be returned in whole in a single chunk, instead of being streamed across chunks.

Tool choice can be controlled by setting the tool_choice parameter in the model constructor’s extra_body argument. For example, to disable tool / function calling:

    llm = ChatXAI(model="grok-4", extra_body={"tool_choice": "none"})

To require that the model always calls a tool / function, set tool_choice to 'required':

    llm = ChatXAI(model="grok-4", extra_body={"tool_choice": "required"})

To specify a tool / function to call, set tool_choice to the name of the tool / function:

    from pydantic import BaseModel, Field

    llm = ChatXAI(
        model="grok-4",
        extra_body={
            "tool_choice": {"type": "function", "function": {"name": "GetWeather"}}
        },
    )


    class GetWeather(BaseModel):
        """Get the current weather in a given location"""

        location: str = Field(..., description='The city and state, e.g. San Francisco, CA')


    class GetPopulation(BaseModel):
        """Get the current population in a given location"""

        location: str = Field(..., description='The city and state, e.g. San Francisco, CA')


    llm_with_tools = llm.bind_tools([GetWeather, GetPopulation])
    ai_msg = llm_with_tools.invoke(
        "Which city is bigger: LA or NY?",
    )
    ai_msg.tool_calls

The resulting tool call would be:

[{'name':'GetWeather','args':{'location':'Los Angeles, CA'},'id':'call_81668711','type':'tool_call'}]
Parallel tool calling / parallel function calling:

By default, parallel tool / function calling is enabled, so you can process multiple function calls in one request/response cycle. When two or more tool calls are required, all of the tool call requests will be included in the response body.
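
When a response carries several tool calls, the caller typically executes each one and pairs the result with its call id. The sketch below assumes the tool call shape shown in the earlier example output. The two local functions and the `TOOL_REGISTRY` mapping are hypothetical stand-ins, not part of langchain-xai.

```python
# Hypothetical local implementations, named after the tool schemas above.
def get_population(location: str) -> str:
    return f"population lookup for {location}"


def get_weather(location: str) -> str:
    return f"weather lookup for {location}"


# Hypothetical mapping from tool name (as reported by the model) to a callable.
TOOL_REGISTRY = {"GetPopulation": get_population, "GetWeather": get_weather}

# Same shape as ai_msg.tool_calls in the example output above.
tool_calls = [
    {"name": "GetPopulation", "args": {"location": "NY"},
     "id": "call_m5tstyn2004pre9bfuxvom8x", "type": "tool_call"},
    {"name": "GetPopulation", "args": {"location": "LA"},
     "id": "call_0vjgq455gq1av5sp9eb1pw6a", "type": "tool_call"},
]

# Execute every requested call and key each result by its tool call id.
results = {
    call["id"]: TOOL_REGISTRY[call["name"]](**call["args"])
    for call in tool_calls
}
```

Keying results by id keeps each answer tied to the request that produced it, which matters once more than one call targets the same tool.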

Structured output:
    from typing import Optional

    from pydantic import BaseModel, Field


    class Joke(BaseModel):
        '''Joke to tell user.'''

        setup: str = Field(description="The setup of the joke")
        punchline: str = Field(description="The punchline to the joke")
        rating: Optional[int] = Field(description="How funny the joke is, from 1 to 10")


    structured_llm = llm.with_structured_output(Joke)
    structured_llm.invoke("Tell me a joke about cats")

Joke(setup='Why was the cat sitting on the computer?',punchline='To keep an eye on the mouse!',rating=7)
Live Search:

xAI supports a Live Search feature that enables Grok to ground its answers using results from web searches.

    from langchain_xai import ChatXAI

    llm = ChatXAI(
        model="grok-4",
        search_parameters={
            "mode": "auto",
            # Example optional parameters below:
            "max_search_results": 3,
            "from_date": "2025-05-26",
            "to_date": "2025-05-27",
        },
    )

    llm.invoke("Provide me a digest of world news in the last 24 hours.")

Note

Citations are only available in Grok 3.

Token usage:
    ai_msg = llm.invoke(messages)
    ai_msg.usage_metadata

{'input_tokens':37,'output_tokens':6,'total_tokens':43}
Logprobs:
    logprobs_llm = llm.bind(logprobs=True)
    messages = [("human", "Say Hello World! Do not return anything else.")]
    ai_msg = logprobs_llm.invoke(messages)
    ai_msg.response_metadata["logprobs"]

{'content':None,'token_ids':[22557,3304,28808,2],'tokens':[' Hello',' World','!','</s>'],'token_logprobs':[-4.7683716e-06,-5.9604645e-07,0,-0.057373047]}
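
Log probabilities are natural logarithms, so exponentiating one gives the per-token probability and summing them gives the log probability of the whole sequence. A small sketch using the `token_logprobs` values from the output above (the variable names are illustrative):

```python
import math

# Values copied from the 'token_logprobs' field in the output above.
token_logprobs = [-4.7683716e-06, -5.9604645e-07, 0, -0.057373047]

# exp(logprob) recovers the probability assigned to each sampled token.
token_probs = [math.exp(lp) for lp in token_logprobs]

# Summing log probabilities multiplies the underlying probabilities.
sequence_logprob = sum(token_logprobs)
sequence_prob = math.exp(sequence_logprob)
```

A log probability of 0 corresponds to a probability of exactly 1, which is why the third token contributes nothing to the sum.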
Response metadata:
    ai_msg = llm.invoke(messages)
    ai_msg.response_metadata

{'token_usage':{'completion_tokens':4,'prompt_tokens':19,'total_tokens':23},'model_name':'grok-4','system_fingerprint':None,'finish_reason':'stop','logprobs':None}

Note

ChatXAI implements the standard Runnable Interface. 🏃

The Runnable Interface has additional methods that are available on runnables, such as with_config, with_types, with_retry, assign, bind, get_graph, and more.

param cache: BaseCache | bool | None = None

Whether to cache the response.

  • If true, will use the global cache.

  • If false, will not use a cache.

  • If None, will use the global cache if it’s set, otherwise no cache.

  • If instance of BaseCache, will use the provided cache.

Caching is not currently supported for streaming methods of models.
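
The four cases above amount to a small resolution rule. The sketch below restates it in standalone Python; `DictCache` and `resolve_cache` are hypothetical names used for illustration, and raising an error when `cache=True` but no global cache exists is an assumption rather than something stated above.

```python
from typing import Optional, Union


class DictCache:
    """Hypothetical stand-in for a BaseCache implementation."""

    def __init__(self) -> None:
        self.store: dict = {}


def resolve_cache(
    cache: Union[DictCache, bool, None],
    global_cache: Optional[DictCache],
) -> Optional[DictCache]:
    """Return the cache to use for a call, or None for no caching."""
    if isinstance(cache, DictCache):
        return cache  # an explicit cache instance always wins
    if cache is True:
        if global_cache is None:
            # Assumption: asking for the global cache without one set is an error.
            raise ValueError("cache=True requires a global cache to be set")
        return global_cache
    if cache is False:
        return None  # caching explicitly switched off
    return global_cache  # cache=None: global cache if set, otherwise None
```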

param callback_manager: BaseCallbackManager | None = None

Deprecated since version 0.1.7: Use callbacks() instead. It will be removed in pydantic==1.0.

Callback manager to add to the run trace.

param callbacks: Callbacks = None

Callbacks to add to the run trace.

param custom_get_token_ids: Callable[[str], list[int]] | None = None

Optional encoder to use for counting tokens.

param default_headers: Mapping[str, str] | None = None
param default_query: Mapping[str, object] | None = None
param disable_streaming: bool | Literal['tool_calling'] = False

Whether to disable streaming for this model.

If streaming is bypassed, then stream()/astream()/astream_events() will defer to invoke()/ainvoke().

  • If True, will always bypass streaming case.

  • If 'tool_calling', will bypass streaming case only when the model is called with a tools keyword argument. In other words, LangChain will automatically switch to non-streaming behavior (invoke()) only when the tools argument is provided. This offers the best of both worlds.

  • If False (default), will always use streaming case if available.

The main reason for this flag is that code might be written using .stream() and a user may want to swap out a given model for another model whose implementation does not properly support streaming.
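
The three settings above reduce to one decision per call. A minimal sketch of that decision, assuming a hypothetical helper named `should_stream` (this is not LangChain's internal function):

```python
from typing import Literal, Union


def should_stream(
    disable_streaming: Union[bool, Literal["tool_calling"]],
    has_tools: bool,
) -> bool:
    """Decide whether a stream() call actually streams or defers to invoke()."""
    if disable_streaming is True:
        return False  # always bypass streaming
    if disable_streaming == "tool_calling":
        return not has_tools  # bypass only when tools are supplied
    return True  # False (default): stream whenever available
```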

param disabled_params: dict[str, Any] | None = None

Parameters of the OpenAI client or chat.completions endpoint that should be disabled for the given model.

Should be specified as {"param": None | ['val1', 'val2']} where the key is the parameter and the value is either None, meaning that parameter should never be used, or it’s a list of disabled values for the parameter.

For example, older models may not support the 'parallel_tool_calls' parameter at all, in which case disabled_params={"parallel_tool_calls": None} can be passed in.

If a parameter is disabled then it will not be used by default in any methods, e.g. in with_structured_output(). However, this does not prevent a user from directly passing in the parameter during invocation.
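
The `{"param": None | [...]}` convention can be read as a filter over default request parameters. The following sketch shows one way such a filter could behave; `filter_disabled_params` is a hypothetical helper written for illustration, not the library's implementation.

```python
from typing import Any, Optional


def filter_disabled_params(
    params: dict[str, Any],
    disabled_params: Optional[dict[str, Any]],
) -> dict[str, Any]:
    """Drop parameters that are disabled outright or set to a disabled value."""
    if not disabled_params:
        return dict(params)
    kept = {}
    for name, value in params.items():
        if name in disabled_params:
            disabled_values = disabled_params[name]
            # None disables the parameter entirely; a list disables specific values.
            if disabled_values is None or value in disabled_values:
                continue
        kept[name] = value
    return kept
```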

param extra_body: Mapping[str, Any] | None = None

Optional additional JSON properties to include in the request parameters when making requests to OpenAI compatible APIs, such as vLLM.

param frequency_penalty: float | None = None

Penalizes repeated tokens according to frequency.

param http_async_client: Any | None = None

Optional httpx.AsyncClient. Only used for async invocations. Must specify http_client as well if you’d like a custom client for sync invocations.

param http_client: Any | None = None

Optional httpx.Client. Only used for sync invocations. Must specify http_async_client as well if you’d like a custom client for async invocations.

param include: list[str] | None = None

Additional fields to include in generations from Responses API.

Supported values:

  • "file_search_call.results"

  • "message.input_image.image_url"

  • "computer_call_output.output.image_url"

  • "reasoning.encrypted_content"

  • "code_interpreter_call.outputs"

Added in version 0.3.24.

param include_response_headers: bool = False

Whether to include response headers in the output message response_metadata.

param logit_bias: dict[int, int] | None = None

Modify the likelihood of specified tokens appearing in the completion.

param logprobs: bool | None = None

Whether to return logprobs.

param max_retries: int | None = None

Maximum number of retries to make when generating.

param max_tokens: int | None = None

Maximum number of tokens to generate.

param metadata: dict[str, Any] | None = None

Metadata to add to the run trace.

param model_kwargs: dict[str, Any] [Optional]

Holds any model parameters valid for create call not explicitly specified.

param model_name: str = 'grok-4' (alias 'model')

Model name to use.

param n: int | None = None

Number of chat completions to generate for each prompt.

param openai_api_base: str | None = None

Base URL path for API requests, leave blank if not using a proxy or service emulator.

param openai_api_key: SecretStr | None = None
param openai_organization: str | None = None (alias 'organization')

Automatically inferred from env var OPENAI_ORG_ID if not provided.

param openai_proxy: str | None [Optional]
param output_version: Literal['v0', 'responses/v1'] = 'v0'

Version of AIMessage output format to use.

This field is used to roll out new output formats for chat model AIMessages in a backwards-compatible way.

Supported values:

  • "v0": AIMessage format as of langchain-openai 0.3.x.

  • "responses/v1": Formats Responses API output items into AIMessage content blocks.

Currently only impacts the Responses API. output_version="responses/v1" is recommended.

Added in version 0.3.25.

param presence_penalty: float | None = None

Penalizes repeated tokens.

param rate_limiter: BaseRateLimiter | None = None

An optional rate limiter to use for limiting the number of requests.

param reasoning: dict[str, Any] | None = None

Reasoning parameters for reasoning models, i.e., OpenAI o-series models (o1, o3, o4-mini, etc.). For use with the Responses API.

Example:

    reasoning = {
        "effort": "medium",  # can be "low", "medium", or "high"
        "summary": "auto",  # can be "auto", "concise", or "detailed"
    }

Added in version 0.3.24.

param reasoning_effort: str | None = None

Constrains effort on reasoning for reasoning models. For use with the Chat Completions API.

Reasoning models only, like OpenAI o1, o3, and o4-mini.

Currently supported values are low, medium, and high. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.

Added in version 0.2.14.

param request_timeout: float | tuple[float, float] | Any | None = None (alias 'timeout')

Timeout for requests to OpenAI completion API. Can be float, httpx.Timeout or None.

param search_parameters: dict[str, Any] | None = None

Parameters for search requests. Example: {"mode": "auto"}.

param seed: int | None = None

Seed for generation.

param service_tier: str | None = None

Latency tier for request. Options are 'auto', 'default', or 'flex'. Relevant for users of OpenAI’s scale tier service.

param stop: list[str] | str | None = None (alias 'stop_sequences')

Default stop sequences.

param store: bool | None = None

If True, OpenAI may store response data for future use. Defaults to True for the Responses API and False for the Chat Completions API.

Added in version 0.3.24.

param stream_usage: bool = False

Whether to include usage metadata in streaming output. If True, an additional message chunk will be generated during the stream including usage metadata.

Added in version 0.3.9.

param streaming: bool = False

Whether to stream the results or not.

param tags: list[str] | None = None

Tags to add to the run trace.

param temperature: float | None = None

What sampling temperature to use.

param tiktoken_model_name: str | None = None

The model name to pass to tiktoken when using this class. Tiktoken is used to count the number of tokens in documents to constrain them to be under a certain limit. By default, when set to None, this will be the same as the embedding model name. However, there are some cases where you may want to use this Embedding class with a model name not supported by tiktoken. This can include when using Azure embeddings or when using one of the many model providers that expose an OpenAI-like API but with different models. In those cases, in order to avoid erroring when tiktoken is called, you can specify a model name to use here.

param top_logprobs: int | None = None

Number of most likely tokens to return at each token position, each with an associated log probability. logprobs must be set to true if this parameter is used.

param top_p: float | None = None

Total probability mass of tokens to consider at each step.

param truncation: str | None = None

Truncation strategy (Responses API). Can be 'auto' or 'disabled' (default). If 'auto', model may drop input items from the middle of the message sequence to fit the context window.

Added in version 0.3.24.

param use_previous_response_id: bool = False

If True, always pass previous_response_id using the ID of the most recent response. Responses API only.

Input messages up to the most recent response will be dropped from request payloads.

For example, the following two are equivalent:

    llm = ChatOpenAI(
        model="o4-mini",
        use_previous_response_id=True,
    )
    llm.invoke(
        [
            HumanMessage("Hello"),
            AIMessage("Hi there!", response_metadata={"id": "resp_123"}),
            HumanMessage("How are you?"),
        ]
    )

    llm = ChatOpenAI(
        model="o4-mini",
        use_responses_api=True,
    )
    llm.invoke([HumanMessage("How are you?")], previous_response_id="resp_123")

Added in version 0.3.26.

param use_responses_api: bool | None = None

Whether to use the Responses API instead of the Chat API.

If not specified then will be inferred based on invocation params.

Added in version 0.3.9.

param verbose: bool [Optional]

Whether to print out response text.

param xai_api_base: str = 'https://api.x.ai/v1/'

Base URL path for API requests.

param xai_api_key: SecretStr | None [Optional] (alias 'api_key')

xAI API key.

Automatically read from env variable XAI_API_KEY if not provided.

__call__(
    messages: list[BaseMessage],
    stop: list[str] | None = None,
    callbacks: list[BaseCallbackHandler] | BaseCallbackManager | None = None,
    **kwargs: Any,
) → BaseMessage

Deprecated since version 0.1.7: Use invoke() instead. It will not be removed until langchain-core==1.0.

Call the model.

Parameters:
  • messages (list[BaseMessage]) – List of messages.

  • stop (list[str] | None) – Stop words to use when generating. Model output is cut off at the first occurrence of any of these substrings.

  • callbacks (list[BaseCallbackHandler] | BaseCallbackManager | None) – Callbacks to pass through. Used for executing additional functionality, such as logging or streaming, throughout generation.

  • **kwargs (Any) – Arbitrary additional keyword arguments. These are usually passed to the model provider API call.

Returns:

The model output message.

Return type:

BaseMessage
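
The effect of the stop argument can be pictured as truncating text at the earliest stop substring. The sketch below only illustrates that semantics on a plain string; `cut_at_stop` is a hypothetical helper, and in practice the cut-off may be applied by the model provider rather than by client code.

```python
from typing import Optional


def cut_at_stop(text: str, stop: Optional[list[str]] = None) -> str:
    """Return text truncated at the first occurrence of any stop substring."""
    if not stop:
        return text
    # Positions of every stop substring that actually occurs in the text.
    positions = [text.find(s) for s in stop if s and s in text]
    return text[: min(positions)] if positions else text


truncated = cut_at_stop("One two three four five.", ["three"])
```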

async abatch(
    inputs: list[Input],
    config: RunnableConfig | list[RunnableConfig] | None = None,
    *,
    return_exceptions: bool = False,
    **kwargs: Any | None,
) → list[Output]

Default implementation runs ainvoke in parallel using asyncio.gather.

The default implementation of batch works well for IO bound runnables.

Subclasses should override this method if they can batch more efficiently; e.g., if the underlying Runnable uses an API which supports a batch mode.

Parameters:
  • inputs (list[Input]) – A list of inputs to the Runnable.

  • config (RunnableConfig | list[RunnableConfig] | None) – A config to use when invoking the Runnable. The config supports standard keys like 'tags', 'metadata' for tracing purposes, 'max_concurrency' for controlling how much work to do in parallel, and other keys. Please refer to the RunnableConfig for more details. Defaults to None.

  • return_exceptions (bool) – Whether to return exceptions instead of raising them. Defaults to False.

  • kwargs (Any | None) – Additional keyword arguments to pass to the Runnable.

Returns:

A list of outputs from the Runnable.

Return type:

list[Output]
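
The description above, running ainvoke in parallel with asyncio.gather and bounding the work with max_concurrency, can be sketched with the standard library alone. `fake_ainvoke` and `abatch_sketch` are hypothetical stand-ins; this is an illustration of the pattern, not LangChain's code.

```python
import asyncio


async def fake_ainvoke(text: str) -> str:
    """Hypothetical stand-in for a model call; upper-cases after a short wait."""
    await asyncio.sleep(0.01)
    return text.upper()


async def abatch_sketch(inputs, max_concurrency=None, return_exceptions=False):
    # A semaphore caps how many calls are in flight at the same time.
    semaphore = asyncio.Semaphore(max_concurrency) if max_concurrency else None

    async def run_one(item):
        if semaphore is None:
            return await fake_ainvoke(item)
        async with semaphore:
            return await fake_ainvoke(item)

    # gather preserves input order regardless of completion order.
    return await asyncio.gather(
        *(run_one(item) for item in inputs),
        return_exceptions=return_exceptions,
    )


outputs = asyncio.run(abatch_sketch(["a", "b", "c"], max_concurrency=2))
```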

async abatch_as_completed(
    inputs: Sequence[Input],
    config: RunnableConfig | Sequence[RunnableConfig] | None = None,
    *,
    return_exceptions: bool = False,
    **kwargs: Any | None,
) → AsyncIterator[tuple[int, Output | Exception]]

Run ainvoke in parallel on a list of inputs.

Yields results as they complete.

Parameters:
  • inputs (Sequence[Input]) – A list of inputs to the Runnable.

  • config (RunnableConfig | Sequence[RunnableConfig] | None) – A config to use when invoking the Runnable. The config supports standard keys like 'tags', 'metadata' for tracing purposes, 'max_concurrency' for controlling how much work to do in parallel, and other keys. Please refer to the RunnableConfig for more details. Defaults to None.

  • return_exceptions (bool) – Whether to return exceptions instead of raising them. Defaults to False.

  • kwargs (Any | None) – Additional keyword arguments to pass to the Runnable.

Yields:

A tuple of the index of the input and the output from the Runnable.

Return type:

AsyncIterator[tuple[int,Output | Exception]]
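
Because results arrive in completion order, each one is paired with the index of its input. A standard-library sketch of that behavior, with `fake_ainvoke` and `abatch_as_completed_sketch` as hypothetical names (exception handling is omitted for brevity):

```python
import asyncio


async def fake_ainvoke(delay: float) -> str:
    """Hypothetical stand-in for a model call that takes `delay` seconds."""
    await asyncio.sleep(delay)
    return f"slept {delay}"


async def abatch_as_completed_sketch(inputs):
    async def run_one(index, item):
        # Carry the input index along so the caller can match results to inputs.
        return index, await fake_ainvoke(item)

    tasks = [asyncio.ensure_future(run_one(i, item)) for i, item in enumerate(inputs)]
    for finished in asyncio.as_completed(tasks):
        yield await finished


async def collect():
    return [pair async for pair in abatch_as_completed_sketch([0.05, 0.01])]


results = asyncio.run(collect())
```

With these delays the second input normally finishes first, so its `(index, output)` pair is usually yielded before the first one.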

async ainvoke(
    input: LanguageModelInput,
    config: RunnableConfig | None = None,
    *,
    stop: list[str] | None = None,
    **kwargs: Any,
) → BaseMessage

Default implementation of ainvoke, calls invoke from a thread.

The default implementation allows usage of async code even if the Runnable did not implement a native async version of invoke.

Subclasses should override this method if they can run asynchronously.

Parameters:
  • input (LanguageModelInput)

  • config (Optional[RunnableConfig])

  • stop (Optional[list[str]])

  • kwargs (Any)

Return type:

BaseMessage
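
The "calls invoke from a thread" pattern described above can be sketched with asyncio's default executor. `blocking_invoke` and `ainvoke_sketch` are hypothetical names; the snippet illustrates the general technique rather than the code path this class actually takes.

```python
import asyncio
import threading


def blocking_invoke(prompt: str) -> str:
    """Hypothetical synchronous call; reports which thread executed it."""
    return f"{prompt} (ran in {threading.current_thread().name})"


async def ainvoke_sketch(prompt: str) -> str:
    loop = asyncio.get_running_loop()
    # Run the blocking function in the default thread pool so the event
    # loop stays free to schedule other coroutines meanwhile.
    return await loop.run_in_executor(None, blocking_invoke, prompt)


reply = asyncio.run(ainvoke_sketch("hello"))
```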

async astream(
    input: LanguageModelInput,
    config: RunnableConfig | None = None,
    *,
    stop: list[str] | None = None,
    **kwargs: Any,
) → AsyncIterator[BaseMessageChunk]

Default implementation of astream, which calls ainvoke.

Subclasses should override this method if they support streaming output.

Parameters:
  • input (LanguageModelInput) – The input to the Runnable.

  • config (Optional[RunnableConfig]) – The config to use for the Runnable. Defaults to None.

  • kwargs (Any) – Additional keyword arguments to pass to the Runnable.

  • stop (Optional[list[str]])

Yields:

The output of the Runnable.

Return type:

AsyncIterator[BaseMessageChunk]

async astream_events(
    input: Any,
    config: RunnableConfig | None = None,
    *,
    version: Literal['v1', 'v2'] = 'v2',
    include_names: Sequence[str] | None = None,
    include_types: Sequence[str] | None = None,
    include_tags: Sequence[str] | None = None,
    exclude_names: Sequence[str] | None = None,
    exclude_types: Sequence[str] | None = None,
    exclude_tags: Sequence[str] | None = None,
    **kwargs: Any,
) → AsyncIterator[StreamEvent]

Generate a stream of events.

Use to create an iterator over StreamEvents that provide real-time information about the progress of the Runnable, including StreamEvents from intermediate results.

A StreamEvent is a dictionary with the following schema:

  • event: str - Event names are of the format: on_[runnable_type]_(start|stream|end).

  • name: str - The name of the Runnable that generated the event.

  • run_id: str - Randomly generated ID associated with the given execution of the Runnable that emitted the event. A child Runnable that gets invoked as part of the execution of a parent Runnable is assigned its own unique ID.

  • parent_ids: list[str] - The IDs of the parent runnables that generated the event. The root Runnable will have an empty list. The order of the parent IDs is from the root to the immediate parent. Only available for v2 version of the API. The v1 version of the API will return an empty list.

  • tags: Optional[list[str]] - The tags of the Runnable that generated the event.

  • metadata: Optional[dict[str, Any]] - The metadata of the Runnable that generated the event.

  • data: dict[str, Any]

Below is a table that illustrates some events that might be emitted by various chains. Metadata fields have been omitted from the table for brevity. Chain definitions have been included after the table.

Note

This reference table is for the V2 version of the schema.

| event                | name             | chunk                           | input                                         | output                                          |
|----------------------|------------------|---------------------------------|-----------------------------------------------|-------------------------------------------------|
| on_chat_model_start  | [model name]     |                                 | {"messages": [[SystemMessage, HumanMessage]]} |                                                 |
| on_chat_model_stream | [model name]     | AIMessageChunk(content="hello") |                                               |                                                 |
| on_chat_model_end    | [model name]     |                                 | {"messages": [[SystemMessage, HumanMessage]]} | AIMessageChunk(content="hello world")           |
| on_llm_start         | [model name]     |                                 | {'input': 'hello'}                            |                                                 |
| on_llm_stream        | [model name]     | 'Hello'                         |                                               |                                                 |
| on_llm_end           | [model name]     |                                 |                                               | 'Hello human!'                                  |
| on_chain_start       | format_docs      |                                 |                                               |                                                 |
| on_chain_stream      | format_docs      | "hello world!, goodbye world!"  |                                               |                                                 |
| on_chain_end         | format_docs      |                                 | [Document(…)]                                 | "hello world!, goodbye world!"                  |
| on_tool_start        | some_tool        |                                 | {"x": 1, "y": "2"}                            |                                                 |
| on_tool_end          | some_tool        |                                 |                                               | {"x": 1, "y": "2"}                              |
| on_retriever_start   | [retriever name] |                                 | {"query": "hello"}                            |                                                 |
| on_retriever_end     | [retriever name] |                                 | {"query": "hello"}                            | [Document(…), ..]                               |
| on_prompt_start      | [template_name]  |                                 | {"question": "hello"}                         |                                                 |
| on_prompt_end        | [template_name]  |                                 | {"question": "hello"}                         | ChatPromptValue(messages: [SystemMessage, …])   |

In addition to the standard events, users can also dispatch custom events (see example below).

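Custom events will only be surfaced in the v2 version of the API!

A custom event has the following format:

| Attribute | Type | Description                                                                                              |
|-----------|------|----------------------------------------------------------------------------------------------------------|
| name      | str  | A user defined name for the event.                                                                       |
| data      | Any  | The data associated with the event. This can be anything, though we suggest making it JSON serializable. |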

Here are declarations associated with the standard events shown above:

format_docs:

    def format_docs(docs: list[Document]) -> str:
        '''Format the docs.'''
        return ", ".join([doc.page_content for doc in docs])


    format_docs = RunnableLambda(format_docs)

some_tool:

    @tool
    def some_tool(x: int, y: str) -> dict:
        '''Some_tool.'''
        return {"x": x, "y": y}

prompt:

    template = ChatPromptTemplate.from_messages(
        [("system", "You are Cat Agent 007"), ("human", "{question}")]
    ).with_config({"run_name": "my_template", "tags": ["my_template"]})

Example:

    from langchain_core.runnables import RunnableLambda


    async def reverse(s: str) -> str:
        return s[::-1]


    chain = RunnableLambda(func=reverse)

    events = [event async for event in chain.astream_events("hello", version="v2")]

    # will produce the following events (run_id, and parent_ids
    # has been omitted for brevity):
    [
        {
            "data": {"input": "hello"},
            "event": "on_chain_start",
            "metadata": {},
            "name": "reverse",
            "tags": [],
        },
        {
            "data": {"chunk": "olleh"},
            "event": "on_chain_stream",
            "metadata": {},
            "name": "reverse",
            "tags": [],
        },
        {
            "data": {"output": "olleh"},
            "event": "on_chain_end",
            "metadata": {},
            "name": "reverse",
            "tags": [],
        },
    ]

Example: Dispatch Custom Event

    from langchain_core.callbacks.manager import (
        adispatch_custom_event,
    )
    from langchain_core.runnables import RunnableLambda, RunnableConfig
    import asyncio


    async def slow_thing(some_input: str, config: RunnableConfig) -> str:
        """Do something that takes a long time."""
        await asyncio.sleep(1)  # Placeholder for some slow operation
        await adispatch_custom_event(
            "progress_event",
            {"message": "Finished step 1 of 3"},
            config=config,  # Must be included for python < 3.10
        )
        await asyncio.sleep(1)  # Placeholder for some slow operation
        await adispatch_custom_event(
            "progress_event",
            {"message": "Finished step 2 of 3"},
            config=config,  # Must be included for python < 3.10
        )
        await asyncio.sleep(1)  # Placeholder for some slow operation
        return "Done"


    slow_thing = RunnableLambda(slow_thing)

    async for event in slow_thing.astream_events("some_input", version="v2"):
        print(event)

Parameters:
  • input (Any) – The input to the Runnable.

  • config (Optional[RunnableConfig]) – The config to use for the Runnable.

  • version (Literal['v1', 'v2']) – The version of the schema to use, either v2 or v1. Users should use v2. v1 is for backwards compatibility and will be deprecated in 0.4.0. No default will be assigned until the API is stabilized. Custom events will only be surfaced in v2.

  • include_names (Optional[Sequence[str]]) – Only include events from runnables with matching names.

  • include_types (Optional[Sequence[str]]) – Only include events from runnables with matching types.

  • include_tags (Optional[Sequence[str]]) – Only include events from runnables with matching tags.

  • exclude_names (Optional[Sequence[str]]) – Exclude events from runnables with matching names.

  • exclude_types (Optional[Sequence[str]]) – Exclude events from runnables with matching types.

  • exclude_tags (Optional[Sequence[str]]) – Exclude events from runnables with matching tags.

  • kwargs (Any) – Additional keyword arguments to pass to the Runnable. These will be passed to astream_log as this implementation of astream_events is built on top of astream_log.

Yields:

An async stream of StreamEvents.

Raises:

NotImplementedError – If the version is not v1 or v2.

Return type:

AsyncIterator[StreamEvent]
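
Once events have been collected, they are ordinary dictionaries and can be filtered by name or event type. The sketch below reuses the three events listed in the reverse example above; `select_events` is a hypothetical helper that mimics name-based filtering on an already collected list, not the library's filtering logic.

```python
# Events copied from the `reverse` example above (run_id and parent_ids omitted).
events = [
    {"data": {"input": "hello"}, "event": "on_chain_start",
     "metadata": {}, "name": "reverse", "tags": []},
    {"data": {"chunk": "olleh"}, "event": "on_chain_stream",
     "metadata": {}, "name": "reverse", "tags": []},
    {"data": {"output": "olleh"}, "event": "on_chain_end",
     "metadata": {}, "name": "reverse", "tags": []},
]


def select_events(events, include_names=None, exclude_names=None):
    """Keep events whose runnable name passes the include/exclude filters."""
    selected = []
    for event in events:
        if include_names is not None and event["name"] not in include_names:
            continue
        if exclude_names is not None and event["name"] in exclude_names:
            continue
        selected.append(event)
    return selected


# Rebuild the streamed text from the stream events of the "reverse" runnable.
streamed = "".join(
    event["data"]["chunk"]
    for event in select_events(events, include_names=["reverse"])
    if event["event"] == "on_chain_stream"
)
```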

batch(
    inputs: list[Input],
    config: RunnableConfig | list[RunnableConfig] | None = None,
    *,
    return_exceptions: bool = False,
    **kwargs: Any | None,
) → list[Output]

Default implementation runs invoke in parallel using a thread pool executor.

The default implementation of batch works well for IO bound runnables.

Subclasses should override this method if they can batch more efficiently; e.g., if the underlying Runnable uses an API which supports a batch mode.

Parameters:
  • inputs (list[Input])

  • config (RunnableConfig | list[RunnableConfig] | None)

  • return_exceptions (bool)

  • kwargs (Any | None)

Return type:

list[Output]
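
Running invoke in parallel on a thread pool, as described above, can be sketched with concurrent.futures. `fake_invoke` and `batch_sketch` are hypothetical stand-ins, and mapping max_concurrency onto the pool's `max_workers` is an assumption made for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_invoke(text: str) -> str:
    """Hypothetical stand-in for a blocking model call; reverses the text."""
    return text[::-1]


def batch_sketch(inputs, max_concurrency=None):
    if not inputs:
        return []
    # Assumption: max_concurrency maps onto the number of worker threads.
    with ThreadPoolExecutor(max_workers=max_concurrency) as executor:
        # executor.map returns results in input order, not completion order.
        return list(executor.map(fake_invoke, inputs))


outputs = batch_sketch(["abc", "hello"], max_concurrency=2)
```

Threads suit this workload because each call spends most of its time waiting on network IO rather than computing.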

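batch_as_completed(
    inputs: Sequence[Input],
    config: RunnableConfig | Sequence[RunnableConfig] | None = None,
    *,
    return_exceptions: bool = False,
    **kwargs: Any | None,
) → Iterator[tuple[int, Output | Exception]]

Run invoke in parallel on a list of inputs.

Yields results as they complete.

Parameters:
  • inputs (Sequence[Input])

  • config (RunnableConfig | Sequence[RunnableConfig] | None)

  • return_exceptions (bool)

  • kwargs (Any | None)

Return type: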

Iterator[tuple[int,Output | Exception]]

bind(
    **kwargs: Any,
) → Runnable[Input, Output]

Bind arguments to a Runnable, returning a new Runnable.

Useful when a Runnable in a chain requires an argument that is not in the output of the previous Runnable or included in the user input.

Parameters:

kwargs (Any) – The arguments to bind to the Runnable.

Returns:

A new Runnable with the arguments bound.

Return type:

Runnable[Input,Output]

Example:

    from langchain_ollama import ChatOllama
    from langchain_core.output_parsers import StrOutputParser

    llm = ChatOllama(model='llama2')

    # Without bind.
    chain = (
        llm
        | StrOutputParser()
    )

    chain.invoke("Repeat quoted words exactly: 'One two three four five.'")
    # Output is 'One two three four five.'

    # With bind.
    chain = (
        llm.bind(stop=["three"])
        | StrOutputParser()
    )

    chain.invoke("Repeat quoted words exactly: 'One two three four five.'")
    # Output is 'One two'

bind_functions(
    functions: Sequence[dict[str, Any] | type[BaseModel] | Callable | BaseTool],
    function_call: _FunctionCall | str | Literal['auto', 'none'] | None = None,
    **kwargs: Any,
) → Runnable[PromptValue | str | Sequence[BaseMessage | list[str] | tuple[str, str] | str | dict[str, Any]], BaseMessage]

Deprecated since version 0.2.1: Use bind_tools() instead. It will not be removed until langchain-openai==1.0.0.

Bind functions (and other objects) to this chat model.

Assumes model is compatible with OpenAI function-calling API.

NOTE: Using bind_tools is recommended instead, as the functions and function_call request parameters are officially marked as deprecated by OpenAI.

Parameters:
  • functions (Sequence[dict[str, Any] | type[BaseModel] | Callable | BaseTool]) – A list of function definitions to bind to this chat model. Can be a dictionary, pydantic model, or callable. Pydantic models and callables will be automatically converted to their schema dictionary representation.

  • function_call (_FunctionCall | str | Literal['auto', 'none'] | None) – Which function to require the model to call. Must be the name of the single provided function or "auto" to automatically determine which function to call (if any).

  • **kwargs (Any) – Any additional parameters to pass to the Runnable constructor.

Return type:

Runnable[PromptValue | str | Sequence[BaseMessage | list[str] | tuple[str, str] | str | dict[str, Any]], BaseMessage]

bind_tools(
tools:Sequence[dict[str,Any]|type|Callable|BaseTool],
*,
tool_choice:dict|str|Literal['auto','none','required','any']|bool|None=None,
strict:bool|None=None,
parallel_tool_calls:bool|None=None,
**kwargs:Any,
)Runnable[PromptValue|str|Sequence[BaseMessage|list[str]|tuple[str,str]|str|dict[str,Any]],BaseMessage]#

Bind tool-like objects to this chat model.

Assumes model is compatible with OpenAI tool-calling API.

Parameters:
  • tools (Sequence[dict[str,Any]|type |Callable |BaseTool]) – A list of tool definitions to bind to this chat model.Supports any tool definition handled bylangchain_core.utils.function_calling.convert_to_openai_tool().

  • tool_choice (dict |str |Literal['auto','none','required','any']|bool |None) –

    Which tool to require the model to call. Options are:

    • str of the form"<<tool_name>>": calls <<tool_name>> tool.

    • "auto": automatically selects a tool (including no tool).

    • "none": does not call a tool.

    • "any" or"required" orTrue: force at least one tool to be called.

    • dict of the form{"type":"function","function":{"name":<<tool_name>>}}: calls <<tool_name>> tool.

    • False orNone: no effect, default OpenAI behavior.

  • strict (bool |None) – If True, model output is guaranteed to exactly match the JSON Schema provided in the tool definition, and the input schema will be validated according to https://platform.openai.com/docs/guides/structured-outputs/supported-schemas. If False, the input schema will not be validated and model output will not be validated. If None, the strict argument will not be passed to the model.

  • parallel_tool_calls (bool |None) – Set to False to disable parallel tool use. Defaults to None (no specification, which allows parallel tool use).

  • kwargs (Any) – Any additional parameters are passed directly to bind().

Return type:

Runnable[PromptValue | str |Sequence[BaseMessage | list[str] | tuple[str, str] | str | dict[str,Any]],BaseMessage]

Changed in version 0.1.21: Support for strict argument added.
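
The tool_choice options listed in the parameters can be summarised as a small normalisation function. This is a minimal sketch of the documented behaviour, not the library's implementation: normalize_tool_choice is a hypothetical helper, and the exact payload the library sends may differ.

```python
from typing import Any, Optional, Union

ToolChoice = Union[dict, str, bool, None]


def normalize_tool_choice(tool_choice: ToolChoice) -> Optional[Any]:
    """Map a documented tool_choice value to an OpenAI-style request value.

    Hypothetical helper, for illustration only.
    """
    # False or None: no effect, the provider default applies.
    if tool_choice is None or tool_choice is False:
        return None
    # True, "any" and "required" all force at least one tool call.
    if tool_choice is True or tool_choice in ("any", "required"):
        return "required"
    # "auto" and "none" are passed through unchanged.
    if tool_choice in ("auto", "none"):
        return tool_choice
    # Any other string is treated as the name of the tool to call.
    if isinstance(tool_choice, str):
        return {"type": "function", "function": {"name": tool_choice}}
    # A dict is assumed to already be in the explicit form.
    return tool_choice


for choice in (None, True, "auto", "get_weather"):
    print(repr(choice), "->", normalize_tool_choice(choice))
```

With a configured model, the same values would be supplied as bind_tools(tools, tool_choice=...); the tool name get_weather is an invented example.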

configurable_alternatives(
which:ConfigurableField,
*,
default_key:str='default',
prefix_keys:bool=False,
**kwargs:Runnable[Input,Output]|Callable[[],Runnable[Input,Output]],
)RunnableSerializable#

Configure alternatives for Runnables that can be set at runtime.

Parameters:
  • which (ConfigurableField) – The ConfigurableField instance that will be used to select the alternative.

  • default_key (str) – The default key to use if no alternative is selected. Defaults to “default”.

  • prefix_keys (bool) – Whether to prefix the keys with the ConfigurableField id. Defaults to False.

  • **kwargs (Runnable[Input,Output]|Callable[[],Runnable[Input,Output]]) – A dictionary of keys to Runnable instances or callables that return Runnable instances.

Returns:

A new Runnable with the alternatives configured.

Return type:

RunnableSerializable

    from langchain_anthropic import ChatAnthropic
    from langchain_core.runnables.utils import ConfigurableField
    from langchain_openai import ChatOpenAI

    model = ChatAnthropic(
        model_name="claude-3-sonnet-20240229"
    ).configurable_alternatives(
        ConfigurableField(id="llm"),
        default_key="anthropic",
        openai=ChatOpenAI(),
    )

    # uses the default model ChatAnthropic
    print(model.invoke("which organization created you?").content)

    # uses ChatOpenAI
    print(
        model.with_config(configurable={"llm": "openai"})
        .invoke("which organization created you?")
        .content
    )

configurable_fields(
**kwargs:ConfigurableField|ConfigurableFieldSingleOption|ConfigurableFieldMultiOption,
)RunnableSerializable#

Configure particular Runnable fields at runtime.

Parameters:

**kwargs (ConfigurableField |ConfigurableFieldSingleOption |ConfigurableFieldMultiOption) – A dictionary of ConfigurableField instances to configure.

Returns:

A new Runnable with the fields configured.

Return type:

RunnableSerializable

    from langchain_core.runnables import ConfigurableField
    from langchain_openai import ChatOpenAI

    model = ChatOpenAI(max_tokens=20).configurable_fields(
        max_tokens=ConfigurableField(
            id="output_token_number",
            name="Max tokens in the output",
            description="The maximum number of tokens in the output",
        )
    )

    # max_tokens = 20
    print("max_tokens_20: ", model.invoke("tell me something about chess").content)

    # max_tokens = 200
    print(
        "max_tokens_200: ",
        model.with_config(configurable={"output_token_number": 200})
        .invoke("tell me something about chess")
        .content,
    )

get_num_tokens(text:str)int#

Get the number of tokens present in the text.

Useful for checking if an input fits in a model’s context window.

Parameters:

text (str) – The string input to tokenize.

Returns:

The integer number of tokens in the text.

Return type:

int
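
A token count is often used as a guard before sending a prompt. The sketch below assumes a hypothetical fits_context helper that takes the counting function as an argument, so a method such as get_num_tokens could be passed in. The whitespace counter is only a stand-in and does not match any real tokenizer.

```python
from typing import Callable


def fits_context(
    text: str,
    count_tokens: Callable[[str], int],
    context_window: int,
    reserved_for_output: int = 0,
) -> bool:
    """Return True if the prompt plus the reserved output budget fits the window."""
    return count_tokens(text) + reserved_for_output <= context_window


def whitespace_count(text: str) -> int:
    # Stand-in counter for demonstration; real tokenizers split text differently.
    return len(text.split())


prompt = "tell me something about chess"
print(fits_context(prompt, whitespace_count, context_window=8, reserved_for_output=2))
```

Passing the counter in keeps the check independent of any particular model; the window size and output budget shown are arbitrary values, not limits of a real model.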

get_num_tokens_from_messages(
messages:list[BaseMessage],
tools:Sequence[dict[str,Any]|type|Callable|BaseTool]|None=None,
)int#

Calculate num tokens for gpt-3.5-turbo and gpt-4 with tiktoken package.

Requirements: pillow must be installed to count image tokens when an image is specified as a base64 string, and both pillow and httpx must be installed when an image is specified as a URL. If these packages are not installed, image inputs are ignored in token counting.

OpenAI reference

Parameters:
  • messages (list[BaseMessage]) – The message inputs to tokenize.

  • tools (Sequence[dict[str,Any]|type |Callable |BaseTool]|None) – If provided, sequence of dict, BaseModel, function, or BaseTools to be converted to tool schemas.

Return type:

int

get_token_ids(text:str)list[int]#

Get the tokens present in the text with tiktoken package.

Parameters:

text (str)

Return type:

list[int]

invoke(
input:LanguageModelInput,
config:RunnableConfig|None=None,
*,
stop:list[str]|None=None,
**kwargs:Any,
)BaseMessage#

Transform a single input into an output.

Parameters:
  • input (LanguageModelInput) – The input to the Runnable.

  • config (Optional[RunnableConfig]) – A config to use when invoking the Runnable. The config supports standard keys like ‘tags’, ‘metadata’ for tracing purposes, ‘max_concurrency’ for controlling how much work to do in parallel, and other keys. Please refer to the RunnableConfig for more details.

  • stop (Optional[list[str]])

  • kwargs (Any)

Returns:

The output of the Runnable.

Return type:

BaseMessage
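
As a small illustration of the config parameter: RunnableConfig is a TypedDict, so a plain dict using the standard keys named in the parameter description can be passed. The tag, metadata, and stop values below are invented for the example.

```python
# A plain dict with the standard keys is accepted where a RunnableConfig is expected.
config = {
    "tags": ["translation", "demo"],            # labels attached to the traced run
    "metadata": {"request_id": "example-123"},  # free-form tracing metadata
    "max_concurrency": 4,                       # cap on parallel work
}

# With a configured model (this needs XAI_API_KEY and network access),
# the call would look like:
# llm.invoke(messages, config=config, stop=["\n\n"])
print(sorted(config))
```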

stream(
input:LanguageModelInput,
config:RunnableConfig|None=None,
*,
stop:list[str]|None=None,
**kwargs:Any,
)Iterator[BaseMessageChunk]#

Default implementation of stream, which calls invoke.

Subclasses should override this method if they support streaming output.

Parameters:
  • input (LanguageModelInput) – The input to the Runnable.

  • config (Optional[RunnableConfig]) – The config to use for the Runnable. Defaults to None.

  • kwargs (Any) – Additional keyword arguments to pass to the Runnable.

  • stop (Optional[list[str]])

Yields:

The output of the Runnable.

Return type:

Iterator[BaseMessageChunk]

with_alisteners(
*,
on_start:AsyncListener|None=None,
on_end:AsyncListener|None=None,
on_error:AsyncListener|None=None,
)Runnable[Input,Output]#

Bind async lifecycle listeners to a Runnable, returning a new Runnable.

  • on_start: Asynchronously called before the Runnable starts running.

  • on_end: Asynchronously called after the Runnable finishes running.

  • on_error: Asynchronously called if the Runnable throws an error.

The Run object contains information about the run, including its id, type, input, output, error, start_time, end_time, and any tags or metadata added to the run.

Parameters:
  • on_start (Optional[AsyncListener]) – Asynchronously called before the Runnable starts running. Defaults to None.

  • on_end (Optional[AsyncListener]) – Asynchronously called after the Runnable finishes running. Defaults to None.

  • on_error (Optional[AsyncListener]) – Asynchronously called if the Runnable throws an error. Defaults to None.

Returns:

A new Runnable with the listeners bound.

Return type:

Runnable[Input, Output]

Example:

    from langchain_core.runnables import RunnableLambda, Runnable
    from datetime import datetime, timezone
    import time
    import asyncio

    def format_t(timestamp: float) -> str:
        return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()

    async def test_runnable(time_to_sleep: int):
        print(f"Runnable[{time_to_sleep}s]: starts at {format_t(time.time())}")
        await asyncio.sleep(time_to_sleep)
        print(f"Runnable[{time_to_sleep}s]: ends at {format_t(time.time())}")

    async def fn_start(run_obj: Runnable):
        print(f"on start callback starts at {format_t(time.time())}")
        await asyncio.sleep(3)
        print(f"on start callback ends at {format_t(time.time())}")

    async def fn_end(run_obj: Runnable):
        print(f"on end callback starts at {format_t(time.time())}")
        await asyncio.sleep(2)
        print(f"on end callback ends at {format_t(time.time())}")

    runnable = RunnableLambda(test_runnable).with_alisteners(
        on_start=fn_start,
        on_end=fn_end
    )

    async def concurrent_runs():
        await asyncio.gather(runnable.ainvoke(2), runnable.ainvoke(3))

    asyncio.run(concurrent_runs())

Result:

    on start callback starts at 2025-03-01T07:05:22.875378+00:00
    on start callback starts at 2025-03-01T07:05:22.875495+00:00
    on start callback ends at 2025-03-01T07:05:25.878862+00:00
    on start callback ends at 2025-03-01T07:05:25.878947+00:00
    Runnable[2s]: starts at 2025-03-01T07:05:25.879392+00:00
    Runnable[3s]: starts at 2025-03-01T07:05:25.879804+00:00
    Runnable[2s]: ends at 2025-03-01T07:05:27.881998+00:00
    on end callback starts at 2025-03-01T07:05:27.882360+00:00
    Runnable[3s]: ends at 2025-03-01T07:05:28.881737+00:00
    on end callback starts at 2025-03-01T07:05:28.882428+00:00
    on end callback ends at 2025-03-01T07:05:29.883893+00:00
    on end callback ends at 2025-03-01T07:05:30.884831+00:00

with_config(
config:RunnableConfig|None=None,
**kwargs:Any,
)Runnable[Input,Output]#

Bind config to a Runnable, returning a new Runnable.

Parameters:
  • config (RunnableConfig |None) – The config to bind to the Runnable.

  • kwargs (Any) – Additional keyword arguments to pass to the Runnable.

Returns:

A new Runnable with the config bound.

Return type:

Runnable[Input,Output]

with_fallbacks(fallbacks:Sequence[Runnable[Input,Output]],*,exceptions_to_handle:tuple[type[BaseException],...]=(<class'Exception'>,),exception_key:Optional[str]=None)RunnableWithFallbacksT[Input,Output]#

Add fallbacks to a Runnable, returning a new Runnable.

The new Runnable will try the original Runnable, and then each fallback in order, upon failures.

Parameters:
  • fallbacks (Sequence[Runnable[Input,Output]]) – A sequence of runnables to try if the original Runnable fails.

  • exceptions_to_handle (tuple[type[BaseException],...]) – A tuple of exception types to handle. Defaults to (Exception,).

  • exception_key (Optional[str]) – If string is specified then handled exceptions will be passed to fallbacks as part of the input under the specified key. If None, exceptions will not be passed to fallbacks. If used, the base Runnable and its fallbacks must accept a dictionary as input. Defaults to None.

Returns:

A new Runnable that will try the original Runnable, and then each fallback in order, upon failures.

Return type:

RunnableWithFallbacksT[Input, Output]

Example

    from typing import Iterator

    from langchain_core.runnables import RunnableGenerator

    def _generate_immediate_error(input: Iterator) -> Iterator[str]:
        raise ValueError()
        yield ""

    def _generate(input: Iterator) -> Iterator[str]:
        yield from "foo bar"

    runnable = RunnableGenerator(_generate_immediate_error).with_fallbacks(
        [RunnableGenerator(_generate)]
    )
    print(''.join(runnable.stream({})))  # foo bar

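
The fallback order and the role of exception_key can be sketched in plain Python. This is an illustration of the documented semantics rather than the library's code: run_with_fallbacks is a hypothetical helper, and re-raising the first error when every candidate fails is an assumption of this sketch.

```python
from typing import Any, Callable, Optional, Sequence, Tuple, Type


def run_with_fallbacks(
    primary: Callable[[Any], Any],
    fallbacks: Sequence[Callable[[Any], Any]],
    value: Any,
    exceptions_to_handle: Tuple[Type[BaseException], ...] = (Exception,),
    exception_key: Optional[str] = None,
) -> Any:
    """Try primary, then each fallback in order; re-raise the first error if all fail."""
    first_error: Optional[BaseException] = None
    last_error: Optional[BaseException] = None
    for candidate in (primary, *fallbacks):
        current = value
        if exception_key is not None and last_error is not None:
            # The input must be a dict when exception_key is used.
            current = {**value, exception_key: last_error}
        try:
            return candidate(current)
        except exceptions_to_handle as err:
            if first_error is None:
                first_error = err
            last_error = err
    assert first_error is not None
    raise first_error


def flaky(payload: dict) -> str:
    raise ValueError("primary failed")


def backup(payload: dict) -> str:
    return f"recovered from: {payload['error']}"


result = run_with_fallbacks(flaky, [backup], {"text": "hi"}, exception_key="error")
print(result)
```

Exceptions outside exceptions_to_handle are not caught, so they propagate immediately without trying later fallbacks.
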

with_listeners(
*,
on_start:Callable[[Run],None]|Callable[[Run,RunnableConfig],None]|None=None,
on_end:Callable[[Run],None]|Callable[[Run,RunnableConfig],None]|None=None,
on_error:Callable[[Run],None]|Callable[[Run,RunnableConfig],None]|None=None,
)Runnable[Input,Output]#

Bind lifecycle listeners to a Runnable, returning a new Runnable.

  • on_start: Called before the Runnable starts running, with the Run object.

  • on_end: Called after the Runnable finishes running, with the Run object.

  • on_error: Called if the Runnable throws an error, with the Run object.

The Run object contains information about the run, including its id, type, input, output, error, start_time, end_time, and any tags or metadata added to the run.

Parameters:
  • on_start (Optional[Union[Callable[[Run],None],Callable[[Run,RunnableConfig],None]]]) – Called before the Runnable starts running. Defaults to None.

  • on_end (Optional[Union[Callable[[Run],None],Callable[[Run,RunnableConfig],None]]]) – Called after the Runnable finishes running. Defaults to None.

  • on_error (Optional[Union[Callable[[Run],None],Callable[[Run,RunnableConfig],None]]]) – Called if the Runnable throws an error. Defaults to None.

Returns:

A new Runnable with the listeners bound.

Return type:

Runnable[Input, Output]

Example:

    from langchain_core.runnables import RunnableLambda
    from langchain_core.tracers.schemas import Run

    import time

    def test_runnable(time_to_sleep: int):
        time.sleep(time_to_sleep)

    def fn_start(run_obj: Run):
        print("start_time:", run_obj.start_time)

    def fn_end(run_obj: Run):
        print("end_time:", run_obj.end_time)

    chain = RunnableLambda(test_runnable).with_listeners(
        on_start=fn_start,
        on_end=fn_end
    )
    chain.invoke(2)

with_retry(*,retry_if_exception_type:tuple[type[BaseException],...]=(<class'Exception'>,),wait_exponential_jitter:bool=True,exponential_jitter_params:Optional[ExponentialJitterParams]=None,stop_after_attempt:int=3)Runnable[Input,Output]#

Create a new Runnable that retries the original Runnable on exceptions.

Parameters:
  • retry_if_exception_type (tuple[type[BaseException],...]) – A tuple of exception types to retry on. Defaults to (Exception,).

  • wait_exponential_jitter (bool) – Whether to add jitter to the wait time between retries. Defaults to True.

  • stop_after_attempt (int) – The maximum number of attempts to make before giving up. Defaults to 3.

  • exponential_jitter_params (Optional[ExponentialJitterParams]) – Parameters for tenacity.wait_exponential_jitter. Namely: initial, max, exp_base, and jitter (all float values).

Returns:

A new Runnable that retries the original Runnable on exceptions.

Return type:

Runnable[Input, Output]

Example:

    from langchain_core.runnables import RunnableLambda

    count = 0

    def _lambda(x: int) -> None:
        global count
        count = count + 1
        if x == 1:
            raise ValueError("x is 1")
        else:
            pass

    runnable = RunnableLambda(_lambda)
    try:
        runnable.with_retry(
            stop_after_attempt=2,
            retry_if_exception_type=(ValueError,),
        ).invoke(1)
    except ValueError:
        pass

    assert (count == 2)

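
To show how the four exponential_jitter_params values interact, here is a rough sketch of an exponential wait with jitter. It is assumed to resemble the strategy those parameters configure. jittered_wait is a hypothetical helper, max_wait stands in for the max parameter (renamed to avoid shadowing the builtin), and the default values are assumptions.

```python
import random


def jittered_wait(
    attempt: int,
    initial: float = 1.0,
    max_wait: float = 60.0,
    exp_base: float = 2.0,
    jitter: float = 1.0,
) -> float:
    """Seconds to wait before retrying after the given 1-based attempt number."""
    # Exponential growth: initial, initial * exp_base, initial * exp_base**2, ...
    backoff = initial * exp_base ** (attempt - 1)
    # Add a random offset in [0, jitter] and cap the total at max_wait.
    return min(backoff + random.uniform(0, jitter), max_wait)


for attempt in range(1, 5):
    print(attempt, round(jittered_wait(attempt), 2))
```

The random offset spreads out retries from concurrent callers so they do not all hit the API at the same moment, while the cap keeps late attempts from waiting arbitrarily long.
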
with_structured_output(
schema:dict[str,Any]|type[_BM]|type|None=None,
*,
method:Literal['function_calling','json_mode','json_schema']='function_calling',
include_raw:bool=False,
strict:bool|None=None,
**kwargs:Any,
)Runnable[PromptValue|str|Sequence[BaseMessage|list[str]|tuple[str,str]|str|dict[str,Any]],dict|_BM][source]#

Model wrapper that returns outputs formatted to match the given schema.

Parameters:
  • schema (dict[str,Any]|type[_BM]|type |None) –

    The output schema. Can be passed in as:

    • an OpenAI function/tool schema,

    • a JSON Schema,

    • a TypedDict class (support added in 0.1.20),

    • or a Pydantic class.

    If schema is a Pydantic class then the model output will be a Pydantic instance of that class, and the model-generated fields will be validated by the Pydantic class. Otherwise the model output will be a dict and will not be validated. See langchain_core.utils.function_calling.convert_to_openai_tool() for more on how to properly specify types and descriptions of schema fields when specifying a Pydantic or TypedDict class.

  • method (Literal['function_calling','json_mode','json_schema']) –

    The method for steering model generation, one of 'function_calling' (the default), 'json_mode', or 'json_schema'.

  • include_raw (bool) – If False then only the parsed structured output is returned. If an error occurs during model output parsing it will be raised. If True then both the raw model response (a BaseMessage) and the parsed model response will be returned. If an error occurs during output parsing it will be caught and returned as well. The final output is always a dict with keys 'raw', 'parsed', and 'parsing_error'.

  • strict (bool |None) –

    • True:

      Model output is guaranteed to exactly match the schema. The input schema will also be validated according to this schema.

    • False:

      Input schema will not be validated and model output will not be validated.

    • None:

      strict argument will not be passed to the model.

  • kwargs (Any) – Additional keyword args aren’t supported.

Returns:

A Runnable that takes same inputs as a langchain_core.language_models.chat.BaseChatModel.

If include_raw is False and schema is a Pydantic class, Runnable outputs an instance of schema (i.e., a Pydantic object). Otherwise, if include_raw is False then Runnable outputs a dict.

If include_raw is True, then Runnable outputs a dict with keys:

  • 'raw': BaseMessage

  • 'parsed': None if there was a parsing error, otherwise the type depends on the schema as described above.

  • 'parsing_error': Optional[BaseException]

Return type:

Runnable[PromptValue | str |Sequence[BaseMessage | list[str] | tuple[str, str] | str | dict[str,Any]], dict |_BM]
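
The three-key result described for include_raw=True can be mimicked with the standard library. This is a sketch of the output shape only: parse_with_raw is a hypothetical helper, the raw value is a plain string rather than a BaseMessage, and JSON parsing stands in for schema-based parsing.

```python
import json
from typing import Any, Dict


def parse_with_raw(raw_text: str) -> Dict[str, Any]:
    """Return a dict shaped like the include_raw=True output."""
    try:
        parsed = json.loads(raw_text)
        error = None
    except json.JSONDecodeError as exc:
        # Parsing errors are captured and returned instead of raised.
        parsed, error = None, exc
    return {"raw": raw_text, "parsed": parsed, "parsing_error": error}


ok = parse_with_raw('{"setup": "Why?", "punchline": "Because."}')
bad = parse_with_raw("not json")

print(ok["parsed"], ok["parsing_error"])
print(bad["parsed"], type(bad["parsing_error"]).__name__)
```

Callers can therefore branch on parsing_error instead of wrapping the call in try/except, and still inspect the raw response when parsing fails.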

with_types(
*,
input_type:type[Input]|None=None,
output_type:type[Output]|None=None,
)Runnable[Input,Output]#

Bind input and output types to a Runnable, returning a new Runnable.

Parameters:
  • input_type (type[Input]|None) – The input type to bind to the Runnable. Defaults to None.

  • output_type (type[Output]|None) – The output type to bind to the Runnable. Defaults to None.

Returns:

A new Runnable with the types bound.

Return type:

Runnable[Input,Output]
