Async programming with LangChain

LLM-based applications often involve a lot of I/O-bound operations, such as making API calls to language models, databases, or other services. Asynchronous programming (or async programming) is a paradigm that allows a program to perform multiple tasks concurrently without blocking the execution of other tasks, improving efficiency and responsiveness, particularly in I/O-bound operations.

note

You are expected to be familiar with asynchronous programming in Python before reading this guide. If you are not, please find appropriate resources online first. This guide focuses specifically on what you need to know to work with LangChain in an asynchronous context.

LangChain asynchronous APIs

Many LangChain APIs are designed to be asynchronous, allowing you to build efficient and responsive applications.

Typically, any method that may perform I/O operations (e.g., making API calls, reading files) will have an asynchronous counterpart.

In LangChain, async implementations are located in the same classes as their synchronous counterparts, with the asynchronous methods having an "a" prefix. For example, the synchronous `invoke` method has an asynchronous counterpart called `ainvoke`.

Many components of LangChain implement the Runnable Interface, which includes support for asynchronous execution. This means that you can run Runnables asynchronously using the `await` keyword in Python.

```python
await some_runnable.ainvoke(some_input)
```
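
For instance, here is a minimal, self-contained sketch. It uses a `RunnableLambda` as a stand-in so it runs without any model provider; awaiting `ainvoke` on a chat model or any other Runnable works the same way, and `asyncio.gather` lets several calls run concurrently:

```python
import asyncio

from langchain_core.runnables import RunnableLambda

# Stand-in Runnable; a real chat model's ainvoke is awaited the same way.
some_runnable = RunnableLambda(lambda x: x.upper())


async def main():
    # Await a single asynchronous call.
    result = await some_runnable.ainvoke("hello")
    print(result)  # HELLO

    # Run several calls concurrently on the event loop.
    results = await asyncio.gather(
        some_runnable.ainvoke("foo"),
        some_runnable.ainvoke("bar"),
    )
    print(results)  # ['FOO', 'BAR']


asyncio.run(main())
```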

Other components, like embedding models and vector stores, that do not implement the Runnable Interface usually still follow the same rule and include asynchronous versions of their methods in the same class with an "a" prefix.

For example,

```python
await some_vectorstore.aadd_documents(documents)
```
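
Here is a hedged, self-contained sketch using the in-memory vector store and deterministic fake embeddings from `langchain_core` as illustrative stand-ins, so it runs without external services; vector store integrations expose the same "a"-prefixed methods:

```python
import asyncio

from langchain_core.documents import Document
from langchain_core.embeddings import DeterministicFakeEmbedding
from langchain_core.vectorstores import InMemoryVectorStore


async def main():
    store = InMemoryVectorStore(DeterministicFakeEmbedding(size=256))
    # Asynchronous counterpart of add_documents.
    await store.aadd_documents([Document(page_content="hello world")])
    # Asynchronous counterpart of similarity_search.
    results = await store.asimilarity_search("hello", k=1)
    print(results[0].page_content)


asyncio.run(main())
```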

Runnables created using the LangChain Expression Language (LCEL) can also be run asynchronously, as they implement the full Runnable Interface.
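
For example, composing two runnables with `|` (here, illustrative `RunnableLambda`s) yields a sequence whose async methods such as `ainvoke` and `abatch` are available without extra work:

```python
import asyncio

from langchain_core.runnables import RunnableLambda

# Composition with `|` produces a RunnableSequence, which also implements
# the full Runnable Interface, including ainvoke/abatch/astream.
chain = RunnableLambda(lambda x: x + 1) | RunnableLambda(lambda x: x * 2)


async def main():
    print(await chain.ainvoke(3))  # 8
    print(await chain.abatch([1, 2, 3]))  # [4, 6, 8]


asyncio.run(main())
```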

For more information, please review the API reference for the specific component you are using.

Delegation to sync methods

Most popular LangChain integrations implement asynchronous support for their APIs. For example, the `ainvoke` method of many chat model implementations uses `httpx.AsyncClient` to make asynchronous HTTP requests to the model provider's API.

When an asynchronous implementation is not available, LangChain tries to provide a default implementation, even if it incurs a slight overhead.

By default, LangChain delegates the execution of unimplemented asynchronous methods to their synchronous counterparts. LangChain almost always assumes that the synchronous method should be treated as a blocking operation and runs it in a separate thread, using the `asyncio.loop.run_in_executor` functionality provided by the `asyncio` library. LangChain uses the default executor provided by `asyncio`, which lazily initializes a thread pool executor with a default number of threads that is then reused within the given event loop. While this strategy incurs a slight overhead due to context switching between threads, it guarantees that every asynchronous method has a default implementation that works out of the box.
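
As an illustration, here is a simplified sketch of this delegation strategy (not LangChain's actual source); the class and method names are hypothetical:

```python
import asyncio
from functools import partial


class MyComponent:
    def invoke(self, value: str) -> str:
        # Pretend this blocks on I/O (e.g., a synchronous HTTP request).
        return value.upper()

    async def ainvoke(self, value: str) -> str:
        loop = asyncio.get_running_loop()
        # Passing None selects the event loop's default ThreadPoolExecutor,
        # which is lazily initialized and reused across calls.
        return await loop.run_in_executor(None, partial(self.invoke, value))


print(asyncio.run(MyComponent().ainvoke("hello")))  # HELLO, without blocking the loop
```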

Performance

Async code in LangChain should generally perform well with minimal overhead out of the box and is unlikely to be a bottleneck in most applications.

The two main sources of overhead are:

  1. The cost of context switching between threads when delegating to synchronous methods. This can be addressed by providing a native asynchronous implementation.
  2. In LCEL, any "cheap functions" that appear as part of a chain will either be scheduled as tasks on the event loop (if they are async) or run in a separate thread (if they are sync), rather than being run inline.

The latency overhead you should expect from these sources is between tens of microseconds and a few milliseconds.

A more common source of performance issues is users accidentally blocking the event loop by calling synchronous code in an async context (e.g., calling `invoke` rather than `ainvoke`).
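
To see the difference, the hedged sketch below uses an illustrative `RunnableLambda` that simulates one second of blocking I/O. A concurrent `tick` task would stall if the runnable were called with `invoke`, but keeps running when `ainvoke` hands the blocking work to a worker thread:

```python
import asyncio
import time

from langchain_core.runnables import RunnableLambda

# Hypothetical runnable that simulates one second of blocking I/O.
slow = RunnableLambda(lambda x: (time.sleep(1), x.upper())[1])


async def tick():
    # A concurrent task that should keep making progress.
    for _ in range(3):
        await asyncio.sleep(0.3)
        print("event loop is responsive")


async def main():
    # Bad: calling slow.invoke("hi") here would block the event loop,
    # and tick() would stall until the full second had elapsed.
    # Good: ainvoke runs the blocking call in a worker thread.
    result, _ = await asyncio.gather(slow.ainvoke("hi"), tick())
    print(result)  # HI


asyncio.run(main())
```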

Compatibility

LangChain is only compatible with the `asyncio` library, which is distributed as part of the Python standard library. It will not work with other async libraries like `trio` or `curio`.

In Python 3.9 and 3.10, `asyncio` tasks did not accept a `context` parameter. Due to this limitation, LangChain cannot automatically propagate the `RunnableConfig` down the call chain in certain scenarios.

If you are experiencing issues with streaming, callbacks or tracing in async code and are using Python 3.9 or 3.10, this is a likely cause.

Please read Propagation of RunnableConfig for more details on how to manually propagate the `RunnableConfig` down the call chain (or upgrade to Python 3.11, where this is no longer an issue).
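
On Python 3.9 and 3.10, a hedged sketch of manual propagation looks like the following; the `inner` and `outer` names are illustrative. The wrapped function accepts the `RunnableConfig` and passes it explicitly to nested calls, rather than relying on automatic propagation:

```python
import asyncio

from langchain_core.runnables import RunnableConfig, RunnableLambda

inner = RunnableLambda(lambda x: x * 2)


# Accept the config in your own function and hand it to nested calls
# explicitly, so callbacks and tracing stay attached to the run.
async def outer(value: int, config: RunnableConfig) -> int:
    return await inner.ainvoke(value, config=config)


chain = RunnableLambda(outer)

print(asyncio.run(chain.ainvoke(3, config={"tags": ["manual-propagation"]})))  # 6
```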

How to use in IPython and Jupyter notebooks

As of IPython 7.0, IPython supports asynchronous REPLs. This means that you can use the `await` keyword in the IPython REPL and in Jupyter notebooks without any additional setup. For more information, see the IPython blog post.
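
For example, the following is valid at the top level of a notebook cell (illustrative `RunnableLambda`; any Runnable works the same way):

```python
# In an IPython >= 7.0 shell or a Jupyter notebook cell, top-level await
# works directly; no asyncio.run() wrapper is needed.
from langchain_core.runnables import RunnableLambda

some_runnable = RunnableLambda(lambda x: x + 1)
await some_runnable.ainvoke(1)  # returns 2
```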

