DeepInfra

DeepInfra is a serverless inference service that provides access to a variety of LLMs and embedding models. This notebook goes over how to use LangChain with DeepInfra for chat models.

Set the Environment API Key

Make sure to get your API key from DeepInfra. You have to log in and get a new token.

You are given 1 hour of free serverless GPU compute to test different models (see here). You can print your token with `deepctl auth token`.

# get a new token: https://deepinfra.com/login?from=%2Fdash

import os
from getpass import getpass

from langchain_community.chat_models import ChatDeepInfra
from langchain_core.messages import HumanMessage

DEEPINFRA_API_TOKEN = getpass()

# or pass deepinfra_api_token parameter to the ChatDeepInfra constructor
os.environ["DEEPINFRA_API_TOKEN"] = DEEPINFRA_API_TOKEN

chat = ChatDeepInfra(model="meta-llama/Llama-2-7b-chat-hf")

messages = [
    HumanMessage(
        content="Translate this sentence from English to French. I love programming."
    )
]
chat.invoke(messages)

API Reference: ChatDeepInfra | HumanMessage
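
The token handling above can be wrapped in a small helper so that the interactive prompt only appears when the environment variable is not already set. The following is a minimal sketch using only the standard library; the helper name `resolve_deepinfra_token` and its `env` and `prompt` parameters are hypothetical and are not part of LangChain or DeepInfra.

```python
import os
from getpass import getpass


def resolve_deepinfra_token(env=None, prompt=getpass):
    """Return the DeepInfra token, prompting only if it is not already set.

    `env` and `prompt` are injectable (a hypothetical design choice) so the
    helper can be exercised without touching the real environment or terminal.
    """
    env = os.environ if env is None else env
    token = env.get("DEEPINFRA_API_TOKEN", "").strip()
    if not token:
        token = prompt("DeepInfra API token: ").strip()
        if not token:
            raise ValueError("A DeepInfra API token is required.")
        # Store the token so it can be read from the environment later on.
        env["DEEPINFRA_API_TOKEN"] = token
    return token
```

Calling `resolve_deepinfra_token()` before constructing `ChatDeepInfra` would then take the place of the bare `getpass()` call.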

`ChatDeepInfra` also supports async and streaming functionality:

from langchain_core.callbacks import StreamingStdOutCallbackHandler

API Reference: StreamingStdOutCallbackHandler

await chat.agenerate([messages])

chat = ChatDeepInfra(
    streaming=True,
    verbose=True,
    callbacks=[StreamingStdOutCallbackHandler()],
)
chat.invoke(messages)
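
As an alternative to a callback handler, LangChain chat models expose a `stream` method that yields message chunks. The sketch below collects those chunks into a single string. It assumes each chunk carries a string `content` attribute, and the helper name `collect_stream` is hypothetical. It accepts any object with a `stream` method, so it does not import `ChatDeepInfra` itself.

```python
def collect_stream(chat_model, messages, on_chunk=None):
    """Stream a response and return the concatenated text.

    `chat_model` is assumed to expose `stream(messages)` yielding chunks
    with a string `.content`, as LangChain chat models do.
    """
    parts = []
    for chunk in chat_model.stream(messages):
        text = chunk.content
        if not text:
            continue  # skip empty chunks
        if on_chunk is not None:
            on_chunk(text)  # e.g. print partial output as it arrives
        parts.append(text)
    return "".join(parts)
```

With the `chat` object above, `collect_stream(chat, messages, on_chunk=lambda t: print(t, end=""))` would print partial output as it arrives and return the full reply once the stream ends.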

Tool Calling

DeepInfra currently supports tool calling only through `invoke` and its async counterpart, `ainvoke`.

For a complete list of models that support tool calling, please refer to our tool calling documentation.

import asyncio

from dotenv import find_dotenv, load_dotenv
from langchain_community.chat_models import ChatDeepInfra
from langchain_core.messages import HumanMessage
from langchain_core.tools import tool
from pydantic import BaseModel

model_name = "meta-llama/Meta-Llama-3-70B-Instruct"

_ = load_dotenv(find_dotenv())


# LangChain tool
@tool
def foo(something):
    """
    Called when foo
    """
    pass


# Pydantic class
class Bar(BaseModel):
    """
    Called when Bar
    """

    pass


llm = ChatDeepInfra(model=model_name)
tools = [foo, Bar]
llm_with_tools = llm.bind_tools(tools)
messages = [
    HumanMessage("Foo and bar, please."),
]

response = llm_with_tools.invoke(messages)
print(response.tool_calls)
# [{'name': 'foo', 'args': {'something': None}, 'id': 'call_Mi4N4wAtW89OlbizFE1aDxDj'}, {'name': 'Bar', 'args': {}, 'id': 'call_daiE0mW454j2O1KVbmET4s2r'}]


async def call_ainvoke():
    result = await llm_with_tools.ainvoke(messages)
    print(result.tool_calls)


# Async call
asyncio.run(call_ainvoke())
# [{'name': 'foo', 'args': {'something': None}, 'id': 'call_ZH7FetmgSot4LHcMU6CEb8tI'}, {'name': 'Bar', 'args': {}, 'id': 'call_2MQhDifAJVoijZEvH8PeFSVB'}]

API Reference: ChatDeepInfra | HumanMessage | tool
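
The `tool_calls` list printed above only describes which tools the model wants to run; executing them is left to the caller. One way to do that is a small dispatcher keyed by tool name. This is a sketch under assumptions: the `dispatch_tool_calls` helper and the handler functions are hypothetical, and the input follows the `name`/`args`/`id` dictionary shape shown in the printed output.

```python
def dispatch_tool_calls(tool_calls, handlers):
    """Run each requested tool and return results keyed by tool-call id."""
    results = {}
    for call in tool_calls:
        handler = handlers.get(call["name"])
        if handler is None:
            # The model asked for an unknown tool; record it instead of failing.
            results[call["id"]] = {"error": f"unknown tool: {call['name']}"}
            continue
        results[call["id"]] = {"output": handler(**call["args"])}
    return results


# Hypothetical handlers standing in for real tool implementations.
handlers = {
    "foo": lambda something=None: f"foo received {something!r}",
    "Bar": lambda: "Bar called",
}

# Same shape as the `response.tool_calls` output, with shortened ids.
sample_calls = [
    {"name": "foo", "args": {"something": None}, "id": "call_1"},
    {"name": "Bar", "args": {}, "id": "call_2"},
]
results = dispatch_tool_calls(sample_calls, handlers)
```

Keying the results by call id makes it straightforward to send each output back to the model afterwards, for example as a `ToolMessage` with a matching `tool_call_id`.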

Related

Chat model conceptual guide
Chat model how-to guides
