ChatHuggingFace

This will help you get started with `langchain_huggingface` chat models. For detailed documentation of all `ChatHuggingFace` features and configurations head to the API reference. For a list of models supported by Hugging Face check out this page.
Overview
Integration details
| Class | Package | Local | Serializable | JS support |
| --- | --- | --- | --- | --- |
| ChatHuggingFace | langchain-huggingface | ✅ | beta | ❌ |
Model features
| Tool calling | Structured output | JSON mode | Image input | Audio input | Video input | Token-level streaming | Native async | Token usage | Logprobs |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ |
Setup
To access Hugging Face models you'll need to create a Hugging Face account, get an API key, and install the `langchain-huggingface` integration package.
Credentials
Generate a Hugging Face Access Token and store it as an environment variable: `HUGGINGFACEHUB_API_TOKEN`.
```python
import getpass
import os

if not os.getenv("HUGGINGFACEHUB_API_TOKEN"):
    os.environ["HUGGINGFACEHUB_API_TOKEN"] = getpass.getpass("Enter your token: ")
```
Installation
```python
%pip install --upgrade --quiet langchain-huggingface text-generation transformers google-search-results numexpr langchainhub sentencepiece jinja2 bitsandbytes accelerate
```
Note: you may need to restart the kernel to use updated packages.
Instantiation
You can instantiate a `ChatHuggingFace` model in two different ways, either from a `HuggingFaceEndpoint` or from a `HuggingFacePipeline`.
HuggingFaceEndpoint
```python
from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint

llm = HuggingFaceEndpoint(
    repo_id="deepseek-ai/DeepSeek-R1-0528",
    task="text-generation",
    max_new_tokens=512,
    do_sample=False,
    repetition_penalty=1.03,
    provider="auto",  # let Hugging Face choose the best provider for you
)

chat_model = ChatHuggingFace(llm=llm)
```
```
The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: fineGrained).
Your token has been saved to /Users/isaachershenson/.cache/huggingface/token
Login successful
```
Now let's take advantage of Inference Providers to run the model on specific third-party providers:
```python
llm = HuggingFaceEndpoint(
    repo_id="deepseek-ai/DeepSeek-R1-0528",
    task="text-generation",
    provider="hyperbolic",  # set your provider here
    # provider="nebius",
    # provider="together",
)

chat_model = ChatHuggingFace(llm=llm)
```
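If you're unsure which providers serve a given model, one way to check is to query the Hub. This is a sketch under the assumption that your `huggingface_hub` version supports the `inferenceProviderMapping` value for the `expand` parameter; the exact attribute name may differ across versions:

```python
from huggingface_hub import model_info

# Assumption: recent huggingface_hub releases expose the provider mapping
# through `expand`; check your installed version if this raises an error.
info = model_info("deepseek-ai/DeepSeek-R1-0528", expand=["inferenceProviderMapping"])
print(info.inference_provider_mapping)
```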
HuggingFacePipeline
```python
from langchain_huggingface import ChatHuggingFace, HuggingFacePipeline

llm = HuggingFacePipeline.from_model_id(
    model_id="HuggingFaceH4/zephyr-7b-beta",
    task="text-generation",
    pipeline_kwargs=dict(
        max_new_tokens=512,
        do_sample=False,
        repetition_penalty=1.03,
    ),
)

chat_model = ChatHuggingFace(llm=llm)
```
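Loading the model this way runs it on CPU by default. As a minimal sketch (assuming a CUDA device is available), you can place the pipeline on a GPU with the `device` argument:

```python
from langchain_huggingface import ChatHuggingFace, HuggingFacePipeline

# Sketch: run the pipeline on the first CUDA device (device=0).
# Alternatively, with accelerate installed, device_map="auto" can shard
# large models across the available devices.
gpu_llm = HuggingFacePipeline.from_model_id(
    model_id="HuggingFaceH4/zephyr-7b-beta",
    task="text-generation",
    device=0,
    pipeline_kwargs=dict(max_new_tokens=512),
)

chat_model = ChatHuggingFace(llm=gpu_llm)
```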
Instantiating with Quantization

To run a quantized version of your model, you can specify a `bitsandbytes` quantization config as follows:
```python
from transformers import BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype="float16",
    bnb_4bit_use_double_quant=True,
)
```
and pass it to the `HuggingFacePipeline` as a part of its `model_kwargs`:
```python
llm = HuggingFacePipeline.from_model_id(
    model_id="HuggingFaceH4/zephyr-7b-beta",
    task="text-generation",
    pipeline_kwargs=dict(
        max_new_tokens=512,
        do_sample=False,
        repetition_penalty=1.03,
        return_full_text=False,
    ),
    model_kwargs={"quantization_config": quantization_config},
)

chat_model = ChatHuggingFace(llm=llm)
```
Invocation
```python
from langchain_core.messages import (
    HumanMessage,
    SystemMessage,
)

messages = [
    SystemMessage(content="You're a helpful assistant"),
    HumanMessage(
        content="What happens when an unstoppable force meets an immovable object?"
    ),
]

ai_msg = chat_model.invoke(messages)
print(ai_msg.content)
```
```
According to the popular phrase and hypothetical scenario, when an unstoppable force meets an immovable object, a paradoxical situation arises as both forces are seemingly contradictory. On one hand, an unstoppable force is an entity that cannot be stopped or prevented from moving forward, while on the other hand, an immovable object is something that cannot be moved or displaced from its position.

In this scenario, it is un
```
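Because `ChatHuggingFace` implements the standard Runnable interface, it also supports tool calling with models and providers that expose it. A minimal sketch, reusing the `chat_model` from above; the `GetWeather` schema is a made-up example, not from the original page:

```python
from pydantic import BaseModel, Field


class GetWeather(BaseModel):
    """Get the current weather in a given location."""

    location: str = Field(..., description="The city to look up, e.g. Paris")


# bind_tools attaches the tool schema to requests; whether the model
# actually emits tool calls depends on the underlying model and provider.
chat_with_tools = chat_model.bind_tools([GetWeather])
ai_msg = chat_with_tools.invoke("What is the weather like in Paris?")
print(ai_msg.tool_calls)
```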
API reference

For detailed documentation of all `ChatHuggingFace` features and configurations head to the API reference: https://python.langchain.com/api_reference/huggingface/chat_models/langchain_huggingface.chat_models.huggingface.ChatHuggingFace.html
Related
- Chat model conceptual guide
- Chat model how-to guides