Text Embeddings Inference

Hugging Face Text Embeddings Inference (TEI) is a toolkit for deploying and serving open-source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5.

To use it within LangChain, first install huggingface-hub.

%pip install --upgrade huggingface-hub

Then expose an embedding model using TEI. For instance, using Docker, you can serve BAAI/bge-large-en-v1.5 as follows:

model=BAAI/bge-large-en-v1.5
revision=refs/pr/5
volume=$PWD/data # share a volume with the Docker container to avoid downloading weights every run

docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:0.6 --model-id $model --revision $revision

Specifics on Docker usage might vary with the underlying hardware. For example, to serve the model on Intel Gaudi/Gaudi2 hardware, refer to the tei-gaudi repository for the relevant docker run command.
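Before connecting LangChain, it can help to confirm that the server answers requests. The following is a minimal sketch, assuming the container started with the docker run command above is reachable at http://localhost:8080 and using TEI's /embed route, which takes a JSON body with an inputs field and returns one vector per input. The helper names build_embed_request and embed are hypothetical, chosen for illustration.

```python
import json
import urllib.request

# Assumption: the TEI container is published on port 8080 of this machine.
TEI_URL = "http://localhost:8080"


def build_embed_request(texts, base_url=TEI_URL):
    """Build a POST request for TEI's /embed route (hypothetical helper)."""
    payload = json.dumps({"inputs": texts}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/embed",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts, base_url=TEI_URL, timeout=30):
    """Send the texts to the server and return the decoded JSON response."""
    request = build_embed_request(texts, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, embed(["What is deep learning?"]) should return a list containing a single embedding vector. A connection error at this point usually means the container is still loading the model or the port mapping differs from the one assumed here.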

Finally, instantiate the client and embed your texts.

from langchain_huggingface.embeddings import HuggingFaceEndpointEmbeddings

API Reference: HuggingFaceEndpointEmbeddings

embeddings = HuggingFaceEndpointEmbeddings(model="http://localhost:8080")

text = "What is deep learning?"

query_result = embeddings.embed_query(text)
query_result[:3]

[0.018113142, 0.00302585, -0.049911194]

doc_result = embeddings.embed_documents([text])

doc_result[0][:3]

[0.018113142, 0.00302585, -0.049911194]
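Both calls return the same leading values here because the same text was embedded as a query and as a document. In practice the two results are usually compared to find the documents closest to a query. The sketch below assumes cosine similarity as the comparison metric, which this guide does not prescribe. The function names are hypothetical, and the second document vector is made up purely to show a contrasting score.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_documents(query_vector, doc_vectors):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [
        (index, cosine_similarity(query_vector, doc_vector))
        for index, doc_vector in enumerate(doc_vectors)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Truncated three-value sample taken from the output above; real vectors are longer.
query_vector = [0.018113142, 0.00302585, -0.049911194]
# Illustration only: the second "document" is the negated query vector.
doc_vectors = [query_vector, [-x for x in query_vector]]
ranking = rank_documents(query_vector, doc_vectors)
```

With real data, query_vector would be the result of embeddings.embed_query(...) and doc_vectors the result of embeddings.embed_documents(...); the first entry of ranking then points at the closest document.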

Related

Embedding model conceptual guide
Embedding model how-to guides
