
Xorbits inference (Xinference)

This notebook goes over how to use Xinference embeddings within LangChain.

Installation

Install Xinference through PyPI:

%pip install --upgrade --quiet "xinference[all]"

Deploy Xinference Locally or in a Distributed Cluster.

For local deployment, run xinference.

To deploy Xinference in a cluster, first start an Xinference supervisor using xinference-supervisor. You can use the option -p to specify the port and -H to specify the host; the default port is 9997.

Then, start the Xinference workers using xinference-worker on each server you want to run them on.
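
As a sketch, assuming the supervisor machine is reachable at 192.168.1.10 (a placeholder address) and uses the default port, the commands on each machine would look roughly like this; check the --help output of each command for the exact flags in your version:

# On the supervisor machine: bind to all interfaces on the default port
xinference-supervisor -H 0.0.0.0 -p 9997

# On each worker machine: point the worker at the supervisor's endpoint
xinference-worker -e "http://192.168.1.10:9997"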

You can consult the README file from Xinference for more information.

Wrapper

To use Xinference with LangChain, you need to first launch a model. You can use the command line interface (CLI) to do so:

!xinference launch -n vicuna-v1.3 -f ggmlv3 -q q4_0
Model uid: 915845ee-2a04-11ee-8ed4-d29396a3f064

A model UID is returned for you to use. Now you can use Xinference embeddings with LangChain:

from langchain_community.embeddings import XinferenceEmbeddings

xinference = XinferenceEmbeddings(
server_url="http://0.0.0.0:9997", model_uid="915845ee-2a04-11ee-8ed4-d29396a3f064"
)
API Reference: XinferenceEmbeddings

query_result = xinference.embed_query("This is a test query")
doc_result = xinference.embed_documents(["text A", "text B"])
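
Both calls return plain Python lists of floats, one vector per input; the embedding dimension depends on the model you launched. A quick sanity check:

# Length of the query vector is the model's embedding dimension
print(len(query_result))

# One vector per input document
print(len(doc_result))  # 2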

Lastly, terminate the model when you no longer need it:

!xinference terminate --model-uid "915845ee-2a04-11ee-8ed4-d29396a3f064"
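
If you would rather manage models from Python than from the CLI, Xinference also ships a RESTful client. The sketch below assumes the xinference package from the installation step and a server running on the default port; consult the Xinference documentation for the exact launch_model parameters in your version:

from xinference.client import RESTfulClient

# Connect to a running Xinference server
client = RESTfulClient("http://0.0.0.0:9997")

# Launch a model and capture the UID to pass to XinferenceEmbeddings
model_uid = client.launch_model(model_name="vicuna-v1.3")

# ... embed queries and documents as shown above ...

# Free the model's resources when finished
client.terminate_model(model_uid)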
