
Embedding Documents using Optimized and Quantized Embedders

Embedding all documents using Quantized Embedders.

The embedders are based on optimized models, created by using optimum-intel and IPEX.

The example text is based on SBERT.
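To run this example you will need langchain-community plus Intel's optimization tooling. A minimal sketch of the dependencies, assuming the standard PyPI package names for the tools mentioned above (adjust for your environment):

# Assumed dependency set for this example; package versions are not pinned here.
%pip install --quiet langchain-community optimum-intel intel-extension-for-pytorch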

from langchain_community.embeddings import QuantizedBiEncoderEmbeddings

model_name = "Intel/bge-small-en-v1.5-rag-int8-static"
encode_kwargs = {"normalize_embeddings": True}  # set True to compute cosine similarity

model = QuantizedBiEncoderEmbeddings(
    model_name=model_name,
    encode_kwargs=encode_kwargs,
    query_instruction="Represent this sentence for searching relevant passages: ",
)
loading configuration file inc_config.json from cache at
INCConfig {
  "distillation": {},
  "neural_compressor_version": "2.4.1",
  "optimum_version": "1.16.2",
  "pruning": {},
  "quantization": {
    "dataset_num_samples": 50,
    "is_static": true
  },
  "save_onnx_model": false,
  "torch_version": "2.2.0",
  "transformers_version": "4.37.2"
}

Using `INCModel` to load a TorchScript model will be deprecated in v1.15.0, to load your model please use `IPEXModel` instead.

Let's ask a question and compare it to two documents. The first contains the answer to the question; the second does not.

We can check which one better suits our query.

question = "How many people live in Berlin?"
documents = [
    "Berlin had a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.",
    "Berlin is well known for its museums.",
]
doc_vecs = model.embed_documents(documents)
Batches: 100%|██████████| 1/1 [00:00<00:00,  4.18it/s]
query_vec = model.embed_query(question)
import torch

doc_vecs_torch = torch.tensor(doc_vecs)
query_vec_torch = torch.tensor(query_vec)
query_vec_torch @ doc_vecs_torch.T
tensor([0.7980, 0.6529])

We can see that, as expected, the first document ranks higher.
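Because normalize_embeddings was set to True, the embeddings are unit-length, so the dot products above are cosine similarities, and ranking documents is just a matter of sorting by score. A minimal sketch reusing the tensors computed above (the helper logic below is illustrative, not part of the LangChain API):

# Pair each document with its cosine similarity to the query and sort descending.
scores = (query_vec_torch @ doc_vecs_torch.T).tolist()
for score, doc in sorted(zip(scores, documents), reverse=True):
    print(f"{score:.4f}  {doc}")
# 0.7980  Berlin had a population of 3,520,031 registered inhabitants ...
# 0.6529  Berlin is well known for its museums.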
