
Reranking

Continuing from the previous section, we can now rerank the results using more complex rerankers.


Reranking search results

You can rerank any search results using a reranker. The syntax for reranking is as follows:

    from lancedb.rerankers import LinearCombinationReranker

    reranker = LinearCombinationReranker()
    table.search(quries[0], query_type="hybrid").rerank(reranker=reranker).limit(5).to_pandas()

Based on the query_type, the rerank() function can accept other arguments as well. For example, hybrid search accepts a normalize param to determine the score normalization method.
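To make the idea of score normalization concrete, here is a minimal, library-independent sketch of blending vector and full-text scores linearly. It is not LanceDB's implementation: the function names min_max_normalize and linear_combination, the weight parameter, and the choice of min-max scaling are assumptions for illustration, and both inputs are treated as higher-is-better relevance scores.

```python
from typing import Dict


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Scale scores to [0, 1]; identical scores all map to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def linear_combination(vector_scores, fts_scores, weight=0.7):
    """Blend two score sets; a document missing from one set scores 0 there."""
    v = min_max_normalize(vector_scores)
    f = min_max_normalize(fts_scores)
    combined = {
        doc: weight * v.get(doc, 0.0) + (1 - weight) * f.get(doc, 0.0)
        for doc in set(v) | set(f)
    }
    # Highest combined score first; ties broken by doc id for determinism
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))


vector_scores = {"a": 0.9, "b": 0.5, "c": 0.1}  # hypothetical similarity scores
fts_scores = {"b": 12.0, "c": 8.0, "d": 4.0}    # hypothetical BM25-style scores
ranked = linear_combination(vector_scores, fts_scores, weight=0.7)
```

Without normalization, the much larger full-text scores would dominate the blend, which is why a hybrid reranker needs some normalization method before the two score sets can be combined.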

Note

LanceDB provides a Reranker base class that can be extended to implement custom rerankers. Each reranker must implement the rerank_hybrid method. The rerank_vector and rerank_fts methods are optional. For example, the LinearCombinationReranker only implements the rerank_hybrid method, so it can only be used for reranking hybrid search results.
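The contract described in this note can be sketched without LanceDB installed. The classes below are stand-ins, not LanceDB's actual Reranker class, whose methods operate on search result tables rather than plain lists. BaseReranker, ConcatReranker, and the list-of-strings result type are hypothetical names chosen for illustration.

```python
from abc import ABC, abstractmethod
from typing import List


class BaseReranker(ABC):
    """Stand-in for a reranker base class (hypothetical, not LanceDB's)."""

    @abstractmethod
    def rerank_hybrid(self, query: str, vector_results: List[str],
                      fts_results: List[str]) -> List[str]:
        """Required: every reranker must handle hybrid results."""

    def rerank_vector(self, query: str, vector_results: List[str]) -> List[str]:
        # Optional: subclasses override this to support vector-only search
        raise NotImplementedError(f"{type(self).__name__} does not support vector reranking")

    def rerank_fts(self, query: str, fts_results: List[str]) -> List[str]:
        # Optional: subclasses override this to support full-text-only search
        raise NotImplementedError(f"{type(self).__name__} does not support FTS reranking")


class ConcatReranker(BaseReranker):
    """Implements only the required method, so it works for hybrid search only."""

    def rerank_hybrid(self, query, vector_results, fts_results):
        # Keep vector hits first, then append FTS hits not already present
        seen = set(vector_results)
        return vector_results + [d for d in fts_results if d not in seen]


reranker = ConcatReranker()
hybrid = reranker.rerank_hybrid("q", ["a", "b"], ["b", "c"])

try:
    reranker.rerank_vector("q", ["a", "b"])
    vector_supported = True
except NotImplementedError:
    vector_supported = False
```

A reranker written this way fails loudly when it is asked to rerank a query type it does not implement, instead of silently returning the results unchanged.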

Choosing a Reranker

There are many rerankers available in LanceDB, such as CrossEncoderReranker, CohereReranker, and ColBERT. The choice of reranker depends on the dataset and the application. You can even implement your own custom reranker by extending the Reranker class. For more details about each available reranker and a performance comparison, refer to the rerankers documentation.

In this example, we'll use the CohereReranker to rerank the search results. It requires cohere to be installed and COHERE_API_KEY to be set in the environment. To get your API key, sign up on Cohere.
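Before constructing the reranker, it can help to confirm both prerequisites up front. The helper below is a small sketch using only the standard library. check_cohere_prerequisites is a hypothetical name and is not part of LanceDB or the Cohere SDK.

```python
import importlib.util
import os


def check_cohere_prerequisites(env=None):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    problems = []
    # find_spec returns None when the top-level package cannot be imported
    if importlib.util.find_spec("cohere") is None:
        problems.append("the 'cohere' package is not installed (pip install cohere)")
    if not env.get("COHERE_API_KEY"):
        problems.append("COHERE_API_KEY is not set in the environment")
    return problems


for problem in check_cohere_prerequisites():
    print("Cohere setup issue:", problem)
```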

    from lancedb.rerankers import CohereReranker

    # use Cohere reranker v3
    reranker = CohereReranker(model_name="rerank-english-v3.0")  # default model is "rerank-english-v2.0"

Reranking search results

Now we can rerank the results of every query type using the CohereReranker:

    # rerank hybrid search results
    table.search(quries[0], query_type="hybrid").rerank(reranker=reranker).limit(5).to_pandas()

    # rerank vector search results
    table.search(quries[0], query_type="vector").rerank(reranker=reranker).limit(5).to_pandas()

    # rerank fts search results
    table.search(quries[0], query_type="fts").rerank(reranker=reranker).limit(5).to_pandas()

Each reranker can accept additional arguments. For example, CohereReranker accepts top_k and batch_size params to control the number of documents to rerank and the batch size for reranking, respectively. Similarly, a custom reranker can accept any number of arguments based on the implementation. For example, a reranker can accept a filter that implements some custom logic to filter out documents before reranking.
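As a library-independent sketch of how such arguments might shape a reranker, the function below accepts top_k, batch_size, and an optional filter. Everything here is hypothetical: rerank_documents, the word-overlap scorer overlap_score standing in for a real model call, and the exact meaning given to each parameter are assumptions, not the behavior of CohereReranker.

```python
from typing import Callable, List, Optional


def overlap_score(query: str, doc: str) -> float:
    """Toy relevance score: number of distinct query words that appear in the document."""
    return float(len(set(query.lower().split()) & set(doc.lower().split())))


def rerank_documents(query: str, docs: List[str], top_k: int = 3, batch_size: int = 2,
                     filter: Optional[Callable[[str], bool]] = None) -> List[str]:
    # Drop unwanted documents before any scoring happens
    candidates = [d for d in docs if filter is None or filter(d)]
    scored = []
    # Score in batches, as one would when each batch is a separate model or API call
    for start in range(0, len(candidates), batch_size):
        batch = candidates[start:start + batch_size]
        scored.extend((overlap_score(query, d), d) for d in batch)
    # Stable sort keeps the original order among equally scored documents
    scored.sort(key=lambda pair: -pair[0])
    return [d for _, d in scored[:top_k]]


docs = [
    "draft: reranking notes",
    "hybrid search with reranking",
    "full text search basics",
    "reranking hybrid search results",
]
top = rerank_documents("reranking hybrid search", docs, top_k=2,
                       filter=lambda d: not d.startswith("draft:"))
```

Filtering before scoring keeps excluded documents from consuming any of the scoring budget, which matters when each batch corresponds to a paid API call.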

Results

Let us take a look at the same datasets from the previous sections, using the same embedding table but with the Cohere reranker applied to all query types.

Note

When reranking fts or vector search results, the search results are over-fetched by a factor of 2 and then reranked. From the reranked set, top_k (5 in this case) results are taken. This is done because reranking only reorders the fetched results, so it would have no effect on the hit-rate if we fetched just top_k results.
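The reasoning in this note can be checked with a small, self-contained sketch. Hit-rate@k is taken here to mean the fraction of queries whose expected document appears in the top k results, which is an assumption about the metric's definition. The document ids and the toy rerank ordering are made up for illustration.

```python
from typing import List


def hit_rate_at_k(results: List[List[str]], expected: List[str], k: int) -> float:
    """Fraction of queries whose expected doc is within the top-k results."""
    hits = sum(1 for docs, target in zip(results, expected) if target in docs[:k])
    return hits / len(expected)


def rerank(docs: List[str]) -> List[str]:
    # Toy reranker: pretend the model prefers alphabetically earlier ids
    return sorted(docs)


k = 2
expected = ["a"]                  # relevant doc for the single query
initial = [["c", "d", "a", "b"]]  # first-stage ranking; "a" sits at position 3

# Fetch only k, then rerank: the same k docs are merely reordered
fetch_k = [rerank(docs[:k]) for docs in initial]
# Over-fetch 2 * k, rerank, then keep the top k
over_fetch = [rerank(docs[:2 * k])[:k] for docs in initial]

baseline = hit_rate_at_k(initial, expected, k)
without_overfetch = hit_rate_at_k(fetch_k, expected, k)
with_overfetch = hit_rate_at_k(over_fetch, expected, k)
```

In this toy case the hit-rate is identical with and without reranking when only k results are fetched, and it changes only once the reranker is given a larger candidate pool to promote documents from.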

Synthetic LLama2 paper dataset

Query Type        Hit-rate@5
Vector            0.640
FTS               0.595
Reranked vector   0.677
Reranked fts      0.672
Hybrid            0.759

Uber 10-K SEC filing dataset

Query Type        Hit-rate@5
Vector            0.608
FTS               0.824
Reranked vector   0.671
Reranked fts      0.843
Hybrid            0.849
