Building Custom Rerankers


You can build your own custom reranker by subclassing the Reranker class and implementing the rerank_hybrid() method. Optionally, you can also implement the rerank_vector() and rerank_fts() methods if you want to support reranking for vector and FTS search separately.

The Reranker base interface comes with a merge_results() method that can be used to combine the results of semantic and full-text search. This is a vanilla merging algorithm that simply concatenates the results and removes duplicates without taking the scores into consideration; it keeps only the first copy of each row it encounters. This works well when the scores of semantic and full-text search aren't needed to combine the results. If you want to use the scores or want to support return_score="all", you'll need to implement your own merging algorithm.
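To make the "concatenate and keep the first copy" behaviour concrete, here is a minimal plain-Python sketch of that idea. It is not the library's implementation: it works on lists of row dicts (the shape a pyarrow table's to_pylist() returns), and the helper name merge_keep_first and the column names _rowid, _distance and _score are assumptions chosen for illustration.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def merge_keep_first(vector_rows: List[Row], fts_rows: List[Row], key: str = "_rowid") -> List[Row]:
    """Concatenate two result lists and keep only the first row seen per key."""
    seen = set()
    merged = []
    for row in vector_rows + fts_rows:
        if row[key] in seen:
            continue  # a later duplicate is dropped, scores and all
        seen.add(row[key])
        merged.append(row)
    return merged


# Hypothetical results; the column names are assumptions, not guaranteed by LanceDB.
vector_rows = [
    {"_rowid": 1, "text": "apple", "_distance": 0.10},
    {"_rowid": 2, "text": "banana", "_distance": 0.35},
]
fts_rows = [
    {"_rowid": 2, "text": "banana", "_score": 4.2},
    {"_rowid": 3, "text": "cherry", "_score": 1.7},
]

merged = merge_keep_first(vector_rows, fts_rows)
print([row["_rowid"] for row in merged])  # [1, 2, 3]
```

Note that row 2 survives only in its vector-search form, so its full-text score is lost. That is why a score-aware merge needs its own algorithm.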

Here's a skeleton of a custom reranker. It merges the results of semantic and full-text search with the built-in merge_results(); the placeholder comments mark where your own logic, such as a linear combination of the scores, would go:

    from lancedb.rerankers import Reranker
    import pyarrow as pa


    class MyReranker(Reranker):
        # param1 and param2 are placeholders; add whatever parameters you need
        def __init__(self, param1, param2, return_score="relevance"):
            super().__init__(return_score)
            self.param1 = param1
            self.param2 = param2

        def rerank_hybrid(self, query: str, vector_results: pa.Table, fts_results: pa.Table):
            # Use the built-in merging function
            combined_result = self.merge_results(vector_results, fts_results)

            # Do something with the combined results
            # ...

            # Return the combined results
            return combined_result

        def rerank_vector(self, query: str, vector_results: pa.Table):
            # Do something with the vector results
            # ...

            # Return the vector results
            return vector_results

        def rerank_fts(self, query: str, fts_results: pa.Table):
            # Do something with the FTS results
            # ...

            # Return the FTS results
            return fts_results
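The skeleton above leaves the scoring step open. As one possible way to fill it in, the sketch below combines the two result sets with a weighted sum of min-max normalized scores. It is only an illustration under stated assumptions: it works on lists of row dicts rather than pyarrow tables, the helper names min_max and combine_linear are hypothetical, and the column names _rowid, _distance (smaller is better), _score (larger is better) and _relevance_score are assumed, so check what your LanceDB version actually returns.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def min_max(values: List[float]) -> List[float]:
    """Scale values to [0, 1]; if all values are equal, map them all to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def combine_linear(vector_rows: List[Row], fts_rows: List[Row],
                   weight: float = 0.5, key: str = "_rowid") -> List[Row]:
    """Rank rows by weight * vector similarity + (1 - weight) * FTS score."""
    vec_sim: Dict[Any, float] = {}
    if vector_rows:
        # Negate distances so that a larger value means a better match.
        scaled = min_max([-float(r["_distance"]) for r in vector_rows])
        vec_sim = {r[key]: s for r, s in zip(vector_rows, scaled)}

    fts_sim: Dict[Any, float] = {}
    if fts_rows:
        scaled = min_max([float(r["_score"]) for r in fts_rows])
        fts_sim = {r[key]: s for r, s in zip(fts_rows, scaled)}

    # Keep one copy of each row; a row missing from one search scores 0 there.
    rows: Dict[Any, Row] = {}
    for r in vector_rows + fts_rows:
        rows.setdefault(r[key], r)

    combined = []
    for row_id, row in rows.items():
        score = weight * vec_sim.get(row_id, 0.0) + (1.0 - weight) * fts_sim.get(row_id, 0.0)
        combined.append({**row, "_relevance_score": score})
    combined.sort(key=lambda r: r["_relevance_score"], reverse=True)
    return combined


vector_rows = [
    {"_rowid": 1, "text": "apple", "_distance": 0.1},
    {"_rowid": 2, "text": "banana", "_distance": 0.2},
    {"_rowid": 3, "text": "cherry", "_distance": 0.5},
]
fts_rows = [
    {"_rowid": 2, "text": "banana", "_score": 4.0},
    {"_rowid": 3, "text": "cherry", "_score": 2.0},
]

ranked = combine_linear(vector_rows, fts_rows, weight=0.5)
print([row["_rowid"] for row in ranked])  # [2, 1, 3]
```

Inside rerank_hybrid() you could obtain such row lists with to_pylist() on each table and build the return value with pa.Table.from_pylist(). Min-max scaling is just one normalization choice; the weight and the handling of rows that appear in only one result set are design decisions you would tune for your data.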

Example of a Custom Reranker

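For the sake of simplicity, let's build a custom reranker that extends the CohereReranker by accepting one or more filter patterns, and passes any other CohereReranker parameters through as kwargs. Rows whose text column matches a filter are dropped from the reranked results.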

    from typing import List, Union

    import pyarrow as pa
    from lancedb.rerankers import CohereReranker


    class ModifiedCohereReranker(CohereReranker):
        def __init__(self, filters: Union[str, List[str]], **kwargs):
            super().__init__(**kwargs)
            # Accept either a single filter or a list of filters
            self.filters = filters if isinstance(filters, list) else [filters]

        def _drop_filtered(self, results: pa.Table) -> pa.Table:
            # Drop every row whose "text" column matches any of the filters
            df = results.to_pandas()
            for pattern in self.filters:
                df = df.query("not text.str.contains(@pattern)")
            # preserve_index=False keeps the filtered pandas index out of the table
            return pa.Table.from_pandas(df, preserve_index=False)

        def rerank_hybrid(self, query: str, vector_results: pa.Table, fts_results: pa.Table) -> pa.Table:
            combined_result = super().rerank_hybrid(query, vector_results, fts_results)
            return self._drop_filtered(combined_result)

        def rerank_vector(self, query: str, vector_results: pa.Table) -> pa.Table:
            vector_results = super().rerank_vector(query, vector_results)
            return self._drop_filtered(vector_results)

        def rerank_fts(self, query: str, fts_results: pa.Table) -> pa.Table:
            fts_results = super().rerank_fts(query, fts_results)
            return self._drop_filtered(fts_results)
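One detail worth knowing about the filters: pandas' str.contains() treats its argument as a regular expression by default, so characters such as $ or ( need escaping if you mean them literally. The plain-Python sketch below mirrors the same "drop a row if any filter matches" logic on hypothetical rows; the helper name drop_matching and the sample data are made up for illustration.

```python
import re
from typing import Any, Dict, List

Row = Dict[str, Any]


def drop_matching(rows: List[Row], patterns: List[str], column: str = "text") -> List[Row]:
    """Keep only the rows whose column matches none of the regex patterns."""
    return [
        row for row in rows
        if not any(re.search(pattern, row[column]) for pattern in patterns)
    ]


rows = [
    {"text": "LanceDB supports hybrid search"},
    {"text": "Sponsored: buy now"},
    {"text": "Price: $5 (limited)"},
]

# The second pattern escapes "$" so it matches a literal dollar sign.
kept = drop_matching(rows, ["Sponsored", r"\$\d+"])
print([row["text"] for row in kept])  # ['LanceDB supports hybrid search']
```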

Tip

The vector_results and fts_results are pyarrow tables. You can learn more about pyarrow tables in the pyarrow documentation. They can be converted to other data types such as a pandas dataframe, pydict, or pylist.

For example, you can convert them to pandas dataframes using the to_pandas() method and perform any operations you want. After you are done, you can convert the dataframe back to a pyarrow table using the pa.Table.from_pandas() method and return it.

