Rankers#

The ranker is the component that determines the order of the results returned by the retriever. FlexRAG provides several rankers, grouped into local rankers, which run a model on your own hardware, and online rankers, which call a hosted API.

class flexrag.ranker.RankerBaseConfig(reserve_num=-1, ranking_field=None)[source]#

The configuration for the ranker.

Parameters:
  • reserve_num (int) -- the number of candidates to reserve. If it is less than 0, all candidates will be reserved. Default is -1.

  • ranking_field (Optional[str]) -- the name of the field in the retrieved context that is used for ranking. If it is None, the ranker will only accept a list of strings as candidates. Default is None.
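
As a rough illustration of the reserve_num rule described above, the sketch below trims an already ranked list. The helper `apply_reserve_num` is hypothetical and not part of FlexRAG; it only mirrors the documented behaviour that a negative value reserves every candidate.

```python
def apply_reserve_num(ranked: list[str], reserve_num: int = -1) -> list[str]:
    """Keep the first `reserve_num` ranked items; a negative value keeps all.

    Hypothetical helper for illustration only, not a FlexRAG function.
    """
    if reserve_num < 0:
        return list(ranked)
    return ranked[:reserve_num]


ranked = ["doc_a", "doc_b", "doc_c"]          # assumed to be sorted best first
kept_all = apply_reserve_num(ranked)          # default -1 keeps everything
kept_two = apply_reserve_num(ranked, 2)       # keeps the two best candidates
```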

dump(path)#

Dump the dataclass to a YAML file.

dumps()#

Dump the dataclass to a YAML string.

classmethod load(path)#

Load the dataclass from a YAML file.

classmethod loads(s)#

Load the dataclass from a YAML string.

class flexrag.ranker.RankerBase(cfg)[source]#

async async_rank(query, candidates)[source]#

The asynchronous version of rank.

rank(query, candidates)[source]#

Rank the candidates based on the query.

Parameters:
  • query (str) -- query string.

  • candidates (list[str]) -- list of candidate strings.

Returns:

indices and scores of the ranked candidates.

Return type:

tuple[np.ndarray, np.ndarray]
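
To make the (indices, scores) return shape concrete, here is a minimal stand-in that scores candidates by word overlap with the query. The function `toy_rank` is hypothetical, uses plain lists instead of NumPy arrays, and assumes the scores are aligned with the returned indices; it is not how any FlexRAG ranker computes relevance.

```python
def toy_rank(query: str, candidates: list[str]) -> tuple[list[int], list[float]]:
    """Hypothetical ranker: score = number of query words found in the candidate."""
    query_terms = set(query.lower().split())
    raw = [float(len(query_terms & set(c.lower().split()))) for c in candidates]
    # Sort candidate positions by score, best first; ties keep the input order.
    indices = sorted(range(len(candidates)), key=lambda i: -raw[i])
    scores = [raw[i] for i in indices]
    return indices, scores


candidates = ["cats purr", "rankers sort retrieved results", "sort results"]
indices, scores = toy_rank("how do rankers sort results", candidates)
# Use the indices to reorder the original candidates.
ranked = [candidates[i] for i in indices]
```

The indices point back into the original candidate list, so the caller can reorder any parallel data (for example, full retrieved contexts) with the same permutation.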

class flexrag.ranker.RankingResult(query, candidates, scores=None)[source]#

The result of ranking.

Parameters:
  • query (str) -- the query string. Required.

  • candidates (list[RetrievedContext | str]) -- the ranked candidates. The results are sorted in descending order by relevance. Required.

  • scores (Optional[list[float]]) -- the scores of the ranked candidates. Optional.

dump(path)#

Dump the dataclass to a YAML file.

dumps()#

Dump the dataclass to a YAML string.

classmethod load(path)#

Load the dataclass from a YAML file.

classmethod loads(s)#

Load the dataclass from a YAML string.

class flexrag.ranker.RankerConfig(ranker_type=None, cohere_config=<factory>, rank_gpt_config=<factory>, hf_cross_encoder_config=<factory>, hf_seq2seq_config=<factory>, hf_colbert_config=<factory>, jina_config=<factory>, mixedbread_config=<factory>, voyage_config=<factory>)#

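Configuration class for selecting a ranker (name: RankerConfig, default: None).

Parameters:
  • ranker_type -- the type of ranker to use. Default is None.

  • cohere_config, rank_gpt_config, hf_cross_encoder_config, hf_seq2seq_config, hf_colbert_config, jina_config, mixedbread_config, voyage_config -- the configuration for the corresponding ranker type.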

Local Ranker#

class flexrag.ranker.HFCrossEncoderRankerConfig(model_path=None, tokenizer_path=None, trust_remote_code=False, device_id=<factory>, load_dtype='auto', reserve_num=-1, ranking_field=None, max_encode_length=512)[source]#

Bases: RankerBaseConfig, HFModelConfig

The configuration for the HuggingFace Cross Encoder ranker.

Parameters:
  • max_encode_length (int) -- the maximum length for the input encoding. Default is 512.

dump(path)#

Dump the dataclass to a YAML file.

dumps()#

Dump the dataclass to a YAML string.

classmethod load(path)#

Load the dataclass from a YAML file.

classmethod loads(s)#

Load the dataclass from a YAML string.

class flexrag.ranker.HFCrossEncoderRanker(cfg)[source]#

Bases: RankerBase

HFCrossEncoderRanker: The ranker based on the HuggingFace Cross Encoder model.

class flexrag.ranker.HFSeq2SeqRankerConfig(model_path=None, tokenizer_path=None, trust_remote_code=False, device_id=<factory>, load_dtype='auto', reserve_num=-1, ranking_field=None, max_encode_length=512, input_template='Query: {query} Document: {candidate} Relevant:', positive_token='▁true', negative_token='▁false')[source]#

Bases: RankerBaseConfig, HFModelConfig

The configuration for the HuggingFace Sequence-to-Sequence ranker.

Parameters:
  • max_encode_length (int) -- the maximum length for the input encoding. Default is 512.

  • input_template (str) -- the input template for the seq2seq model. Default is "Query: {query} Document: {candidate} Relevant:".

  • positive_token (str) -- the positive token for the seq2seq model. Default is "▁true".

  • negative_token (str) -- the negative token for the seq2seq model. Default is "▁false".
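
The sketch below shows how the input_template and the two tokens could fit together in a monoT5-style scorer: the template is filled per candidate, and the logits of the positive and negative tokens are turned into a relevance probability with a two-way softmax. The helper names and the logit values are made up for illustration, and whether FlexRAG uses exactly this softmax is an assumption; a real ranker would read the logits from the model's first decoding step.

```python
import math

INPUT_TEMPLATE = "Query: {query} Document: {candidate} Relevant:"


def build_prompt(query: str, candidate: str, template: str = INPUT_TEMPLATE) -> str:
    """Fill the documented template with one query/candidate pair."""
    return template.format(query=query, candidate=candidate)


def relevance_score(positive_logit: float, negative_logit: float) -> float:
    """Two-way softmax: probability mass assigned to the positive token."""
    m = max(positive_logit, negative_logit)   # subtract the max for stability
    pos = math.exp(positive_logit - m)
    neg = math.exp(negative_logit - m)
    return pos / (pos + neg)


prompt = build_prompt("what is a ranker", "A ranker reorders retrieved passages.")
# Made-up logits standing in for the model output at "▁true" and "▁false".
score = relevance_score(positive_logit=2.0, negative_logit=0.0)
```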

dump(path)#

Dump the dataclass to a YAML file.

dumps()#

Dump the dataclass to a YAML string.

classmethod load(path)#

Load the dataclass from a YAML file.

classmethod loads(s)#

Load the dataclass from a YAML string.

class flexrag.ranker.HFSeq2SeqRanker(cfg)[source]#

Bases: RankerBase

HFSeq2SeqRanker: The ranker based on the HuggingFace Sequence-to-Sequence model.

class flexrag.ranker.HFColBertRankerConfig(model_path=None, tokenizer_path=None, trust_remote_code=False, device_id=<factory>, load_dtype='auto', reserve_num=-1, ranking_field=None, base_model_type='bert', output_dim=128, max_encode_length=512, query_token='[unused0]', document_token='[unused1]', normalize_embeddings=True)[source]#

Bases: RankerBaseConfig, HFModelConfig

The configuration for the HuggingFace ColBERT ranker.

Parameters:
  • base_model_type (str) -- the base model type for the ColBERT model. Default is "bert".

  • output_dim (int) -- the output dimension for the ColBERT model. Default is 128.

  • max_encode_length (int) -- the maximum length for the input encoding. Default is 512.

  • query_token (str) -- the query token for the ColBERT model. Default is "[unused0]".

  • document_token (str) -- the document token for the ColBERT model. Default is "[unused1]".

  • normalize_embeddings (bool) -- whether to normalize the embeddings. Default is True.
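
For readers unfamiliar with ColBERT, the following sketch shows the standard late-interaction (MaxSim) score on tiny hand-written token vectors, including the effect of normalize_embeddings. The functions are hypothetical and written in plain Python; the FlexRAG implementation works on model outputs of size output_dim and may differ in detail.

```python
import math


def normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def maxsim_score(query_vecs, doc_vecs, normalize_embeddings=True) -> float:
    """Sum, over query tokens, of the best dot product with any document token.

    Assumes at least one document token vector is given.
    """
    if normalize_embeddings:
        query_vecs = [normalize(v) for v in query_vecs]
        doc_vecs = [normalize(v) for v in doc_vecs]
    total = 0.0
    for q in query_vecs:
        total += max(sum(a * b for a, b in zip(q, d)) for d in doc_vecs)
    return total


# Two query tokens and two document tokens in a 2-dimensional toy space.
query_vecs = [[1.0, 0.0], [0.0, 2.0]]
doc_vecs = [[3.0, 0.0], [1.0, 1.0]]
score = maxsim_score(query_vecs, doc_vecs)
raw_score = maxsim_score(query_vecs, doc_vecs, normalize_embeddings=False)
```

With normalization the dot products become cosine similarities, so long or high-magnitude token vectors cannot dominate the sum.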

dump(path)#

Dump the dataclass to a YAML file.

dumps()#

Dump the dataclass to a YAML string.

classmethod load(path)#

Load the dataclass from a YAML file.

classmethod loads(s)#

Load the dataclass from a YAML string.

class flexrag.ranker.HFColBertRanker(cfg)[source]#

Bases: RankerBase

HFColBertRanker: The ranker based on the HuggingFace ColBERT model. Code adapted from hotchpotch/JQaRA.

class flexrag.ranker.RankGPTRankerConfig(generator_type=None, anthropic_config=<factory>, hf_config=<factory>, hf_vlm_config=<factory>, ollama_config=<factory>, openai_config=<factory>, vllm_config=<factory>, reserve_num=-1, ranking_field=None, step_size=10, window_size=20, max_chunk_size=300)[source]#

Bases: RankerBaseConfig, GeneratorConfig

The configuration for the RankGPT ranker.

Parameters:
  • step_size (int) -- the step size for the sliding window ranking. Default is 10.

  • window_size (int) -- the window size for the sliding window ranking. Default is 20.

  • max_chunk_size (int) -- the maximum chunk size for the sliding window ranking. Default is 300.
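
To show how step_size and window_size interact, the sketch below lists the candidate spans a RankGPT-style sliding window would visit, moving from the end of the list toward the front so that strong candidates can move up window by window. The function `sliding_windows` is a hypothetical illustration, not the FlexRAG or RankGPT code; in particular, clamping the last window to start at 0 is a choice made here.

```python
def sliding_windows(n: int, window_size: int = 20, step_size: int = 10):
    """Return (start, end) spans over n candidates, last window first.

    Hypothetical sketch; each span would be sent to the LLM for reordering.
    """
    if not 0 < step_size <= window_size:
        raise ValueError("expected 0 < step_size <= window_size")
    if n <= 0:
        return []
    spans = []
    end = n
    while True:
        start = max(0, end - window_size)
        spans.append((start, end))
        if start == 0:          # reached the front of the list
            break
        end -= step_size        # slide toward the front, overlapping the last span
    return spans


spans_50 = sliding_windows(50)   # defaults: window_size=20, step_size=10
spans_45 = sliding_windows(45)
```

With the defaults, consecutive windows overlap by 10 candidates, which is what lets a relevant item near the bottom travel all the way to the top.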

dump(path)#

Dump the dataclass to a YAML file.

dumps()#

Dump the dataclass to a YAML string.

classmethod load(path)#

Load the dataclass from a YAML file.

classmethod loads(s)#

Load the dataclass from a YAML string.

class flexrag.ranker.RankGPTRanker(cfg)[source]#

Bases: RankerBase

RankGPTRanker: Rank the candidates based on the query using a large language model. Code adapted from the original implementation in sunnweiwei/RankGPT.

Online Ranker#

class flexrag.ranker.CohereRankerConfig(reserve_num=-1, ranking_field=None, model='rerank-v3.5', base_url=None, api_key=None, proxy=None)[source]#

Bases: RankerBaseConfig

The configuration for the Cohere ranker.

Parameters:
  • model (str) -- the model name of the ranker. Default is "rerank-v3.5".

  • base_url (Optional[str]) -- the base URL of the Cohere ranker. Default is None.

  • api_key (str) -- the API key for the Cohere ranker. If not provided, it will use the environment variable COHERE_API_KEY. Defaults to None.

  • proxy (Optional[str]) -- the proxy for the request. Default is None.
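
The api_key fallback described above can be pictured as follows. `resolve_api_key` is a hypothetical helper, and raising an error when neither source provides a key is an assumption made for this sketch; the key values are placeholders.

```python
import os
from typing import Optional


def resolve_api_key(api_key: Optional[str], env_var: str) -> str:
    """Prefer the explicit key; otherwise read the named environment variable."""
    if api_key is not None:
        return api_key
    key = os.environ.get(env_var)
    if key is None:
        # Assumed behaviour for this sketch, not confirmed for FlexRAG.
        raise ValueError(f"No API key given and {env_var} is not set")
    return key


os.environ["COHERE_API_KEY"] = "demo-key-from-env"   # placeholder for the demo
from_env = resolve_api_key(None, "COHERE_API_KEY")
explicit = resolve_api_key("demo-explicit-key", "COHERE_API_KEY")
```

The same pattern applies to the other online rankers with their own variables (JINA_API_KEY, MIXEDBREAD_API_KEY, VOYAGE_API_KEY).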

dump(path)#

Dump the dataclass to a YAML file.

dumps()#

Dump the dataclass to a YAML string.

classmethod load(path)#

Load the dataclass from a YAML file.

classmethod loads(s)#

Load the dataclass from a YAML string.

class flexrag.ranker.CohereRanker(cfg)[source]#

Bases: RankerBase

CohereRanker: The ranker based on the Cohere API.

class flexrag.ranker.JinaRankerConfig(reserve_num=-1, ranking_field=None, model='jina-reranker-v2-base-multilingual', base_url='https://api.jina.ai/v1/rerank', api_key=None, proxy=None)[source]#

Bases: RankerBaseConfig

The configuration for the Jina ranker.

Parameters:
  • model (str) -- the model name of the ranker. Default is "jina-reranker-v2-base-multilingual".

  • base_url (str) -- the base URL of the Jina ranker. Default is "https://api.jina.ai/v1/rerank".

  • api_key (str) -- the API key for the Jina ranker. If not provided, it will use the environment variable JINA_API_KEY. Defaults to None.

  • proxy (Optional[str]) -- the proxy for the request. Default is None.

dump(path)#

Dump the dataclass to a YAML file.

dumps()#

Dump the dataclass to a YAML string.

classmethod load(path)#

Load the dataclass from a YAML file.

classmethod loads(s)#

Load the dataclass from a YAML string.

class flexrag.ranker.JinaRanker(cfg)[source]#

Bases: RankerBase

JinaRanker: The ranker based on the Jina API.

class flexrag.ranker.MixedbreadRankerConfig(reserve_num=-1, ranking_field=None, model='mxbai-rerank-base-v2', base_url=None, api_key=None, proxy=None)[source]#

Bases: RankerBaseConfig

The configuration for the Mixedbread ranker.

Parameters:
  • model (str) -- the model name of the ranker. Default is "mxbai-rerank-base-v2".

  • api_key (str) -- the API key for the Mixedbread ranker. If not provided, it will use the environment variable MIXEDBREAD_API_KEY. Defaults to None.

  • base_url (Optional[str]) -- the base URL of the Mixedbread ranker. Default is None.

  • proxy (Optional[str]) -- the proxy for the request. Default is None.

dump(path)#

Dump the dataclass to a YAML file.

dumps()#

Dump the dataclass to a YAML string.

classmethod load(path)#

Load the dataclass from a YAML file.

classmethod loads(s)#

Load the dataclass from a YAML string.

class flexrag.ranker.MixedbreadRanker(cfg)[source]#

Bases: RankerBase

MixedbreadRanker: The ranker based on the Mixedbread API.

class flexrag.ranker.VoyageRankerConfig(reserve_num=-1, ranking_field=None, model='rerank-2', api_key=None, timeout=3.0, max_retries=3)[source]#

Bases: RankerBaseConfig

The configuration for the Voyage ranker.

Parameters:
  • model (str) -- the model name of the ranker. Default is "rerank-2".

  • api_key (str) -- the API key for the Voyage ranker. If not provided, it will use the environment variable VOYAGE_API_KEY. Defaults to None.

  • timeout (float) -- the timeout for the request. Default is 3.0.

  • max_retries (int) -- the maximum number of retries. Default is 3.
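
As a generic picture of what timeout and max_retries usually control, the sketch below retries a request that times out. Everything here is hypothetical: `call_with_retries` and `flaky_request` are made-up names, and counting one initial attempt plus max_retries retries is an assumption; in practice these settings are typically handed to the API client rather than implemented by hand.

```python
import time


def call_with_retries(request, max_retries: int = 3, timeout: float = 3.0,
                      backoff: float = 0.0):
    """Call request(timeout); on TimeoutError retry up to max_retries times."""
    last_error = None
    for _ in range(max_retries + 1):       # one initial attempt plus retries
        try:
            return request(timeout)
        except TimeoutError as err:
            last_error = err
            time.sleep(backoff)            # optional pause between attempts
    raise last_error


attempts = []


def flaky_request(timeout: float) -> str:
    """Simulated API call that times out twice before succeeding."""
    attempts.append(timeout)
    if len(attempts) < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


result = call_with_retries(flaky_request)
```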

dump(path)#

Dump the dataclass to a YAML file.

dumps()#

Dump the dataclass to a YAML string.

classmethod load(path)#

Load the dataclass from a YAML file.

classmethod loads(s)#

Load the dataclass from a YAML string.

class flexrag.ranker.VoyageRanker(cfg)[source]#

Bases: RankerBase

VoyageRanker: The ranker based on the Voyage API.