Assistant#

The Assistant class serves as an abstraction for Retrieval-Augmented Generation (RAG) behavior. It takes the user’s query as input and returns an appropriate response. This class provides a flexible interface for defining how the assistant handles queries, including whether a retrieval step is required, how the retrieval should be conducted, and how the assistant generates the response based on the retrieved information.

The Assistant Interface#

AssistantBase is the base class for all assistants. It provides a simple interface for answering a user query. The answering process is controlled by a configuration object that is passed to the assistant’s constructor.

class flexrag.assistant.AssistantBase[source]#

abstract answer(question)[source]#

Answer the given question.

Parameters: question (str) – The question to answer.
Returns: A tuple containing three elements, in order: the response to the question, the contexts used to answer the question, and the metadata of the assistant.
Return type: tuple[str, Optional[list[RetrievedContext]], Optional[dict]]
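
To illustrate the shape of this interface, the following is a minimal, self-contained sketch of a custom assistant. It does not import FlexRAG: the AssistantBase and RetrievedContext classes below are simplified stand-ins that only mirror the documented signature, and the `text` field and the EchoAssistant class are hypothetical names introduced for illustration. In real code you would presumably subclass flexrag.assistant.AssistantBase instead.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class RetrievedContext:
    """Hypothetical stand-in for FlexRAG's RetrievedContext (field assumed)."""

    text: str


class AssistantBase(ABC):
    """Stand-in mirroring the documented interface; not the FlexRAG class."""

    @abstractmethod
    def answer(
        self, question: str
    ) -> tuple[str, Optional[list[RetrievedContext]], Optional[dict]]:
        """Return (response, contexts, metadata) for the given question."""


class EchoAssistant(AssistantBase):
    """Toy assistant: performs no retrieval and simply echoes the question."""

    def answer(self, question):
        response = f"You asked: {question}"
        contexts = None  # no retrieval step, so there are no contexts
        metadata = {"assistant": "echo"}
        return response, contexts, metadata


# Callers unpack the three documented elements of the returned tuple.
response, contexts, metadata = EchoAssistant().answer("What is RAG?")
print(response)
```

Because the contexts and metadata are both optional, callers should check them for None before use, as an assistant that skips retrieval may return no contexts.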

class flexrag.assistant.AssistantConfig(assistant_type=None, basic_config=<factory>, modular_config=<factory>, chatqa_config=<factory>, jina_deepsearch_config=<factory>, perplexity_config=<factory>)#

Bases: object

Configuration class for assistant (name: AssistantConfig, default: None).

Parameters:

assistant_type (str) – The assistant type to use.
basic_config (BasicAssistantConfig) – The config for BasicAssistant.
modular_config (ModularAssistantConfig) – The config for ModularAssistant.
chatqa_config (ModularAssistantConfig) – The config for ChatQAAssistant.
jina_deepsearch_config (JinaDeepSearchConfig) – The config for JinaDeepSearch.
perplexity_config (PerplexityAssistantConfig) – The config for PerplexityAssistant.

AssistantConfig is the general configuration for all registered assistants. You can load any assistant by setting assistant_type in the configuration and filling in the matching sub-configuration. For example, to load the BasicAssistant, you can use the following configuration:

from flexrag.assistant import AssistantConfig, ASSISTANTS, BasicAssistantConfig
from flexrag.models import OpenAIGeneratorConfig

config = AssistantConfig(
    assistant_type="basic",
    basic_config=BasicAssistantConfig(
        generator_type="openai",
        openai_config=OpenAIGeneratorConfig(
            model_name="Qwen2-7B-Instruct",
            base_url="http://127.0.0.1:8000/v1",
        ),
    ),
)
assistant = ASSISTANTS.load(config)

FlexRAG Assistants#

FlexRAG provides several assistant implementations that can be used out of the box. These implementations are designed to be flexible and extensible, allowing users to customize the assistant’s behavior by providing their own retrieval and generation components.

class flexrag.assistant.BasicAssistantConfig(do_sample=True, sample_num=1, temperature=1.0, max_new_tokens=512, top_p=0.9, top_k=50, eos_token_id=None, stop_str=<factory>, generator_type=None, anthropic_config=<factory>, hf_config=<factory>, hf_vlm_config=<factory>, ollama_config=<factory>, openai_config=<factory>, vllm_config=<factory>, prompt_path=None, use_history=False)[source]#

Bases: GeneratorConfig, GenerationConfig

The configuration for the basic assistant.

Parameters:

prompt_path (str, optional) – The path to the prompt file. Defaults to None.
use_history (bool, optional) – Whether to save the chat history for multi-turn conversation. Defaults to False.

dump(path)#: Dump the dataclass to a YAML file.

dumps()#: Dump the dataclass to a YAML string.

classmethod load(path)#: Load the dataclass from a YAML file.

classmethod loads(s)#: Load the dataclass from a YAML string.

class flexrag.assistant.BasicAssistant(cfg)[source]#

Bases: AssistantBase

A basic assistant that generates a response without retrieval.

class flexrag.assistant.ModularAssistantConfig(refiner_type=<factory>, context_arranger_config=<factory>, abstractive_summarizer_config=<factory>, extractive_summarizer_config=<factory>, ranker_type=None, cohere_config=<factory>, rank_gpt_config=<factory>, hf_cross_encoder_config=<factory>, hf_seq2seq_config=<factory>, hf_colbert_config=<factory>, jina_config=<factory>, mixedbread_config=<factory>, voyage_config=<factory>, retriever_type=None, elastic_config=<factory>, flex_config=<factory>, hyde_config=<factory>, typesense_config=<factory>, simple_web_config=<factory>, wikipedia_config=<factory>, do_sample=True, sample_num=1, temperature=1.0, max_new_tokens=512, top_p=0.9, top_k=50, eos_token_id=None, stop_str=<factory>, generator_type=None, anthropic_config=<factory>, hf_config=<factory>, hf_vlm_config=<factory>, ollama_config=<factory>, openai_config=<factory>, vllm_config=<factory>, response_type='short', prompt_with_context_path=None, prompt_without_context_path=None, used_fields=<factory>)[source]#

Bases: GeneratorConfig, GenerationConfig, RetrieverConfig, RankerConfig, RefinerConfig

The configuration for the modular assistant.

Parameters:

response_type (str, optional) – The type of response to generate. Defaults to “short”. Available options are: “short”, “long”, “original”, “custom”.
prompt_with_context_path (str, optional) – The path to the prompt file for response with context. Defaults to None.
prompt_without_context_path (str, optional) – The path to the prompt file for response without context. Defaults to None.
used_fields (list[str], optional) – The fields to use in the context. Defaults to [].

dump(path)#: Dump the dataclass to a YAML file.

dumps()#: Dump the dataclass to a YAML string.

classmethod load(path)#: Load the dataclass from a YAML file.

classmethod loads(s)#: Load the dataclass from a YAML string.

class flexrag.assistant.ModularAssistant(cfg)[source]#

Bases: AssistantBase

The modular RAG assistant that supports retrieval, reranking, and generation.

class flexrag.assistant.ChatQAAssistant(cfg)[source]#

Bases: ModularAssistant

The modular assistant that employs the ChatQA model for response generation.
