Generators

目录

Generators#

class flexrag.models.GeneratorBase[源代码]#
async async_chat(prompts, generation_config=GenerationConfig(do_sample=True, sample_num=1, temperature=1.0, max_new_tokens=512, top_p=0.9, top_k=50, eos_token_id=None, stop_str=[]))[源代码]#

The async version of chat.

async async_generate(prefixes, generation_config=GenerationConfig(do_sample=True, sample_num=1, temperature=1.0, max_new_tokens=512, top_p=0.9, top_k=50, eos_token_id=None, stop_str=[]))[源代码]#

The async version of generate.

chat(prompts, generation_config=GenerationConfig(do_sample=True, sample_num=1, temperature=1.0, max_new_tokens=512, top_p=0.9, top_k=50, eos_token_id=None, stop_str=[]))[源代码]#

chat with the model using model templates.

参数:
  • prompts (list[ChatPrompt] | list[list[dict]] | ChatPrompt | list[dict]) -- A batch of ChatPrompts.

  • generation_config (GenerationConfig) -- GenerationConfig. Defaults to GenerationConfig().

返回:

A batch of chat responses.

返回类型:

list[list[str]]

generate(prefixes, generation_config=GenerationConfig(do_sample=True, sample_num=1, temperature=1.0, max_new_tokens=512, top_p=0.9, top_k=50, eos_token_id=None, stop_str=[]))[源代码]#

generate text with the model using the given prefixes.

参数:
  • prefixes (list[str] | str) -- A batch of prefixes.

  • generation_config (GenerationConfig) -- GenerationConfig. Defaults to GenerationConfig().

返回:

A batch of generated text.

返回类型:

list[list[str]]

class flexrag.models.GenerationConfig(do_sample=True, sample_num=1, temperature=1.0, max_new_tokens=512, top_p=0.9, top_k=50, eos_token_id=None, stop_str=<factory>)[源代码]#

Configuration for text generation.

参数:
  • do_sample (bool) -- Whether to use sampling for generation. Defaults to True.

  • sample_num (int) -- The number of samples to generate. Defaults to 1.

  • temperature (float) -- The temperature of the sampling distribution. Defaults to 1.0.

  • max_new_tokens (int) -- The maximum number of tokens to generate. Defaults to 512.

  • top_p (float) -- The cumulative probability for nucleus sampling. Defaults to 0.9.

  • top_k (int) -- The number of tokens to consider for top-k sampling. Defaults to 50.

  • eos_token_id (Optional[int]) -- The token id for the end of sentence token. Defaults to None.

  • stop_str (list[str]) -- A list of strings to stop generation. Defaults to [].

dump(path)#

Dump the dataclass to a YAML file.

dumps()#

Dump the dataclass to a YAML string.

classmethod load(path)#

Load the dataclass from a YAML file.

classmethod loads(s)#

Load the dataclass from a YAML string.

class flexrag.models.GeneratorConfig(generator_type=None, anthropic_config=<factory>, hf_config=<factory>, hf_vlm_config=<factory>, ollama_config=<factory>, openai_config=<factory>, vllm_config=<factory>)#

Configuration class for generator (name: GeneratorConfig, default: None).

参数:

Local Generators#

class flexrag.models.HFModelConfig(model_path=None, tokenizer_path=None, trust_remote_code=False, device_id=<factory>, load_dtype='auto')[源代码]#

The Base Configuration for Huggingface Models, including HFGenerator, HFVLMGenerator, HFEncoder and HFClipEncoder.

参数:
  • model_path (str) -- The path to the model. Required.

  • tokenizer_path (Optional[str]) -- The path to the tokenizer. None for the same as model_path. Default is None.

  • trust_remote_code (bool) -- Whether to trust remote code. Default is False.

  • device_id (list[int]) -- The device id to use. [] for using CPU. Default is [].

  • load_dtype (str) -- The dtype to load the model. Default is "auto". Available choices are "bfloat16", "bf16", "float32", "fp32", "float16", "fp16", "half", "8bit", "4bit", "auto",

dump(path)#

Dump the dataclass to a YAML file.

dumps()#

Dump the dataclass to a YAML string.

classmethod load(path)#

Load the dataclass from a YAML file.

classmethod loads(s)#

Load the dataclass from a YAML string.

class flexrag.models.HFGeneratorConfig(model_path=None, tokenizer_path=None, trust_remote_code=False, device_id=<factory>, load_dtype='auto', pipeline_parallel=False, use_minference=False, model_type='causal_lm')[源代码]#

基类:HFModelConfig

Configuration for HFGenerator.

参数:
  • pipeline_parallel (bool) -- Whether to use pipeline parallel. Default is False.

  • use_minference (bool) -- Whether to use minference for long sequence inference. Default is False.

  • model_type -- The type of the model. Default is "causal_lm". Available choices are "causal_lm", "seq2seq".

dump(path)#

Dump the dataclass to a YAML file.

dumps()#

Dump the dataclass to a YAML string.

classmethod load(path)#

Load the dataclass from a YAML file.

classmethod loads(s)#

Load the dataclass from a YAML string.

class flexrag.models.HFGenerator(cfg)[源代码]#

基类:GeneratorBase

class flexrag.models.OllamaGeneratorConfig(model_name=None, base_url='http://localhost:11434/', verbose=False, num_ctx=4096, allow_parallel=True)[源代码]#

Configuration for the OllamaGenerator.

参数:
  • model_name (str) -- The name of the model to use. Required.

  • base_url (str) -- The base URL of the Ollama server. Default is 'http://localhost:11434/'.

  • verbose (bool) -- Whether to show verbose logs. Default is False.

  • num_ctx (int) -- The number of context tokens to use. Default is 4096.

  • allow_parallel (bool) -- Whether to allow parallel generation. Default is True.

dump(path)#

Dump the dataclass to a YAML file.

dumps()#

Dump the dataclass to a YAML string.

classmethod load(path)#

Load the dataclass from a YAML file.

classmethod loads(s)#

Load the dataclass from a YAML string.

class flexrag.models.OllamaGenerator(cfg)[源代码]#

基类:GeneratorBase

class flexrag.models.VLLMGeneratorConfig(model_path=None, gpu_memory_utilization=0.85, max_model_len=16384, tensor_parallel=1, load_dtype='auto', use_minference=False, trust_remote_code=False)[源代码]#

Configuration for VLLMGenerator.

参数:
  • model_path (str) -- Path to the model. Required.

  • gpu_memory_utilization (float) -- Fraction of GPU memory to use. Default to 0.85.

  • max_model_len (int) -- Maximum length of the model. Defaults to 16384.

  • tensor_parallel (int) -- The number of tensor parallel. Defaults to 1.

  • load_dtype (str) -- The dtype to load the model. Defaults to "auto". Available options are "auto", "float32", "float16", "bfloat16".

  • use_minference (bool) -- Whether to use minference for Long Sequence Inference. Defaults to False.

dump(path)#

Dump the dataclass to a YAML file.

dumps()#

Dump the dataclass to a YAML string.

classmethod load(path)#

Load the dataclass from a YAML file.

classmethod loads(s)#

Load the dataclass from a YAML string.

class flexrag.models.VLLMGenerator(cfg)[源代码]#

基类:GeneratorBase

Online Generators#

class flexrag.models.AnthropicGeneratorConfig(model_name=None, base_url=None, api_key='EMPTY', verbose=False, proxy=None, allow_parallel=True)[源代码]#

Configuration for AnthropicGenerator.

参数:
  • model_name (str) -- The name of the model. Required.

  • base_url (Optional[str]) -- The base url of the API. Defaults to None.

  • api_key (str) -- The API key. Defaults to os.environ.get("ANTHROPIC_API_KEY", "EMPTY").

  • verbose (bool) -- Whether to output verbose logs. Defaults to False.

  • proxy (Optional[str]) -- The proxy to use. Defaults to None.

  • allow_parallel (bool) -- Whether to allow parallel generation. Defaults to True.

dump(path)#

Dump the dataclass to a YAML file.

dumps()#

Dump the dataclass to a YAML string.

classmethod load(path)#

Load the dataclass from a YAML file.

classmethod loads(s)#

Load the dataclass from a YAML string.

class flexrag.models.AnthropicGenerator(cfg)[源代码]#

基类:GeneratorBase

class flexrag.models.OpenAIConfig(is_azure=False, model_name=None, base_url=None, api_key='EMPTY', api_version='2024-07-01-preview', verbose=False, proxy=None)[源代码]#

基类:object

The Base Configuration for OpenAI Client.

参数:
  • is_azure (bool) -- Whether the model is hosted on Azure. Default is False.

  • model_name (str) -- The name of the model to use.

  • base_url (Optional[str]) -- The base URL of the OpenAI API. Default is None.

  • api_key (str) -- The API key for OpenAI. Default is os.environ.get("OPENAI_API_KEY", "EMPTY").

  • api_version (str) -- The API version to use. Default is "2024-07-01-preview".

  • verbose (bool) -- Whether to show verbose logs. Default is False.

  • proxy (Optional[str]) -- The proxy to use for the HTTP client. Default is None.

dump(path)#

Dump the dataclass to a YAML file.

dumps()#

Dump the dataclass to a YAML string.

classmethod load(path)#

Load the dataclass from a YAML file.

classmethod loads(s)#

Load the dataclass from a YAML string.

class flexrag.models.OpenAIGeneratorConfig(is_azure=False, model_name=None, base_url=None, api_key='EMPTY', api_version='2024-07-01-preview', verbose=False, proxy=None, allow_parallel=True)[源代码]#

基类:OpenAIConfig

Configuration for OpenAI Generator.

参数:

allow_parallel (bool) -- Whether to allow parallel generation. Default is True.

dump(path)#

Dump the dataclass to a YAML file.

dumps()#

Dump the dataclass to a YAML string.

classmethod load(path)#

Load the dataclass from a YAML file.

classmethod loads(s)#

Load the dataclass from a YAML string.

class flexrag.models.OpenAIGenerator(cfg)[源代码]#

基类:GeneratorBase

Visual Language Model Generators#

class flexrag.models.VLMGeneratorBase[源代码]#

基类:GeneratorBase

async async_chat(prompts, generation_config=GenerationConfig(do_sample=True, sample_num=1, temperature=1.0, max_new_tokens=512, top_p=0.9, top_k=50, eos_token_id=None, stop_str=[]))[源代码]#

The async version of chat.

async async_generate(prefixes, images, generation_config=GenerationConfig(do_sample=True, sample_num=1, temperature=1.0, max_new_tokens=512, top_p=0.9, top_k=50, eos_token_id=None, stop_str=[]))[源代码]#

The async version of generate.

chat(prompts, generation_config=GenerationConfig(do_sample=True, sample_num=1, temperature=1.0, max_new_tokens=512, top_p=0.9, top_k=50, eos_token_id=None, stop_str=[]))[源代码]#

chat with the model using model templates.

参数:
返回:

A batch of chat responses.

返回类型:

list[list[str]]

generate(prefixes, images, generation_config=GenerationConfig(do_sample=True, sample_num=1, temperature=1.0, max_new_tokens=512, top_p=0.9, top_k=50, eos_token_id=None, stop_str=[]))[源代码]#

generate text with the model using the given prefixes.

参数:
  • prefixes (list[str]) -- A batch of prefixes.

  • images (list[Image]) -- A batch of images.

  • generation_config (GenerationConfig) -- GenerationConfig. Defaults to GenerationConfig().

返回:

A batch of generated text.

返回类型:

list[list[str]]

class flexrag.models.HFVLMGeneratorConfig(model_path=None, tokenizer_path=None, trust_remote_code=False, device_id=<factory>, load_dtype='auto', pipeline_parallel=False)[源代码]#

基类:HFModelConfig

Configuration for HFVLMGenerator.

参数:

pipeline_parallel (bool) -- Whether to use pipeline parallel. Default is False.

dump(path)#

Dump the dataclass to a YAML file.

dumps()#

Dump the dataclass to a YAML string.

classmethod load(path)#

Load the dataclass from a YAML file.

classmethod loads(s)#

Load the dataclass from a YAML string.

class flexrag.models.HFVLMGenerator(cfg)[源代码]#

基类:VLMGeneratorBase