Utils#

This module contains utility classes and functions used throughout the codebase.

Cache#

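class flexrag.utils.persistent_cache.FIFOPersistentCache(maxsize=None, cache_path=None)[source]#

The FIFOPersistentCache evicts the earliest inserted item from the cache when the cache is full.

popitem()[source]#

Implements popitem by evicting the earliest inserted item.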

class flexrag.utils.persistent_cache.LFUPersistentCache(maxsize=None, cache_path=None)[source]#

The LFUPersistentCache evicts the least frequently used item from the cache when the cache is full.

This implementation employs a Counter to keep track of the frequency of access. However, the frequency will not be persisted to disk. Thus, the frequency will be reset when the cache is loaded from disk.
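The eviction rule described above can be sketched in memory as follows. TinyLFU and its attributes are hypothetical names used only for illustration; this is not FlexRAG's implementation and it does not persist anything to disk.

```python
from collections import Counter


class TinyLFU:
    """Hypothetical in-memory stand-in, not flexrag's LFUPersistentCache."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self.data = {}
        self.hits = Counter()  # access counts live in memory only

    def __getitem__(self, key):
        value = self.data[key]
        self.hits[key] += 1
        return value

    def __setitem__(self, key, value):
        # Make room only when a new key would exceed maxsize.
        if key not in self.data and len(self.data) >= self.maxsize:
            self.popitem()
        self.data[key] = value
        self.hits[key] += 1

    def popitem(self):
        # Evict the stored key with the smallest access count.
        key = min(self.data, key=lambda k: self.hits[k])
        del self.hits[key]
        return key, self.data.pop(key)


cache = TinyLFU(maxsize=2)
cache["a"] = 1
cache["b"] = 2
cache["a"]        # "a" has now been accessed more often than "b"
cache["c"] = 3    # the cache is full, so "b" is evicted
remaining = sorted(cache.data)
```

Because the counts are held in a plain Counter, rebuilding the object from stored data alone would start every key at zero again, which mirrors the reset behaviour noted above.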

popitem()[source]#

Implements popitem by evicting the least frequently used item.

class flexrag.utils.persistent_cache.LRUPersistentCache(maxsize=None, cache_path=None)[source]#

The LRUPersistentCache evicts the least recently used item from the cache when the cache is full.

This implementation employs an OrderedDict to keep track of the order of access. However, the order will not be persisted to disk. Thus, the order will be reset when the cache is loaded from disk.
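A minimal in-memory sketch of the same idea, assuming nothing about FlexRAG's internals: TinyLRU is a hypothetical class that uses OrderedDict.move_to_end to record access order and popitem(last=False) to evict.

```python
from collections import OrderedDict


class TinyLRU:
    """Hypothetical in-memory stand-in, not flexrag's LRUPersistentCache."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self.data = OrderedDict()  # oldest access first, newest last

    def __getitem__(self, key):
        self.data.move_to_end(key)  # raises KeyError if the key is absent
        return self.data[key]

    def __setitem__(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        elif len(self.data) >= self.maxsize:
            self.popitem()
        self.data[key] = value

    def popitem(self):
        # The first entry is the least recently used one.
        return self.data.popitem(last=False)


cache = TinyLRU(maxsize=2)
cache["a"] = 1
cache["b"] = 2
cache["a"]        # touching "a" makes "b" the least recently used key
cache["c"] = 3    # the cache is full, so "b" is evicted
remaining = list(cache.data)
```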

popitem()[source]#

Implements popitem by evicting the least recently used item.

class flexrag.utils.persistent_cache.PersistentCacheBase(maxsize=None, cache_path=None)[source]#

The base class for PersistentCache.

The PersistentCache is a cache that can be persisted to disk and provides a simple, dictionary-like interface. Subclasses should implement the popitem method, which decides which item to evict when the cache is full.

cache(func)[source]#

Decorator to cache the result of a function. The arguments of the function should be hashable.

For example:

from flexrag.utils import LRUPersistentCache

cache = LRUPersistentCache()

@cache.cache
def expensive_function(x):
    # Some expensive computation
    return x * 2
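
To show why the arguments need to be hashable, here is a sketch of how such a decorator could work. cache_with is a hypothetical helper written against a plain dict; it is not the code behind PersistentCacheBase.cache, and the key layout is an assumption.

```python
import functools


def cache_with(store):
    """Hypothetical helper: memoize a function in any dict-like store."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # The arguments become part of a dictionary key,
            # so every one of them must be hashable.
            key = (func.__name__, args, tuple(sorted(kwargs.items())))
            if key not in store:
                store[key] = func(*args, **kwargs)
            return store[key]

        return wrapper

    return decorator


store = {}
calls = []


@cache_with(store)
def expensive_function(x):
    calls.append(x)  # records how often the body really runs
    return x * 2


first = expensive_function(21)
second = expensive_function(21)  # served from the store, body not re-run
```

Passing an unhashable argument such as a list to the wrapped function raises a TypeError when the key is looked up.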

abstract popitem()[source]#

This method should be implemented by subclasses.

reduce_size(size=None)[source]#

Reduce the size of the cache to the specified size.

Parameters:

size (int) – The size to reduce to. If None, self.maxsize is used.

Return type:

None
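The relationship between reduce_size and the abstract popitem can be sketched as below. TinyCacheBase and TinyFIFO are hypothetical in-memory classes, and the loop is an assumption about how the two methods could cooperate, not a copy of the library code.

```python
from abc import ABC, abstractmethod


class TinyCacheBase(ABC):
    """Hypothetical stand-in for a cache base class."""

    def __init__(self, maxsize=None):
        self.maxsize = maxsize
        self.data = {}

    @abstractmethod
    def popitem(self):
        """Subclasses decide which item to evict."""

    def reduce_size(self, size=None):
        target = self.maxsize if size is None else size
        if target is None:
            return  # no limit configured, nothing to do
        while len(self.data) > target:
            self.popitem()


class TinyFIFO(TinyCacheBase):
    def popitem(self):
        key = next(iter(self.data))  # dicts preserve insertion order
        return key, self.data.pop(key)


cache = TinyFIFO(maxsize=3)
cache.data.update({"a": 1, "b": 2, "c": 3, "d": 4, "e": 5})
cache.reduce_size()          # falls back to maxsize
after_default = list(cache.data)
cache.reduce_size(1)         # explicit target size
after_explicit = list(cache.data)
```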

class flexrag.utils.persistent_cache.RandomPersistentCache(maxsize=None, cache_path=None)[source]#

The RandomPersistentCache evicts a random item from the cache when the cache is full.

In this implementation, the eviction order is determined by the __iter__ method of the backend.

popitem()[source]#

Implements popitem by evicting an item chosen according to the iteration order of the backend.

Other Utils#

class flexrag.utils.Register(register_name=None, allow_load_from_repo=False)[source]#
get(key, default=None)[source]#

Get the item dict by name.

Parameters:
  • key (str) – The name of the item.

  • default (Any) – The default value to return, defaults to None.

Returns:

The item dict containing the item, main_name, short_names, and config_class.

Return type:

dict

get_item(key)[source]#

Get the item by name.

Parameters:

key (str) – The name of the item.

Returns:

The item.

Return type:

Any
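The difference between get and get_item can be illustrated with a small stand-in. MiniRegister and its add method are hypothetical; only the keys of the returned dict (item, main_name, short_names, config_class) are taken from the description above.

```python
class MiniRegister:
    """Hypothetical stand-in for illustration; not flexrag.utils.Register."""

    def __init__(self):
        self._items = {}  # main name -> item dict
        self._alias = {}  # main or short name -> main name

    def add(self, item, main_name, *short_names, config_class=None):
        self._items[main_name] = {
            "item": item,
            "main_name": main_name,
            "short_names": list(short_names),
            "config_class": config_class,
        }
        for name in (main_name, *short_names):
            self._alias[name] = main_name

    def get(self, key, default=None):
        # Returns the whole item dict, or the default for unknown names.
        main = self._alias.get(key)
        return default if main is None else self._items[main]

    def get_item(self, key):
        # Returns only the registered object itself.
        return self._items[self._alias[key]]["item"]


reg = MiniRegister()
reg.add(dict, "dictionary", "dict")
entry = reg.get("dict")
item = reg.get_item("dictionary")
missing = reg.get("unknown")
```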

load(config, **kwargs)[source]#

Load the item(s) from the generated config.

Parameters:
  • config (DictConfig) – The config generated by make_config method.

  • kwargs (Any) – The additional arguments to pass to the item(s).

Raises:

ValueError – If the item type is invalid.

Returns:

The loaded item(s).

Return type:

RegistedType | list[RegistedType]

property mainnames#

Get the main names of the registered items.

make_config(allow_multiple=False, default=None, config_name=None)[source]#

Make a config class for the registered items.

Parameters:
  • allow_multiple (bool, optional) – Whether to allow multiple items to be selected, defaults to False.

  • default (Optional[str], optional) – The default item to select, defaults to None.

  • config_name (str, optional) – The name of the config class, defaults to None.

Returns:

The config class.

Return type:

dataclass
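One way a config dataclass could be assembled from registered items is sketched below with dataclasses.make_dataclass. The function make_config_sketch, the choice field, and the "<name>_config" field naming are all assumptions made for illustration; the layout of the class returned by Register.make_config may differ.

```python
from dataclasses import dataclass, field, fields, make_dataclass
from typing import Optional


@dataclass
class AlphaConfig:
    size: int = 1


@dataclass
class BetaConfig:
    rate: float = 0.5


def make_config_sketch(registered, default=None, config_name="SketchConfig"):
    """Hypothetical: build one selector field plus one sub-config per item."""
    specs = [("choice", Optional[str], field(default=default))]
    for name, config_class in registered.items():
        specs.append(
            (f"{name}_config", config_class, field(default_factory=config_class))
        )
    return make_dataclass(config_name, specs)


SketchConfig = make_config_sketch(
    {"alpha": AlphaConfig, "beta": BetaConfig}, default="alpha"
)
cfg = SketchConfig()
field_names = [f.name for f in fields(cfg)]
```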

property names#

Get the names of the registered items.

property shortnames#

Get the short names of the registered items.

squeeze(config_instance)[source]#

Convert the unused fields to None.
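The description of squeeze is brief, so the following is only one plausible reading: in a config that selects one item, the sub-configs of the items that were not selected are set to None. SketchConfig, its field names, and squeeze_sketch are hypothetical.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class AlphaConfig:
    size: int = 1


@dataclass
class BetaConfig:
    rate: float = 0.5


@dataclass
class SketchConfig:
    # Hypothetical layout: one selector plus one sub-config per item.
    choice: str = "alpha"
    alpha_config: Optional[AlphaConfig] = field(default_factory=AlphaConfig)
    beta_config: Optional[BetaConfig] = field(default_factory=BetaConfig)


def squeeze_sketch(config):
    """Set every sub-config that `choice` does not select to None."""
    keep = f"{config.choice}_config"
    for f in fields(config):
        if f.name.endswith("_config") and f.name != keep:
            setattr(config, f.name, None)
    return config


squeezed = squeeze_sketch(SketchConfig(choice="beta"))
```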

Common Dataclass#

This module provides several pre-defined dataclasses that are commonly used in the project.

class flexrag.utils.dataclasses.Context(context_id=None, data=<factory>, source=None, meta_data=<factory>)[source]#

The dataclass for retrieved context.

Parameters:
  • context_id (Optional[str]) – The unique identifier of the context. Default: None.

  • data (dict) – The context data. Default: {}.

  • source (Optional[str]) – The source of the retrieved data. Default: None.

  • meta_data (dict) – The metadata of the context. Default: {}.
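The <factory> defaults in the signature indicate that data and meta_data are created per instance. A stand-in dataclass with the same four fields shows the effect; ContextSketch is a hypothetical name and omits the YAML helpers.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ContextSketch:
    """Hypothetical mirror of the documented fields, for illustration only."""

    context_id: Optional[str] = None
    data: dict = field(default_factory=dict)       # new dict per instance
    source: Optional[str] = None
    meta_data: dict = field(default_factory=dict)  # new dict per instance


a = ContextSketch(context_id="doc-1")
b = ContextSketch(context_id="doc-2")
a.data["text"] = "hello"  # does not leak into b.data
```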

dump(path)#

Dump the dataclass to a YAML file.

dumps()#

Dump the dataclass to a YAML string.

classmethod load(path)#

Load the dataclass from a YAML file.

classmethod loads(s)#

Load the dataclass from a YAML string.

class flexrag.utils.dataclasses.RetrievedContext(context_id=None, data=<factory>, source=None, meta_data=<factory>, retriever='', query='', score=0.0)[source]#

The dataclass for retrieved context.

Parameters:
  • retriever (str) – The name of the retriever. Required.

  • query (str) – The query for retrieval. Required.

  • score (float) – The relevance score of the retrieved data. Default: 0.0.
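The signature suggests that the retrieval fields are added on top of the Context fields. The sketch below models that with two hypothetical stand-in dataclasses and ranks a few results by score; the retriever names and scores are made-up sample values.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ContextSketch:
    """Hypothetical mirror of the Context fields."""

    context_id: Optional[str] = None
    data: dict = field(default_factory=dict)
    source: Optional[str] = None
    meta_data: dict = field(default_factory=dict)


@dataclass
class RetrievedSketch(ContextSketch):
    """Hypothetical mirror of the RetrievedContext fields."""

    retriever: str = ""
    query: str = ""
    score: float = 0.0


hits = [
    RetrievedSketch(context_id="a", retriever="demo", query="q", score=0.2),
    RetrievedSketch(context_id="b", retriever="demo", query="q", score=0.9),
    RetrievedSketch(context_id="c", retriever="demo", query="q"),
]
# Highest relevance score first.
ranked = sorted(hits, key=lambda h: h.score, reverse=True)
top_ids = [h.context_id for h in ranked]
```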

dump(path)#

Dump the dataclass to a YAML file.

dumps()#

Dump the dataclass to a YAML string.

classmethod load(path)#

Load the dataclass from a YAML file.

classmethod loads(s)#

Load the dataclass from a YAML string.