Simple self-RAG short form pack

This LlamaPack implements, in short form, the Self-RAG paper by Akari Asai et al.

The paper presents a framework called Self-Reflective Retrieval-Augmented Generation (SELF-RAG), which aims to improve the quality and factuality of large language model (LLM) output by combining retrieval with self-reflection.

The implementation is adapted from the authors' implementation. A full notebook guide can be found here.
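
To make the combination of retrieval and self-reflection more concrete, here is a toy sketch of the selection step: one candidate answer is produced per retrieved passage, each candidate carries self-assessed scores, and the best-scoring one is returned. This is not the pack's code. The `Candidate` class, the `pick_best` function, the score fields and the weights are all illustrative assumptions, and the scores are hand-assigned placeholders standing in for whatever the model's self-reflection would produce.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """One answer generated from one retrieved passage (hypothetical structure)."""

    passage: str
    answer: str
    relevance: float  # is the passage relevant to the question? (0..1)
    support: float    # does the passage support the answer? (0..1)
    utility: float    # is the answer useful overall? (0..1)

    def score(self, w_rel: float = 1.0, w_sup: float = 1.0, w_use: float = 0.5) -> float:
        # Weighted sum of the three self-assessment scores.
        # The weights are illustrative, not taken from the pack.
        return w_rel * self.relevance + w_sup * self.support + w_use * self.utility


def pick_best(candidates: List[Candidate]) -> Candidate:
    """Return the candidate with the highest combined score."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    return max(candidates, key=lambda c: c.score())


if __name__ == "__main__":
    candidates = [
        Candidate("passage one", "answer A", relevance=0.9, support=0.2, utility=0.5),
        Candidate("passage two", "answer B", relevance=0.8, support=0.9, utility=0.6),
    ]
    print(pick_best(candidates).answer)
```

The second candidate wins here because its passage supports the answer much more strongly, even though the first passage is rated slightly more relevant.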

CLI Usage

You can download LlamaPacks directly using llamaindex-cli, which is installed with the llama-index Python package:

llamaindex-cli download-llamapack SelfRAGPack --download-dir ./self_rag_pack

You can then inspect the files at ./self_rag_pack and use them as a template for your own project!

Code Usage

The snippets below show how to import the query engine from these files. The implementation runs the model through llama-cpp. To download the relevant model, run the following (be sure to replace DIR_PATH):

pip3 install -q huggingface-hub
huggingface-cli download m4r1/selfrag_llama2_7b-GGUF selfrag_llama2_7b.q4_k_m.gguf --local-dir "<DIR_PATH>" --local-dir-use-symlinks False

from llama_index.llama_pack import download_llama_pack

# download and install dependencies
SelfRAGPack = download_llama_pack("SelfRAGPack", "./self_rag_pack")
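
The usage snippets on this page pass a `model_path` that is never defined here. A minimal sketch, assuming it should point at the GGUF file fetched by the huggingface-cli command above; `resolve_model_path` is a hypothetical helper, not part of the pack, and the directory you pass in should be whatever you used for DIR_PATH.

```python
from pathlib import Path

# File name taken from the huggingface-cli download command above.
MODEL_FILENAME = "selfrag_llama2_7b.q4_k_m.gguf"


def resolve_model_path(dir_path: str, filename: str = MODEL_FILENAME) -> str:
    """Build the path to the downloaded model and fail early if it is missing.

    Hypothetical helper for illustration only.
    """
    path = Path(dir_path).expanduser() / filename
    if not path.is_file():
        raise FileNotFoundError(
            f"Model file not found at {path}; run the download command first."
        )
    return str(path)
```

Calling `model_path = resolve_model_path("<DIR_PATH>")` before constructing the query engine surfaces a wrong directory immediately, rather than when the model is first loaded.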

From here, you can use the pack. You can import the relevant modules from the download folder (in the example below we assume it's a relative import or the directory has been added to your system path).

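from self_rag_pack.base import SelfRAGQueryEngine

# model_path: path to the .gguf file downloaded into <DIR_PATH> above
# retriever: a llama-index retriever you have already built over your documents
query_engine = SelfRAGQueryEngine(
    model_path=model_path, retriever=retriever, verbose=True
)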

response = query_engine.query(
    "Who won best Director in the 1972 Academy Awards?"
)
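
The query engine above also expects a `retriever`, which this page does not construct. One possible way to build one is sketched below, assuming the same legacy llama-index release that provides `llama_index.llama_pack`, where `Document` and `VectorStoreIndex` are importable from the top-level package. `build_retriever` and its parameters are hypothetical names. Building the index computes embeddings, which with default settings may call a hosted embedding model and therefore need credentials.

```python
from typing import List


def build_retriever(texts: List[str], top_k: int = 10):
    """Index raw text snippets and return a retriever over them.

    Hypothetical helper for illustration; not part of the pack.
    """
    if not texts:
        raise ValueError("texts must contain at least one document")

    # Imported lazily so this module can be loaded without llama-index installed.
    from llama_index import Document, VectorStoreIndex

    documents = [Document(text=t) for t in texts]
    index = VectorStoreIndex.from_documents(documents)
    # similarity_top_k controls how many passages are returned per query.
    return index.as_retriever(similarity_top_k=top_k)
```

The returned object can then be passed as `retriever=` when constructing the query engine.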

You can also use/initialize the pack directly.

from self_rag_pack.base import SelfRAGPack

pack = SelfRAGPack(
    model_path=model_path, retriever=retriever, verbose=True
)

The run() function is a light wrapper around the query engine's query() method.

response = pack.run("Who won best Director in the 1972 Academy Awards?")