examples/eval-rag-full/README.md
You can run this example with:
npx promptfoo@latest init --example eval-rag-full
cd eval-rag-full
This RAG example lets you ask questions about the SEC filings of several public companies. It uses LangChain, but the flow is representative of any RAG solution.
There are 3 parts:
ingest.py: Chunks PDFs and loads them into a vector database. The PDFs are pulled from a public Google Cloud bucket.
retrieve.py: Promptfoo-compatible provider that answers RAG questions using the database.
promptfooconfig.yaml: Test inputs and requirements.
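To make the chunking step concrete, here is a minimal, dependency-free sketch of splitting text into overlapping chunks. It is not the code in ingest.py, which uses LangChain; the function name chunk_text and the chunk_size and overlap defaults are assumptions chosen for illustration.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
    """Split text into fixed-size character chunks that overlap.

    Hypothetical helper for illustration only; ingest.py relies on
    LangChain's own splitters and may use different sizes.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=2):
        print(piece)
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of storing some text twice.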
To get started:
Set the OPENAI_API_KEY environment variable.
Create a Python virtual environment: python3 -m venv venv
Enter the environment: source venv/bin/activate
Install Python dependencies: pip install -r requirements.txt
Run ingest.py to create the vector database: python ingest.py
Now we're ready to go.
Edit promptfooconfig.yaml to your liking to configure the questions you'd like to ask in your tests. Edit retrieve.py to control how context is loaded and questions are answered. Then run:
npx promptfoo@latest eval
Promptfoo is a Node.js CLI, but the file://retrieve.py provider runs inside Python. Keep the virtual environment active when running the eval, or set PROMPTFOO_PYTHON=./venv/bin/python so Promptfoo can import the packages from requirements.txt.
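For orientation, a Promptfoo Python provider is a module that exposes a call_api(prompt, options, context) function and returns a dictionary with an output key. The sketch below shows that shape only. It is not the contents of retrieve.py: the in-memory DOCUMENTS list, the word-overlap retrieve helper, and the topK config key are all hypothetical stand-ins for the vector database and LLM call, and the placeholder chunks are made up.

```python
from typing import Any, Dict, List

# Made-up placeholder chunks; the real example queries the vector
# database that ingest.py builds from SEC filings.
DOCUMENTS: List[str] = [
    "Placeholder chunk: revenue grew in the fiscal year.",
    "Placeholder chunk: the company opened new offices.",
    "Placeholder chunk: risk factors include supply chain delays.",
]


def retrieve(question: str, k: int = 2) -> List[str]:
    """Rank chunks by naive word overlap (a stand-in for vector search)."""
    q_words = set(question.lower().split())
    ranked = sorted(
        DOCUMENTS,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def call_api(prompt: str, options: Dict[str, Any], context: Dict[str, Any]) -> Dict[str, Any]:
    """Entry point that Promptfoo calls for each test case."""
    config = (options or {}).get("config") or {}
    k = int(config.get("topK", 2))  # "topK" is a hypothetical config key
    chunks = retrieve(prompt, k)
    # A real provider would pass the question and chunks to an LLM here;
    # this sketch just echoes the retrieved context.
    answer = "Context used:\n" + "\n".join(chunks)
    return {"output": answer}


if __name__ == "__main__":
    print(call_api("What was revenue growth?", {"config": {"topK": 1}}, {"vars": {}})["output"])
```

Because the function is plain Python, it can be run directly with the virtual environment's interpreter to confirm that imports resolve before starting an eval.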
Afterwards, you can view the results by running:
npx promptfoo@latest view
See promptfooconfig.with-asserts.yaml for a more complete example that compares the performance of two RAG configurations. The smaller retrieval configuration is intentionally expected to miss a couple of details so the comparison view demonstrates failures as well as passes.