examples/rag/README.md
This project demonstrates how to use Feast to power a Retrieval-Augmented Generation (RAG) application. The RAG architecture combines retrieval of documents (using vector search) with In-Context-Learning (ICL) through a Large Language Model (LLM) to answer user questions accurately using structured and unstructured data.
- data/: Contains the demo data, including Wikipedia summaries of cities with sentence embeddings stored in a Parquet file.
- example_repo.py: Defines the feature views and entity configurations for Feast.
- feature_store.yaml: Configures the offline and online stores (using local files and Milvus Lite in this demo).
- test_workflow.py: Demonstrates key Feast commands to define, retrieve, and push features.

Install the necessary packages:
pip install feast torch transformers openai
Initialize and inspect the feature store:
feast apply
Materialize features into the online store:
store.write_to_online_store(feature_view_name='city_embeddings', df=df)
Run a query:
question = "Which city has the largest population in New York?"
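# `query` is assumed to be the embedding vector of `question`, produced with
# the same sentence-embedding model that generated the stored vectors.
context_data = store.retrieve_online_documents_v2(
    features=[
        "city_embeddings:vector",
        "city_embeddings:item_id",
        "city_embeddings:state",
        "city_embeddings:sentence_chunks",
        "city_embeddings:wiki_summary",
    ],
    query=query,
    top_k=3,
    distance_metric='COSINE',
).to_df()
# display() is available in notebooks; use print(context_data) in a plain script.
display(context_data)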
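To make `top_k=3` and `distance_metric='COSINE'` concrete, here is a minimal pure-Python sketch of cosine-similarity ranking. It is not how Feast or Milvus implement the search. The function names and the toy three-dimensional vectors are hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k_by_cosine(query, documents, k=3):
    """Return the k (item_id, score) pairs most similar to the query, best first."""
    scored = [
        (item_id, cosine_similarity(query, vector))
        for item_id, vector in documents.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors (hypothetical values, not taken from the demo data).
toy_documents = {
    "a": [1.0, 0.0, 0.0],
    "b": [0.9, 0.1, 0.0],
    "c": [0.0, 1.0, 0.0],
    "d": [0.0, 0.0, 1.0],
}
toy_query = [1.0, 0.05, 0.0]

results = top_k_by_cosine(toy_query, toy_documents, k=3)
for item_id, score in results:
    print(item_id, round(score, 4))
```

A real vector store typically uses an index rather than scoring every stored vector, but the ranking criterion is the same idea: the stored embeddings whose direction is closest to the query embedding are returned first.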
Example Output
When querying: Which city has the largest population in New York?
The model provides:
The largest city in New York is New York City, often referred to as NYC. It is the most populous city in the United States, with an estimated population of 8,335,897 in 2022.
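An answer like this comes from passing the retrieved text to the LLM together with the question. The sketch below shows one possible way to assemble such a prompt. The `build_prompt` helper, its wording, and the `max_chars` limit are hypothetical and not part of the demo; the only name taken from the query above is the `sentence_chunks` field, and the sample rows are illustrative stand-ins for retrieved results.

```python
def build_prompt(question, context_rows, max_chars=2000):
    """Join retrieved text chunks into a context block and append the question."""
    context_lines = []
    used = 0
    for row in context_rows:
        chunk = row["sentence_chunks"].strip()
        # Stop adding context once the (hypothetical) character budget is reached.
        if used + len(chunk) > max_chars:
            break
        context_lines.append(f"- {chunk}")
        used += len(chunk)
    context = "\n".join(context_lines)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Illustrative rows standing in for the retrieved documents.
rows = [
    {"sentence_chunks": "New York City is the most populous city in the United States."},
    {"sentence_chunks": "Buffalo is the second-largest city in the state of New York."},
]
prompt = build_prompt("Which city has the largest population in New York?", rows)
print(prompt)
```

With the DataFrame from the retrieval step, the rows could be supplied as `context_data.to_dict("records")`, and the resulting string sent to the LLM client of your choice.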