Feast provides robust support for Generative AI applications, enabling teams to build, deploy, and manage feature infrastructure for Large Language Models (LLMs) and other Generative AI (GenAI) applications. With Feast's vector database integrations and feature management capabilities, teams can implement production-ready Retrieval Augmented Generation (RAG) systems and other GenAI applications with the same reliability and operational excellence as traditional ML systems.
Feast integrates with popular vector databases to store and retrieve embedding vectors efficiently, exposing vector similarity search through the `retrieve_online_documents_v2` method. These integrations let you manage document embeddings like any other feature: write them to the online store, version them alongside your feature definitions, and query them at serving time.
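As a quick illustration, a similarity query against a vector-enabled online store might look like the following sketch (the repo path, feature view, field names, and embedding model are placeholders, not Feast requirements):

```python
from feast import FeatureStore
from sentence_transformers import SentenceTransformer

store = FeatureStore(repo_path="feature_repo/")

# Embed the query with the same model used at ingestion time
# (sentence-transformers here is an assumption; any embedding
# function that matches your index works).
model = SentenceTransformer("all-MiniLM-L6-v2")
query_vector = model.encode("How do I configure the online store?").tolist()

# Retrieve the top-k most similar documents; the feature view and
# field names are illustrative.
results = store.retrieve_online_documents_v2(
    features=[
        "text_feature_view:text",
        "text_feature_view:text_embedding",
    ],
    query=query_vector,
    top_k=3,
).to_df()
```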
Feast simplifies building RAG applications by combining its vector database integrations with its feature management capabilities behind a single API. The typical RAG workflow with Feast involves four stages:
```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Document   │     │  Document   │     │    Feast    │     │     LLM     │
│ Processing  │────▶│  Embedding  │────▶│   Feature   │────▶│   Context   │
│             │     │             │     │    Store    │     │ Generation  │
└─────────────┘     └─────────────┘     └─────────────┘     └─────────────┘
```
Feast provides powerful capabilities for transforming unstructured data (like PDFs, text documents, and images) into structured embeddings for RAG applications. The `@on_demand_feature_view` decorator lets you transform raw documents into embeddings in real time; the workflow typically involves chunking the documents, generating embeddings, and writing the results to the online store, as in the sketch below.
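A minimal on-demand transformation might look like this (the request source, field names, and embedding model are illustrative choices, not part of the Feast API):

```python
import pandas as pd
from feast import Field, RequestSource
from feast.on_demand_feature_view import on_demand_feature_view
from feast.types import Array, Float32, String

# A request source carrying raw document text at request time;
# the names here are placeholders.
document_request = RequestSource(
    name="document_request",
    schema=[Field(name="document_text", dtype=String)],
)

@on_demand_feature_view(
    sources=[document_request],
    schema=[Field(name="document_embedding", dtype=Array(Float32))],
)
def document_embeddings(inputs: pd.DataFrame) -> pd.DataFrame:
    # Embed each incoming document; sentence-transformers is one
    # possible embedding backend, not a Feast requirement.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")
    out = pd.DataFrame()
    out["document_embedding"] = model.encode(
        inputs["document_text"].tolist()
    ).tolist()
    return out
```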
The DocEmbedder class provides an end-to-end pipeline for ingesting documents into Feast's online vector store. It handles chunking, embedding generation, and writing results -- all in a single step.
The pipeline is built from four pluggable pieces:

- `DocEmbedder`: High-level orchestrator that runs the full pipeline: chunk → embed → schema transform → write to online store
- `BaseChunker` / `TextChunker`: Pluggable chunking layer. `TextChunker` splits text by word count with configurable `chunk_size`, `chunk_overlap`, `min_chunk_size`, and `max_chunk_chars`
- `BaseEmbedder` / `MultiModalEmbedder`: Pluggable embedding layer with modality routing. `MultiModalEmbedder` supports text (via sentence-transformers) and image (via CLIP) with lazy model loading
- `SchemaTransformFn`: A user-defined function that transforms the chunked + embedded DataFrame into the format expected by the FeatureView schema

```python
from feast import DocEmbedder
import pandas as pd

# Prepare your documents
df = pd.DataFrame({
    "id": ["doc1", "doc2"],
    "text": ["First document content...", "Second document content..."],
})

# Create DocEmbedder -- automatically generates a FeatureView and applies the repo
embedder = DocEmbedder(
    repo_path="feature_repo/",
    feature_view_name="text_feature_view",
)

# Embed and ingest documents in one step
result = embedder.embed_documents(
    documents=df,
    id_column="id",
    source_column="text",
    column_mapping=("text", "text_embedding"),
)
```
Because `DocEmbedder` generates and applies the FeatureView for you, no separate `feast apply` step is needed. Two extension points let you customize the pipeline:

- Supply a `SchemaTransformFn` to control how chunked + embedded data maps to your FeatureView schema
- Subclass `BaseChunker` or `BaseEmbedder` to plug in your own chunking or embedding strategies

For a complete walkthrough, see the DocEmbedder tutorial notebook.
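As a quick sketch of configuring the chunking layer (the import path and the `chunker` keyword argument are assumptions; the `TextChunker` parameters are the ones listed above):

```python
from feast import DocEmbedder
from feast.rag import TextChunker  # assumed import path

# Configure word-count-based chunking
chunker = TextChunker(
    chunk_size=200,        # target words per chunk
    chunk_overlap=20,      # words shared between adjacent chunks
    min_chunk_size=50,     # minimum words per emitted chunk
    max_chunk_chars=2000,  # hard cap on chunk length in characters
)

embedder = DocEmbedder(
    repo_path="feature_repo/",
    feature_view_name="text_feature_view",
    chunker=chunker,  # assumed keyword for the pluggable chunking layer
)
```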
More broadly, Feast transformations can prepare data at both ingestion time and retrieval time, for example converting raw documents into embeddings as shown above. Common GenAI patterns built on these pieces include:

- Document Q&A systems that retrieve relevant passages and supply them to an LLM as context (see the sketch after this list)
- Knowledge enhancement, grounding an LLM's responses in fresh, governed feature data
- Semantic search over embedded documents via `retrieve_online_documents_v2`
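Continuing the retrieval sketch from earlier, turning retrieved rows into LLM context is a few lines of plain Python (the prompt template and column names are illustrative):

```python
# `results` is the DataFrame returned by retrieve_online_documents_v2
# in the earlier sketch; the "text" column name is illustrative.
context = "\n\n".join(results["text"].tolist())

prompt = (
    "Answer the question using only the context below.\n\n"
    f"Context:\n{context}\n\n"
    "Question: How do I configure the online store?"
)
# `prompt` can now be sent to any LLM client.
```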
Feast can serve as both the context provider and the persistent memory layer for AI agents. Unlike stateless RAG pipelines, agents make autonomous decisions about which tools to call and can write state back to the feature store with `write_to_online_store`, as sketched below. With MCP enabled, agents built with any framework (LangChain, LlamaIndex, CrewAI, AutoGen, or custom) can discover and call Feast tools dynamically. See the Feast-Powered AI Agent example and the blog post Building AI Agents with Feast for a complete walkthrough.
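A sketch of the write-back path (the `agent_memory` feature view and its columns are placeholders):

```python
from datetime import datetime, timezone

import pandas as pd
from feast import FeatureStore

store = FeatureStore(repo_path="feature_repo/")

# Persist a piece of agent memory as a feature row
memory = pd.DataFrame(
    {
        "agent_id": ["agent-1"],
        "note": ["User prefers concise answers."],
        "event_timestamp": [datetime.now(timezone.utc)],
    }
)
store.write_to_online_store(feature_view_name="agent_memory", df=memory)
```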
Feast integrates with Apache Spark to enable large-scale processing of unstructured data for GenAI applications, such as chunking and embedding millions of documents in parallel before registering the results as a batch source.
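One way this can look (a sketch: the paths, model, and PySpark job are assumptions layered on top of Feast, not a Feast API):

```python
import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql.functions import pandas_udf
from pyspark.sql.types import ArrayType, FloatType

spark = SparkSession.builder.appName("doc-embeddings").getOrCreate()
docs = spark.read.parquet("s3://bucket/raw_documents/")  # placeholder path

@pandas_udf(ArrayType(FloatType()))
def embed(texts: pd.Series) -> pd.Series:
    # Each executor embeds its partition; sentence-transformers is
    # an assumption, not a requirement.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")
    return pd.Series(model.encode(texts.tolist()).tolist())

embedded = docs.withColumn("embedding", embed(docs["text"]))
# Write out for Feast to pick up as a batch source (e.g. a SparkSource).
embedded.write.mode("overwrite").parquet("s3://bucket/document_embeddings/")
```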
Feast also integrates with Ray to enable distributed processing for RAG applications, such as scaling embedding generation across a cluster with Ray Data.
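A comparable sketch with Ray Data (again, paths and the embedding model are placeholders, and this is standard Ray rather than a Feast API):

```python
import ray

ray.init()

docs = ray.data.read_parquet("s3://bucket/raw_documents/")  # placeholder path

class Embedder:
    def __init__(self):
        # Load the model once per worker
        from sentence_transformers import SentenceTransformer

        self.model = SentenceTransformer("all-MiniLM-L6-v2")

    def __call__(self, batch):
        batch["embedding"] = self.model.encode(list(batch["text"])).tolist()
        return batch

# Run the embedding step in parallel across the cluster
embedded = docs.map_batches(Embedder, concurrency=4, batch_format="pandas")
embedded.write_parquet("s3://bucket/document_embeddings/")
```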
For detailed information on building distributed RAG applications with Feast and Ray, see Feast + Ray: Distributed Processing for RAG Applications.
Feast supports the Model Context Protocol (MCP), which enables AI agents and applications to interact with your feature store through standardized MCP interfaces. This allows seamless integration with LLMs and AI agents for GenAI applications.
Install MCP support:

```bash
pip install feast[mcp]
```
Configure your feature store to use MCP in `feature_store.yaml`:

```yaml
feature_server:
  type: mcp
  enabled: true
  mcp_enabled: true
  mcp_transport: http
  mcp_server_name: "feast-feature-store"
  mcp_server_version: "1.0.0"
```
By default, Feast uses the SSE-based MCP transport (`mcp_transport: sse`); streamable HTTP (`mcp_transport: http`, shown above) is recommended for improved compatibility with some MCP clients. Once configured, the MCP tools are served by the regular Feast feature server, started with `feast serve`.
The MCP integration uses the `fastapi_mcp` library to automatically transform your Feast feature server's FastAPI endpoints into MCP-compatible tools. With MCP support enabled, AI assistants can:
- Call `/get-online-features` to retrieve features from the feature store
- Call `/retrieve-online-documents` to perform vector similarity search
- Call `/write-to-online-store` to persist agent state (memory, notes, interaction history)
- Call `/health` to check server status

For a basic MCP example, see the MCP Feature Store Example. For a full agent with persistent memory, see the Feast-Powered AI Agent Example.
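Because these MCP tools wrap ordinary feature-server HTTP endpoints, you can exercise the same calls directly; a sketch with illustrative feature and entity names (6566 is the feature server's default port):

```python
import requests

resp = requests.post(
    "http://localhost:6566/get-online-features",
    json={
        "features": ["text_feature_view:text"],
        "entities": {"id": ["doc1"]},
    },
)
print(resp.json())
```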
For more detailed information and examples, see the tutorials, example repositories, and blog posts linked throughout this page.