Feast already provides built-in lineage tracking through its native UI. When you explore your feature store in the Feast UI, you can visualize relationships between data sources, entities, feature views, and feature services—all without any additional configuration.
This native lineage view shows:
While Feast's native lineage is powerful for understanding your feature store, modern ML systems span many tools—data pipelines, training jobs, model registries, and serving infrastructure. OpenLineage is the open standard that connects lineage across all these systems.
We are excited to announce that Feast now supports native integration with OpenLineage, enabling you to:
With this integration, Feast automatically tracks and emits lineage events whenever you apply feature definitions or materialize features—no code changes required. Simply enable OpenLineage in your feature_store.yaml, and Feast handles the rest.
Feature stores manage the lifecycle of ML features, from raw data sources to model inference. As ML systems grow in complexity, teams often struggle to answer fundamental questions:
OpenLineage solves these challenges by providing a standardized way to capture and visualize data lineage. By integrating OpenLineage into Feast, ML teams gain automatic visibility into their feature engineering pipelines without manual instrumentation.
The integration automatically emits OpenLineage events for two key operations:
When you run feast apply, Feast creates a lineage graph that mirrors what you see in the Feast UI:
DataSources ──┐
              ├──→ feast_feature_views_{project} ──→ FeatureViews
Entities ─────┘                                           │
                                                          │
                                                          ▼
                                               feature_service_{name} ──→ FeatureService
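As a rough illustration of the graph above, the sketch below derives the two kinds of jobs and their input and output datasets from plain Python data. The `LineageJob` dataclass, the `build_jobs` helper, and the sample object names are hypothetical and not part of Feast's API; only the job naming patterns are taken from the diagram.

```python
from dataclasses import dataclass, field


@dataclass
class LineageJob:
    """A job node with its input and output dataset names (illustrative only)."""
    name: str
    inputs: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)


def build_jobs(project, sources, entities, feature_views, feature_services):
    """Mirror the diagram: one job for all feature views, one job per feature service.

    feature_services maps a service name to the feature views it is composed of.
    """
    jobs = [
        LineageJob(
            name=f"feast_feature_views_{project}",
            inputs=sorted(sources) + sorted(entities),
            outputs=sorted(feature_views),
        )
    ]
    for service, views in sorted(feature_services.items()):
        jobs.append(
            LineageJob(
                name=f"feature_service_{service}",
                inputs=sorted(views),
                outputs=[service],
            )
        )
    return jobs


jobs = build_jobs(
    project="my_fraud_detection",
    sources=["driver_stats_source"],
    entities=["driver_id"],
    feature_views=["driver_hourly_stats"],
    feature_services={"driver_stats_service": ["driver_hourly_stats"]},
)
for job in jobs:
    print(f"{job.inputs} -> {job.name} -> {job.outputs}")
```

Modeling feature views as both the outputs of one job and the inputs of another is what lets a lineage tool stitch the two job types into a single connected graph.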
This creates two types of jobs:
- feast_feature_views_{project}: Shows how DataSources and Entities flow into FeatureViews
- feature_service_{name}: Shows which FeatureViews compose each FeatureService

When materializing features (feast materialize), Feast emits START, COMPLETE, and FAIL events, allowing you to track:
pip install 'feast[openlineage]'
Add the openlineage section to your feature_store.yaml:
project: my_fraud_detection
registry: data/registry.db
provider: local
online_store:
  type: sqlite
  path: data/online_store.db
openlineage:
  enabled: true
  transport_type: http
  transport_url: http://localhost:5000
  namespace: feast
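To make the transport settings more concrete, here is a minimal sketch of how an HTTP transport configured this way would typically deliver an event. OpenLineage's HTTP transport posts JSON to the `api/v1/lineage` path by default, which is also the path Marquez accepts events on. The `lineage_endpoint` and `post_event` helpers are hypothetical names used for illustration; Feast's own emitter is not shown here and may work differently.

```python
import json
import urllib.request

# Mirrors the openlineage section of feature_store.yaml shown above.
config = {
    "enabled": True,
    "transport_type": "http",
    "transport_url": "http://localhost:5000",
    "namespace": "feast",
}


def lineage_endpoint(transport_url: str) -> str:
    """Join the base URL with the default OpenLineage HTTP endpoint path."""
    return transport_url.rstrip("/") + "/api/v1/lineage"


def post_event(event: dict, transport_url: str, timeout: float = 5.0) -> int:
    """POST one OpenLineage event as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        lineage_endpoint(transport_url),
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Only the URL is computed here; post_event needs a running backend.
print(lineage_endpoint(config["transport_url"]))
```

If the backend is unreachable, `post_event` raises `urllib.error.URLError`, so a caller would normally wrap it in a `try` block.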
Marquez is the reference implementation for OpenLineage and provides a beautiful UI for exploring lineage:
docker run -p 5000:5000 -p 3000:3000 marquezproject/marquez
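from feast import FeatureStore

fs = FeatureStore(repo_path="feature_repo")

# driver_entity, driver_stats_source, driver_hourly_stats_view, and
# driver_stats_service are assumed to be defined in your feature repo.
# Applying them automatically emits lineage events.
fs.apply([
    driver_entity,
    driver_stats_source,
    driver_hourly_stats_view,
    driver_stats_service,
])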
Visit http://localhost:3000 to see your lineage graph in Marquez!
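If you prefer to check from a script rather than the UI, the sketch below lists the jobs recorded in a namespace through the Marquez REST API. It assumes the API is served on port 5000 and that the jobs endpoint returns a JSON object with a `jobs` array whose entries carry a `name` field. The helper names and the sample payload are made up for illustration.

```python
import json
import urllib.parse
import urllib.request


def jobs_url(base_url: str, namespace: str) -> str:
    """Build the Marquez URL that lists jobs in one namespace."""
    quoted = urllib.parse.quote(namespace, safe="")
    return f"{base_url.rstrip('/')}/api/v1/namespaces/{quoted}/jobs"


def job_names(payload: dict) -> list[str]:
    """Extract sorted job names from a jobs response body."""
    return sorted(job["name"] for job in payload.get("jobs", []))


def fetch_job_names(base_url: str, namespace: str, timeout: float = 5.0) -> list[str]:
    """Query Marquez and return the job names it has recorded."""
    with urllib.request.urlopen(jobs_url(base_url, namespace), timeout=timeout) as response:
        return job_names(json.load(response))


# Made-up payload in the assumed response shape, so this runs without a server.
sample = {"jobs": [{"name": "feature_service_driver_stats_service"},
                   {"name": "feast_feature_views_my_fraud_detection"}]}
print(jobs_url("http://localhost:5000", "feast"))
print(job_names(sample))
```

With Marquez running, `fetch_job_names("http://localhost:5000", "feast")` would perform the real request.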
The integration doesn't just track relationships—it captures comprehensive metadata about your Feast objects:
Feature Views
Feature Services
Data Sources
All this metadata is attached as OpenLineage facets, making it queryable and explorable in any OpenLineage-compatible tool.
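To show what "attached as facets" means in practice, here is a minimal sketch of an OpenLineage run event whose output dataset carries a standard schema facet and a custom facet. The top-level field names follow the OpenLineage spec. The producer URI, the spec versions in the schema URLs, the `feast_featureView` facet name, its schema URL, and the feature names and tags are all assumptions for illustration, not the facets Feast actually emits.

```python
import json
import uuid
from datetime import datetime, timezone

PRODUCER = "https://github.com/feast-dev/feast"  # assumed producer URI
# Spec versions below are assumptions; use the ones your client targets.
RUN_EVENT_SCHEMA = "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent"
SCHEMA_FACET_SCHEMA = "https://openlineage.io/spec/facets/1-0-0/SchemaDatasetFacet.json"
FEAST_FACET_SCHEMA = "https://example.com/schemas/FeastFeatureViewFacet.json"  # hypothetical


def feature_view_dataset(namespace, name, features, tags):
    """Describe a feature view as an OpenLineage dataset with two facets."""
    return {
        "namespace": namespace,
        "name": name,
        "facets": {
            # Standard schema facet: one field entry per feature.
            "schema": {
                "_producer": PRODUCER,
                "_schemaURL": SCHEMA_FACET_SCHEMA,
                "fields": [{"name": n, "type": t} for n, t in features],
            },
            # Hypothetical custom facet carrying Feast-specific metadata.
            "feast_featureView": {
                "_producer": PRODUCER,
                "_schemaURL": FEAST_FACET_SCHEMA,
                "tags": tags,
            },
        },
    }


def run_event(event_type, namespace, job_name, outputs):
    """Wrap datasets in a run event with a fresh run ID and a UTC timestamp."""
    return {
        "eventType": event_type,
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": str(uuid.uuid4())},
        "job": {"namespace": namespace, "name": job_name},
        "inputs": [],
        "outputs": outputs,
        "producer": PRODUCER,
        "schemaURL": RUN_EVENT_SCHEMA,
    }


dataset = feature_view_dataset(
    "feast",
    "driver_hourly_stats",
    features=[("conv_rate", "Float32"), ("avg_daily_trips", "Int64")],
    tags={"team": "driver_performance"},
)
event = run_event("COMPLETE", "feast", "feast_feature_views_my_fraud_detection", [dataset])
print(json.dumps(event, indent=2))
```

Because every facet is a keyed JSON object with its own schema URL, a consumer that does not recognize the custom facet can ignore it and still read the standard ones.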
We've included a complete working example in the Feast repository that demonstrates the OpenLineage integration end-to-end. The example creates a driver statistics feature store and shows how lineage events are automatically emitted.
Run the example:
# Start Marquez first
docker run -p 5000:5000 -p 3000:3000 marquezproject/marquez
# Run the example from the directory that contains your clone of the Feast repository
cd feast/examples/openlineage-integration
python openlineage_demo.py --url http://localhost:5000
# View lineage at http://localhost:3000
The example demonstrates:
- feast apply

In Marquez, you'll see the complete lineage graph:
- driver_stats_source (DataSource) → driver_hourly_stats (FeatureView)
- driver_id (Entity) → driver_hourly_stats (FeatureView)
- driver_hourly_stats (FeatureView) → driver_stats_service (FeatureService)

Check out the full example code for complete details, including feature definitions with descriptions and tags.
Debugging Made Easy
When a model's predictions degrade, trace back through the lineage to identify which data source or feature transformation changed.
Impact Analysis
Before modifying a data source, understand all downstream feature views and services that will be affected.
Compliance & Audit
Maintain a complete audit trail of data flow for regulatory requirements like GDPR, CCPA, or SOC2.
Documentation
Auto-generated lineage serves as living documentation that stays in sync with your actual feature store configuration.
Cross-Team Collaboration
Data engineers, ML engineers, and data scientists can all view the same lineage graph to understand the feature store structure.
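The debugging and impact analysis cases above both reduce to walking the lineage graph, upstream or downstream. The sketch below does this with a breadth-first traversal over the three edges from the driver statistics example. The `downstream` and `upstream` helpers are hypothetical and operate on a hand-written edge list; in practice the edges would come from your lineage backend.

```python
from collections import deque

# Edges from the driver statistics example: (upstream, downstream).
EDGES = [
    ("driver_stats_source", "driver_hourly_stats"),
    ("driver_id", "driver_hourly_stats"),
    ("driver_hourly_stats", "driver_stats_service"),
]


def downstream(edges, start):
    """Return every node reachable from start, in breadth-first order."""
    children = {}
    for src, dst in edges:
        children.setdefault(src, []).append(dst)

    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


def upstream(edges, start):
    """Return every ancestor of start by traversing the reversed edges."""
    return downstream([(dst, src) for src, dst in edges], start)


# Impact analysis: what is affected if the data source changes?
print(downstream(EDGES, "driver_stats_source"))
# Debugging: what feeds the feature service?
print(upstream(EDGES, "driver_stats_service"))
```

The `seen` set keeps the traversal correct even when several feature views share a data source or a service reuses the same view more than once.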
This integration is available now in the latest version of Feast. To get started:
We're excited to see how teams use OpenLineage integration to improve their ML operations and welcome feedback from the community!