This module provides native integration between Feast and OpenLineage, enabling automatic data lineage tracking for ML feature engineering workflows.
When enabled, the integration automatically emits OpenLineage events for feast apply and for materialization runs.
No code changes are required: just enable OpenLineage in your feature_store.yaml.
OpenLineage is an optional dependency. Install it with:

    pip install openlineage-python

Or install Feast with the OpenLineage extra (quoted so that shells such as zsh do not expand the brackets):

    pip install 'feast[openlineage]'

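
Because the dependency is optional, it can be useful to check for it before enabling the integration. The following is a minimal sketch using only the standard library; the helper name `openlineage_available` is hypothetical and not part of Feast.

```python
import importlib.util


def openlineage_available() -> bool:
    """Return True if the top-level `openlineage` package can be found."""
    # find_spec returns None when the package is not installed.
    return importlib.util.find_spec("openlineage") is not None


if __name__ == "__main__":
    if openlineage_available():
        print("openlineage-python is installed")
    else:
        print("openlineage-python is missing; run: pip install openlineage-python")
```
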
Add the openlineage section to your feature_store.yaml:

    project: my_project
    registry: data/registry.db
    provider: local
    online_store:
      type: sqlite
      path: data/online_store.db

    openlineage:
      enabled: true
      transport_type: http
      transport_url: http://localhost:5000
      transport_endpoint: api/v1/lineage
      namespace: feast
      emit_on_apply: true
      emit_on_materialize: true

Once configured, feast apply and materialization emit lineage events automatically.
You can also configure via environment variables:

    export FEAST_OPENLINEAGE_ENABLED=true
    export FEAST_OPENLINEAGE_TRANSPORT_TYPE=http
    export FEAST_OPENLINEAGE_URL=http://localhost:5000
    export FEAST_OPENLINEAGE_ENDPOINT=api/v1/lineage
    export FEAST_OPENLINEAGE_NAMESPACE=feast

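
To illustrate how these variables relate to the YAML options, here is a sketch that reads them into an options dictionary. The variable names are the ones listed above. The mapping to option names, the boolean parsing rule, and the helper `read_openlineage_env` are assumptions made for illustration, not Feast's actual implementation.

```python
import os

# Environment variable -> option name. The pairing is an assumption based on
# the variable names above and the options used in feature_store.yaml.
ENV_TO_OPTION = {
    "FEAST_OPENLINEAGE_ENABLED": "enabled",
    "FEAST_OPENLINEAGE_TRANSPORT_TYPE": "transport_type",
    "FEAST_OPENLINEAGE_URL": "transport_url",
    "FEAST_OPENLINEAGE_ENDPOINT": "transport_endpoint",
    "FEAST_OPENLINEAGE_NAMESPACE": "namespace",
}


def read_openlineage_env(environ=None) -> dict:
    """Collect the OpenLineage options that are set in the environment."""
    environ = os.environ if environ is None else environ
    options = {
        option: environ[var]
        for var, option in ENV_TO_OPTION.items()
        if var in environ
    }
    # Assumed parsing rule: "1", "true", and "yes" (any case) mean enabled.
    if "enabled" in options:
        options["enabled"] = options["enabled"].strip().lower() in {"1", "true", "yes"}
    return options
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to exercise without touching the real environment.
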
Once configured, lineage is tracked automatically:

    from datetime import datetime, timedelta

    from feast import FeatureStore

    # Create FeatureStore - OpenLineage is initialized automatically if configured
    fs = FeatureStore(repo_path="feature_repo")

    # Apply operations emit lineage events automatically.
    # driver_entity and driver_hourly_stats_view are assumed to be defined in your feature repo.
    fs.apply([driver_entity, driver_hourly_stats_view])

    # Materialize emits a START event, then COMPLETE or FAIL, automatically
    fs.materialize(
        start_date=datetime.now() - timedelta(days=1),
        end_date=datetime.now(),
    )

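
The START, COMPLETE/FAIL sequence mentioned above follows a common run-lifecycle pattern. The sketch below shows that pattern with a context manager that records events in a list; the `emitted` list stands in for a real transport, and `lineage_run` is a hypothetical helper, not a Feast or OpenLineage API.

```python
from contextlib import contextmanager
from uuid import uuid4

# Stand-in for a real OpenLineage transport: events are appended here.
emitted = []


@contextmanager
def lineage_run(job_name: str):
    """Record START on entry, then COMPLETE on success or FAIL on error."""
    run_id = str(uuid4())
    emitted.append(("START", job_name, run_id))
    try:
        yield run_id
    except Exception:
        emitted.append(("FAIL", job_name, run_id))
        raise  # the original error still propagates to the caller
    else:
        emitted.append(("COMPLETE", job_name, run_id))


if __name__ == "__main__":
    with lineage_run("materialize_example"):
        pass  # the materialization work would happen here
    print([event[0] for event in emitted])
```

All events for one run share the same run ID, which is what lets a lineage backend pair a START event with its terminal event.
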
| Option | Default | Description |
|---|---|---|
| enabled | false | Enable/disable OpenLineage integration |
| transport_type | http | Transport type: http, file, kafka |
| transport_url | - | URL for HTTP transport (required) |
| transport_endpoint | api/v1/lineage | API endpoint for HTTP transport |
| api_key | - | Optional API key for authentication |
| namespace | feast | Namespace for lineage events (uses project name if set to "feast") |
| producer | feast | Producer identifier |
| emit_on_apply | true | Emit events on feast apply |
| emit_on_materialize | true | Emit events on materialization |
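
The defaults in the table can be summarized as a small dataclass. This is a sketch under stated assumptions: the class name `OpenLineageOptions` and its `validate` method are hypothetical, and the only checks performed are the ones the table implies (the allowed transport types, and a URL being required for the HTTP transport).

```python
from dataclasses import dataclass, field
from typing import Optional

VALID_TRANSPORTS = {"http", "file", "kafka"}


@dataclass
class OpenLineageOptions:
    """Hypothetical container mirroring the documented option defaults."""

    enabled: bool = False
    transport_type: str = "http"
    transport_url: Optional[str] = None
    transport_endpoint: str = "api/v1/lineage"
    api_key: Optional[str] = None
    namespace: str = "feast"
    producer: str = "feast"
    emit_on_apply: bool = True
    emit_on_materialize: bool = True
    # Transport-specific settings, such as those used by file and kafka.
    additional_config: dict = field(default_factory=dict)

    def validate(self) -> None:
        """Raise ValueError if the options contradict the table."""
        if self.transport_type not in VALID_TRANSPORTS:
            raise ValueError(f"unknown transport_type: {self.transport_type!r}")
        if self.enabled and self.transport_type == "http" and not self.transport_url:
            raise ValueError("transport_url is required for the http transport")
```
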
When you run feast apply, Feast creates a lineage graph that matches the Feast UI:

    DataSources ──┐
                  ├──→ feast_feature_views_{project} ──→ FeatureViews
    Entities ─────┘                                           │
                                                              │
                                                              ▼
                                                   feature_service_{name} ──→ FeatureService

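
The job names in the graph follow two simple templates. As a sketch, they can be derived from a project name and a list of feature service names as follows; the function names are hypothetical, and only the two name templates come from this document.

```python
from typing import Iterable, List


def feature_views_job_name(project: str) -> str:
    """Name of the single job that produces the project's feature views."""
    return f"feast_feature_views_{project}"


def feature_service_job_name(service_name: str) -> str:
    """Name of the job for one feature service."""
    return f"feature_service_{service_name}"


def job_names(project: str, service_names: Iterable[str]) -> List[str]:
    """One feature-views job for the project, plus one job per service."""
    names = [feature_views_job_name(project)]
    names.extend(feature_service_job_name(name) for name in service_names)
    return names
```
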
Jobs created:

- feast_feature_views_{project}: Shows DataSources + Entities → FeatureViews
- feature_service_{name}: Shows specific FeatureViews → FeatureService (one per service)

Datasets include:

HTTP transport:

    openlineage:
      enabled: true
      transport_type: http
      transport_url: http://marquez:5000
      transport_endpoint: api/v1/lineage
      api_key: your-api-key  # Optional

File transport:

    openlineage:
      enabled: true
      transport_type: file
      additional_config:
        log_file_path: openlineage_events.json

Kafka transport:

    openlineage:
      enabled: true
      transport_type: kafka
      additional_config:
        bootstrap_servers: localhost:9092
        topic: openlineage.events

The integration includes custom Feast-specific facets in lineage events:

Captures metadata about feature views:

- name: Feature view name
- ttl_seconds: Time-to-live in seconds
- entities: List of entity names
- features: List of feature names
- online_enabled / offline_enabled: Store configuration
- description: Feature view description
- tags: Key-value tags

Captures metadata about feature services:

- name: Feature service name
- feature_views: List of feature view names
- feature_count: Total number of features
- description: Feature service description
- tags: Key-value tags

Captures materialization run metadata:

- feature_views: Feature views being materialized
- start_date / end_date: Materialization window
- rows_written: Number of rows written

Use Marquez to visualize your Feast lineage:

    # Start Marquez
    docker run -p 5000:5000 -p 3000:3000 marquezproject/marquez

    # Configure Feast to emit to Marquez (in feature_store.yaml)
    # openlineage:
    #   enabled: true
    #   transport_type: http
    #   transport_url: http://localhost:5000

Then access the Marquez UI at http://localhost:3000 to see your feature lineage.

- namespace is set to "feast" (default): Uses project name as namespace (e.g., my_project)
- namespace is set to a custom value: Uses {namespace}/{project} (e.g., custom/my_project)

| Feast Concept | OpenLineage Concept |
|---|---|
| DataSource | InputDataset |
| FeatureView | OutputDataset (of feature views job) / InputDataset (of feature service job) |
| Feature | Schema field |
| Entity | InputDataset |
| FeatureService | OutputDataset |
| Materialization | RunEvent (START/COMPLETE/FAIL) |
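
The two namespace rules listed before the mapping table can be expressed as a short function. This is a sketch of the documented behavior only; the name `resolve_namespace` is hypothetical and the real implementation may differ.

```python
DEFAULT_NAMESPACE = "feast"


def resolve_namespace(namespace: str, project: str) -> str:
    """Apply the documented rule for the effective lineage namespace."""
    if namespace == DEFAULT_NAMESPACE:
        # Default value: the project name alone is used as the namespace.
        return project
    # Custom value: the project is nested under the configured namespace.
    return f"{namespace}/{project}"


if __name__ == "__main__":
    print(resolve_namespace("feast", "my_project"))
    print(resolve_namespace("custom", "my_project"))
```
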