Feast's data quality monitoring system computes, stores, and serves statistical metrics for every registered feature. It gives you visibility into feature health — distributions, null rates, percentiles, histograms — across batch data and feature serving logs.
This guide covers:
Monitoring works with any supported offline store backend. No additional infrastructure or configuration is needed — monitoring tables are created automatically on first use.
Minimum setup:
- Feast installed (pip install feast)
For serving log monitoring:
- A feature service with logging_config set (see step 4)
When you run feast apply to register new features, Feast automatically queues baseline metric computation:
$ feast apply
Applying changes...
Created feature view 'driver_stats' with 3 features
→ Queued baseline metrics computation (DQM job: abc-123)
Done!
The baseline reads all available source data and stores the resulting statistics with is_baseline=TRUE. This serves as the reference distribution for future drift detection.
Baseline computation is:
- Non-blocking: the job is queued, so feast apply exits without waiting for it to finish
- Idempotent: re-running feast apply won't recompute existing baselines
To enable automatic baseline computation on feast apply, set the DQM config in feature_store.yaml:
data_quality_monitoring:
  auto_baseline: true
When using the Feast operator, set this in the FeatureStore CR:
apiVersion: feast.dev/v1
kind: FeatureStore
spec:
  feastProject: my_project
  dataQualityMonitoring:
    autoBaseline: true
To disable it, set auto_baseline: false (or autoBaseline: false in the CR).
Schedule a single daily job that computes all granularities automatically:
feast monitor run
This detects the latest event timestamp in the source data and computes metrics for 5 time windows:
| Granularity | Window |
|---|---|
| daily | Last 1 day |
| weekly | Last 7 days |
| biweekly | Last 14 days |
| monthly | Last 30 days |
| quarterly | Last 90 days |
No date arguments needed. One scheduled job produces all granularities.
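The window arithmetic in the table can be sketched in a few lines of Python. This is an illustration only, not Feast's implementation: the function name `monitoring_windows` is hypothetical, and whether Feast treats the window boundaries as inclusive or exclusive is not stated in this guide.

```python
from datetime import datetime, timedelta
from typing import Dict, Tuple

# Window lengths in days, copied from the granularity table.
GRANULARITY_DAYS = {
    "daily": 1,
    "weekly": 7,
    "biweekly": 14,
    "monthly": 30,
    "quarterly": 90,
}


def monitoring_windows(latest_event_ts: datetime) -> Dict[str, Tuple[datetime, datetime]]:
    """Return a (start, end) pair per granularity, each ending at the latest event timestamp."""
    return {
        name: (latest_event_ts - timedelta(days=days), latest_event_ts)
        for name, days in GRANULARITY_DAYS.items()
    }


# Hypothetical latest event timestamp detected in the source data.
windows = monitoring_windows(datetime(2025, 3, 26))
for name, (start, end) in windows.items():
    print(f"{name:<10} {start.date()} -> {end.date()}")
```

All five windows share the same end point, which is why one scheduled run can produce every granularity.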
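Compute metrics for a single feature view:
feast monitor run --feature-view driver_stats
Compute a specific date range and granularity:
feast monitor run \
  --feature-view driver_stats \
  --start-date 2025-01-01 \
  --end-date 2025-01-07 \
  --granularity weekly
Compute over a date range and mark the result as the baseline:
feast monitor run \
  --feature-view driver_stats \
  --start-date 2025-01-01 \
  --end-date 2025-03-31 \
  --granularity daily \
  --set-baseline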
Usage: feast monitor run [OPTIONS]
Options:
  -p, --project TEXT       Feast project name (defaults to feature_store.yaml)
  -v, --feature-view TEXT  Feature view name (omit for all)
  -f, --feature-name TEXT  Feature name(s), repeatable (omit for all)
  --start-date TEXT        Start date YYYY-MM-DD (omit for auto-detect)
  --end-date TEXT          End date YYYY-MM-DD (omit for auto-detect)
  -g, --granularity        One of: daily, weekly, biweekly, monthly, quarterly
  --set-baseline           Mark this computation as baseline
  --source-type            One of: batch, log, all (default: batch)
  --help                   Show this message and exit.
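If you launch the CLI from Python (for example inside an orchestrator task), a small helper can assemble the argument list from the options listed above. This is a sketch: `monitor_run_argv` is a hypothetical helper, not part of Feast, and it only uses flags shown in the help text.

```python
import shlex
from typing import List, Optional, Sequence


def monitor_run_argv(
    feature_view: Optional[str] = None,
    feature_names: Sequence[str] = (),
    start_date: Optional[str] = None,
    end_date: Optional[str] = None,
    granularity: Optional[str] = None,
    set_baseline: bool = False,
    source_type: Optional[str] = None,
) -> List[str]:
    """Build the argv for `feast monitor run`, skipping options that were not set."""
    argv = ["feast", "monitor", "run"]
    if feature_view:
        argv += ["--feature-view", feature_view]
    for name in feature_names:  # --feature-name is repeatable
        argv += ["--feature-name", name]
    if start_date:
        argv += ["--start-date", start_date]
    if end_date:
        argv += ["--end-date", end_date]
    if granularity:
        argv += ["--granularity", granularity]
    if set_baseline:
        argv.append("--set-baseline")
    if source_type:
        argv += ["--source-type", source_type]
    return argv


argv = monitor_run_argv(
    feature_view="driver_stats",
    start_date="2025-01-01",
    end_date="2025-01-07",
    granularity="weekly",
)
print(shlex.join(argv))
```

The resulting list can be passed to `subprocess.run(argv, check=True, cwd=...)` with the feature repository as the working directory.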
If your feature services have logging configured, you can compute metrics from the actual features served to models in production.
In your feature definitions:
from feast import FeatureService, LoggingConfig
from feast.infra.offline_stores.contrib.postgres_offline_store.postgres_source import (
    PostgreSQLLoggingDestination,
)

driver_service = FeatureService(
    name="driver_service",
    features=[driver_stats_fv],
    logging_config=LoggingConfig(
        destination=PostgreSQLLoggingDestination(table_name="feast_driver_logs"),
        sample_rate=1.0,
    ),
)
Auto mode (all feature services with logging):
feast monitor run --source-type log
Specific feature service:
feast monitor run --source-type log --feature-view driver_service
Both batch and log in one run:
feast monitor run --source-type all
Log metrics are stored with data_source_type="log" alongside batch metrics in the same monitoring tables. Feature names from the log schema (e.g., driver_stats__conv_rate) are automatically normalized back to their original names (conv_rate) and associated with the correct feature view — enabling batch-vs-log comparison and drift detection.
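The name normalization described above can be illustrated as follows. This is a minimal sketch that assumes the log schema joins the feature view name and the feature name with a double underscore, as in the `driver_stats__conv_rate` example; `split_logged_feature_name` is a hypothetical helper, and Feast's actual logic may handle more cases (for instance names that themselves contain `__`).

```python
from typing import Optional, Tuple


def split_logged_feature_name(logged_name: str, separator: str = "__") -> Tuple[Optional[str], str]:
    """Split a logged column name into (feature_view_name, feature_name).

    Splits at the first separator only. Returns (None, logged_name) when the
    column carries no feature view prefix.
    """
    view, sep, feature = logged_name.partition(separator)
    if not sep:
        return None, logged_name
    return view, feature


print(split_logged_feature_name("driver_stats__conv_rate"))
```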
# Compute log metrics
POST /monitoring/compute/log
{
  "project": "my_project",
  "feature_service_name": "driver_service",
  "granularity": "daily"
}
# Auto-compute all log metrics
POST /monitoring/auto_compute/log
{
  "project": "my_project"
}
All read endpoints support cascading filters: project → feature_service_name → feature_view_name → feature_name → granularity → data_source_type.
GET /monitoring/metrics/features?project=my_project&feature_view_name=driver_stats&granularity=daily
Response:
[
  {
    "project_id": "my_project",
    "feature_view_name": "driver_stats",
    "feature_name": "conv_rate",
    "feature_type": "numeric",
    "metric_date": "2025-03-26",
    "granularity": "daily",
    "data_source_type": "batch",
    "row_count": 15000,
    "null_count": 12,
    "null_rate": 0.0008,
    "mean": 0.523,
    "stddev": 0.189,
    "min_val": 0.001,
    "max_val": 0.998,
    "p50": 0.51,
    "p75": 0.68,
    "p90": 0.82,
    "p95": 0.89,
    "p99": 0.96,
    "histogram": {
      "bins": [0.0, 0.05, 0.1, "..."],
      "counts": [120, 340, 560, "..."],
      "bin_width": 0.05
    }
  }
]
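To show how the fields in this response relate to one another, the sketch below parses a trimmed copy of the record (only the first three histogram bins, since the rest are elided above) and re-derives `null_rate` from the counts. It assumes that `bins` holds the left edge of each bin and that every bin spans `bin_width`; that reading is an interpretation of the example, not something this guide states.

```python
import json
import math

# Trimmed copy of the example response: only the three histogram bins shown are kept.
payload = """
[{
  "feature_name": "conv_rate",
  "row_count": 15000,
  "null_count": 12,
  "null_rate": 0.0008,
  "histogram": {"bins": [0.0, 0.05, 0.1], "counts": [120, 340, 560], "bin_width": 0.05}
}]
"""

record = json.loads(payload)[0]

# null_rate is null_count divided by row_count.
derived_null_rate = record["null_count"] / record["row_count"]
null_rate_matches = math.isclose(derived_null_rate, record["null_rate"])

# Assumption: each entry in "bins" is a left edge and the bin is bin_width wide.
hist = record["histogram"]
bin_ranges = [
    (left, left + hist["bin_width"], count)
    for left, count in zip(hist["bins"], hist["counts"])
]

print(null_rate_matches)
for left, right, count in bin_ranges:
    print(f"[{left:.2f}, {right:.2f}): {count}")
```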
GET /monitoring/metrics/feature_views?project=my_project&feature_view_name=driver_stats
GET /monitoring/metrics/feature_services?project=my_project&feature_service_name=driver_service
GET /monitoring/metrics/baseline?project=my_project&feature_view_name=driver_stats
GET /monitoring/metrics/timeseries?project=my_project&feature_name=conv_rate&granularity=daily&start_date=2025-01-01&end_date=2025-03-31
Add data_source_type=batch or data_source_type=log to any read endpoint:
GET /monitoring/metrics/features?project=my_project&data_source_type=log
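A client can build these read URLs from the cascading filters with the standard library alone. The sketch below is illustrative: `metrics_url` is a hypothetical helper, the base URL in the usage line is a placeholder for wherever your registry server listens, and the helper covers only the six cascading filters (not the `start_date` and `end_date` parameters of the timeseries endpoint).

```python
from urllib.parse import urlencode

# Cascading filter order, as listed for the read endpoints.
FILTER_ORDER = (
    "project",
    "feature_service_name",
    "feature_view_name",
    "feature_name",
    "granularity",
    "data_source_type",
)


def metrics_url(base_url: str, endpoint: str, **filters: str) -> str:
    """Build a /monitoring/metrics/<endpoint> URL, emitting filters in cascade order."""
    unknown = set(filters) - set(FILTER_ORDER)
    if unknown:
        raise ValueError(f"unsupported filters: {sorted(unknown)}")
    query = [(key, filters[key]) for key in FILTER_ORDER if filters.get(key) is not None]
    return f"{base_url.rstrip('/')}/monitoring/metrics/{endpoint}?{urlencode(query)}"


# Placeholder host and port; substitute your own registry server address.
url = metrics_url(
    "http://localhost:8000",
    "features",
    data_source_type="log",
    project="my_project",
)
print(url)
```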
| Method | Endpoint | Description |
|---|---|---|
| POST | /monitoring/compute | Submit batch DQM job |
| POST | /monitoring/auto_compute | Auto-detect dates, all granularities |
| POST | /monitoring/compute/transient | On-demand compute (not stored) |
| POST | /monitoring/compute/log | Compute from serving logs |
| POST | /monitoring/auto_compute/log | Auto-detect log dates, all granularities |
| GET | /monitoring/jobs/{job_id} | DQM job status |
| GET | /monitoring/metrics/features | Per-feature metrics |
| GET | /monitoring/metrics/feature_views | Per-view aggregates |
| GET | /monitoring/metrics/feature_services | Per-service aggregates |
| GET | /monitoring/metrics/baseline | Baseline metrics |
| GET | /monitoring/metrics/timeseries | Time-series data |
When you need metrics for an arbitrary date range (e.g., "show me the distribution for Jan 5 to Jan 20"), use the transient compute endpoint. It reads source data for the exact range, computes fresh statistics, and returns them directly without storing.
POST /monitoring/compute/transient
{
  "project": "my_project",
  "feature_view_name": "driver_stats",
  "feature_names": ["conv_rate"],
  "start_date": "2025-01-05",
  "end_date": "2025-01-20"
}
This is necessary because pre-computed histograms from different date ranges have different bin edges and cannot be merged losslessly.
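A small pure-Python example shows why stored histograms cannot simply be re-binned onto a common set of edges. Two made-up samples produce identical counts under coarse bins but different counts under finer bins, so the coarse counts alone do not determine the finer ones. The `histogram` helper is illustrative and is not how Feast bins data.

```python
from typing import List, Sequence


def histogram(values: Sequence[float], low: float, high: float, n_bins: int) -> List[int]:
    """Count values into n_bins equal-width bins over [low, high]."""
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for value in values:
        # Clamp so that value == high lands in the last bin.
        index = min(int((value - low) / width), n_bins - 1)
        counts[index] += 1
    return counts


# Two made-up samples with different shapes.
sample_a = [0.1, 0.1, 0.1, 0.4]
sample_b = [0.1, 0.4, 0.4, 0.4]

# Coarse bins (width 0.5): both samples look identical.
coarse_a = histogram(sample_a, 0.0, 1.0, 2)
coarse_b = histogram(sample_b, 0.0, 1.0, 2)

# Fine bins (width 0.25): the samples differ.
fine_a = histogram(sample_a, 0.0, 1.0, 4)
fine_b = histogram(sample_b, 0.0, 1.0, 4)

print(coarse_a, coarse_b)
print(fine_a, fine_b)
```

Because the coarse counts are the same for both samples, any rule for splitting them across finer bins has to guess, which is why the transient endpoint goes back to the source data.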
Airflow:
from airflow.operators.bash import BashOperator

monitor_task = BashOperator(
    task_id="feast_monitor",
    bash_command="feast monitor run",
    cwd="/path/to/feast/repo",
)
Kubeflow Pipelines:
from kfp import dsl

@dsl.component(base_image="feast-image:latest")
def monitor_features():
    import subprocess

    subprocess.run(["feast", "monitor", "run"], check=True, cwd="/feast/repo")
Cron:
# Daily at 2:00 AM UTC
0 2 * * * cd /path/to/feast/repo && feast monitor run >> /var/log/feast-monitor.log 2>&1
To include serving-log metrics in the scheduled run:
feast monitor run --source-type all
Monitoring works natively with all offline stores that serve as compute engines for Feast materialization:
| Backend | Compute | Storage |
|---|---|---|
| PostgreSQL | SQL push-down | INSERT ON CONFLICT |
| Snowflake | SQL push-down | MERGE with VARIANT JSON |
| BigQuery | SQL push-down | MERGE into BQ tables |
| Redshift | SQL push-down | MERGE via Data API |
| Spark | SparkSQL push-down | Parquet tables |
| Oracle | SQL via Ibis | MERGE from DUAL |
| DuckDB | In-memory SQL | Parquet files |
| Dask | PyArrow compute | Parquet files |
Backends not listed above fall back to Python-based computation — the offline store's pull_all_from_table_or_query() returns a PyArrow Table, and metrics are computed using pyarrow.compute and numpy.
Per-feature (full profile):
| Metric | Numeric | Categorical |
|---|---|---|
| row_count, null_count, null_rate | Yes | Yes |
| mean, stddev, min, max | Yes | — |
| p50, p75, p90, p95, p99 | Yes | — |
| histogram (JSONB) | Binned distribution | Top-N values with counts |
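For intuition, the numeric columns of this table can be computed with the standard library as sketched below. This is not Feast's code: the guide only says the fallback path uses pyarrow.compute and numpy, so the choice of population standard deviation and of linear-interpolation percentiles here is an assumption, and `numeric_profile` and `percentile` are hypothetical names.

```python
import math
import statistics
from typing import Dict, List, Optional, Sequence


def percentile(sorted_values: List[float], p: float) -> float:
    """Percentile by linear interpolation between the two closest ranks."""
    rank = (len(sorted_values) - 1) * p / 100
    lower, upper = math.floor(rank), math.ceil(rank)
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * (rank - lower)


def numeric_profile(values: Sequence[Optional[float]]) -> Dict[str, float]:
    """Compute row/null counts plus summary statistics over the non-null values."""
    non_null = sorted(v for v in values if v is not None)
    row_count = len(values)
    null_count = row_count - len(non_null)
    profile: Dict[str, float] = {
        "row_count": row_count,
        "null_count": null_count,
        "null_rate": null_count / row_count if row_count else 0.0,
    }
    if non_null:
        profile["mean"] = statistics.fmean(non_null)
        profile["stddev"] = statistics.pstdev(non_null)  # assumption: population stddev
        profile["min_val"] = non_null[0]
        profile["max_val"] = non_null[-1]
        for p in (50, 75, 90, 95, 99):
            profile[f"p{p}"] = percentile(non_null, p)
    return profile


profile = numeric_profile([1.0, 2.0, None, 3.0, 4.0])
print(profile)
```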
Per-feature-view and per-feature-service (aggregate summaries):
| Metric | Description |
|---|---|
| total_row_count | Total rows in the view |
| total_features | Number of features |
| features_with_nulls | Count of features with any nulls |
| avg_null_rate, max_null_rate | Aggregate null rate statistics |
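The aggregate summary can be pictured as a roll-up of the per-feature records. The sketch below is an assumption-laden illustration: `summarize_view` is a hypothetical function, and taking the largest per-feature `row_count` as `total_row_count` assumes all features in a view are computed over the same rows, which this guide does not spell out.

```python
from typing import Dict, List


def summarize_view(feature_metrics: List[Dict[str, float]]) -> Dict[str, float]:
    """Roll per-feature metrics up into the aggregate fields from the table."""
    null_rates = [m["null_rate"] for m in feature_metrics]
    return {
        # Assumption: every feature in the view shares the same underlying rows.
        "total_row_count": max((m["row_count"] for m in feature_metrics), default=0),
        "total_features": len(feature_metrics),
        "features_with_nulls": sum(1 for m in feature_metrics if m["null_count"] > 0),
        "avg_null_rate": sum(null_rates) / len(null_rates) if null_rates else 0.0,
        "max_null_rate": max(null_rates, default=0.0),
    }


# Made-up per-feature records for one feature view.
summary = summarize_view(
    [
        {"row_count": 1000, "null_count": 0, "null_rate": 0.0},
        {"row_count": 1000, "null_count": 50, "null_rate": 0.05},
    ]
)
print(summary)
```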
Monitoring respects Feast's existing RBAC:
- Compute endpoints (POST /monitoring/compute, /auto_compute, /compute/log, /auto_compute/log) require AuthzedAction.UPDATE
- Transient compute (POST /monitoring/compute/transient) requires AuthzedAction.DESCRIBE
- Read endpoints (GET /monitoring/metrics/*) require AuthzedAction.DESCRIBE
The Feast web UI includes a built-in monitoring dashboard accessible from the Monitoring item in the sidebar navigation.
The monitoring page has three tabs:
| Tab | Shows |
|---|---|
| Features | Per-feature metrics table with null rate, row count, freshness, and health status |
| Feature Views | Aggregated data quality per feature view |
| Feature Services | Aggregated metrics per feature service |
At the top of the monitoring page you can filter by:
Clicking any feature row navigates to a detail page showing:
Click the Compute Metrics button in the page header to trigger an auto_compute job. This computes all granularities for all feature views (or the selected feature view if filtered). Results appear after the table refreshes.
The Refresh button re-fetches already computed metrics from the backend without triggering new computation.
If no metrics have been computed yet, the page shows a prompt:
No monitoring data has been computed for this project. Click "Compute Metrics" to run data quality analysis on your feature views.
If the monitoring backend is unreachable, a warning banner appears:
Could not connect to the monitoring API. Make sure the Feast registry server is running with monitoring enabled.
The monitoring page is always accessible in the sidebar. To see actual data:
1. Add data_quality_monitoring to your feature_store.yaml:
   data_quality_monitoring:
     auto_baseline: true
   Or, when using the Feast operator, set this in the FeatureStore CR:
   apiVersion: feast.dev/v1
   kind: FeatureStore
   spec:
     feastProject: my_project
     dataQualityMonitoring:
       autoBaseline: true
2. Run feast apply. This computes baseline metrics automatically.
3. Schedule feast monitor run (or click "Compute Metrics" in the UI) to generate daily/weekly/monthly metrics.