docs/reference/beta-on-demand-feature-view.md
{% hint style="warning" %}
This is an experimental feature. While it is stable to our knowledge, there may still be rough edges in the experience. Contributions are welcome!
{% endhint %}
On Demand Feature Views (ODFVs) allow data scientists to use existing features and request-time data to transform and
create new features. Users define transformation logic that is executed during both historical and online retrieval.
Additionally, ODFVs provide flexibility in applying transformations either during data ingestion (at write time) or during feature retrieval (at read time), controlled via the `write_to_online_store` parameter.

By setting `write_to_online_store=True`, transformations are applied during data ingestion, and the transformed features are stored in the online store. This can improve online feature retrieval performance by reducing computation during reads. Conversely, if `write_to_online_store=False` (the default if omitted), transformations are applied during feature retrieval.
ODFVs enable data scientists to easily impact the online feature retrieval path. For example, a data scientist could:

1. Call `get_historical_features` to generate a training dataset.
2. Verify with `get_historical_features` (on a small dataset) that the transformation gives the expected output over historical data.
3. Decide whether the transformation should run at write time or at read time, and set the `write_to_online_store` parameter accordingly.
4. Verify with `get_online_features` on the development branch that the transformation correctly outputs online features.

When defining an ODFV, you can specify the transformation mode using the `mode` parameter. Feast supports the following modes:
- Pandas mode (`mode="pandas"`): The transformation function takes a Pandas DataFrame as input and returns a Pandas DataFrame as output. This mode is useful for batch transformations over multiple rows.
- Native Python mode (`mode="python"`): The transformation function uses native Python and can operate on inputs as lists of values or as single dictionaries representing a singleton (single row).

Native Python mode supports transformations on singleton dictionaries by setting `singleton=True`. This allows you to write transformation functions that operate on a single row at a time, making the code more intuitive and aligning with how data scientists typically think about data transformations.
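To build intuition for the two input shapes in Native Python mode, here is a plain-Python sketch (no Feast imports; the field names `conv_rate` and `acc_rate` are illustrative) of the same transformation written against list inputs and against a singleton row:

```python
def transform_lists(inputs: dict) -> dict:
    # Default python mode: each key maps to a list of values, one per row.
    return {
        "conv_rate_plus_acc": [
            c + a for c, a in zip(inputs["conv_rate"], inputs["acc_rate"])
        ]
    }

def transform_singleton(row: dict) -> dict:
    # singleton=True: the function sees one row at a time as a plain dict.
    return {"conv_rate_plus_acc": row["conv_rate"] + row["acc_rate"]}

batch = {"conv_rate": [1.0, 2.0], "acc_rate": [3.0, 4.0]}
print(transform_lists(batch))                                    # lists in, lists out
print(transform_singleton({"conv_rate": 1.0, "acc_rate": 3.0}))  # one row in, one row out
```

The singleton form avoids list comprehensions and `zip` entirely, which is what makes it read like ordinary row-level code.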
On Demand Feature Views support aggregations that compute aggregate statistics over groups of rows. When using aggregations, data is grouped by entity columns (e.g., `driver_id`) and aggregated before being passed to the transformation function.

Important: Aggregations and transformations are mutually exclusive. When aggregations are specified, they replace the transformation function.
```python
from feast import Aggregation, Field
from feast.on_demand_feature_view import on_demand_feature_view
from feast.types import Float64, Int64

# `driver_hourly_stats_view` is an existing feature view defined elsewhere.
@on_demand_feature_view(
    sources=[driver_hourly_stats_view],
    schema=[
        Field(name="sum_trips", dtype=Int64),
        Field(name="mean_rating", dtype=Float64),
    ],
    aggregations=[
        Aggregation(column="trips", function="sum"),
        Aggregation(column="rating", function="mean"),
    ],
)
def driver_aggregated_stats(inputs):
    # No transformation function needed when using aggregations
    pass
```
Aggregated columns are automatically named using the pattern `{function}_{column}` (e.g., `sum_trips`, `mean_rating`).
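As a plain-Python illustration (not Feast internals) of what the aggregation above computes and how the default output names line up:

```python
from collections import defaultdict

# Group rows by the entity column (driver_id), then aggregate each group.
# Output keys follow the default {function}_{column} naming.
rows = [
    {"driver_id": 1001, "trips": 3, "rating": 4.0},
    {"driver_id": 1001, "trips": 5, "rating": 5.0},
    {"driver_id": 1002, "trips": 2, "rating": 3.0},
]

groups = defaultdict(list)
for row in rows:
    groups[row["driver_id"]].append(row)

aggregated = {
    driver_id: {
        "sum_trips": sum(r["trips"] for r in group),
        "mean_rating": sum(r["rating"] for r in group) / len(group),
    }
    for driver_id, group in groups.items()
}
print(aggregated[1001])  # {'sum_trips': 8, 'mean_rating': 4.5}
```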
**Using `input_schema` with aggregations**

When the input data is not already stored as a feature view, use `input_schema` instead of `sources` to describe the fields that will be passed at request time. Feast will create an internal `RequestSource` automatically.
```python
from datetime import timedelta

from feast import Field, on_demand_feature_view
from feast.aggregation import Aggregation
from feast.types import Float64, Int64

# `user` is an Entity defined elsewhere in the repo.
@on_demand_feature_view(
    input_schema=[
        Field(name="txn_amount", dtype=Float64),
    ],
    schema=[
        Field(name="txn_count", dtype=Int64),
        Field(name="total_txn_amount", dtype=Float64),
        Field(name="avg_txn_amount", dtype=Float64),
    ],
    aggregations=[
        Aggregation(column="txn_amount", function="count", name="txn_count",
                    time_window=timedelta(days=30)),
        Aggregation(column="txn_amount", function="sum", name="total_txn_amount",
                    time_window=timedelta(days=30)),
        Aggregation(column="txn_amount", function="mean", name="avg_txn_amount",
                    time_window=timedelta(days=30)),
    ],
    entities=[user],
)
def user_transaction_stats(inputs):
    # Aggregations replace the transformation function; no body needed.
    pass
```
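The `time_window` parameter bounds which events contribute to each aggregate. A plain-Python sketch of the semantics (illustrative timestamps and names, not Feast internals):

```python
from datetime import datetime, timedelta

# Only events whose timestamp falls within the trailing 30-day window
# contribute to the aggregates.
now = datetime(2024, 6, 30)
window = timedelta(days=30)
events = [
    {"ts": datetime(2024, 6, 25), "txn_amount": 10.0},  # inside the window
    {"ts": datetime(2024, 6, 10), "txn_amount": 20.0},  # inside the window
    {"ts": datetime(2024, 3, 1), "txn_amount": 99.0},   # outside the window
]

in_window = [e["txn_amount"] for e in events if now - e["ts"] <= window]
txn_count = len(in_window)                     # 2
total_txn_amount = sum(in_window)              # 30.0
avg_txn_amount = total_txn_amount / txn_count  # 15.0
```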
`input_schema` also accepts fields that are not aggregation columns, for example thresholds, currency codes, or other contextual values passed at request time that your UDF needs but that are not stored as features.
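For instance, a UDF might compare a feature column against a request-time threshold. A plain-Python sketch of such a transformation body (the field names `txn_amount` and `amount_threshold` are hypothetical):

```python
# txn_amount is a feature column; amount_threshold is a contextual value
# supplied at request time rather than stored as a feature.
def flag_large_txns(inputs: dict) -> dict:
    return {
        "is_large_txn": [
            amount > threshold
            for amount, threshold in zip(
                inputs["txn_amount"], inputs["amount_threshold"]
            )
        ]
    }

print(flag_large_txns({"txn_amount": [50.0, 500.0], "amount_threshold": [100.0, 100.0]}))
# {'is_large_txn': [False, True]}
```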
See https://github.com/feast-dev/on-demand-feature-views-demo for an example of how to use on demand feature views.
When defining an ODFV, you can control when the transformation is applied using the write_to_online_store parameter:
- `write_to_online_store=True`: The transformation is applied during data ingestion (on write), and the transformed features are stored in the online store.
- `write_to_online_store=False` (default): The transformation is applied during feature retrieval (on read).

```python
import pandas as pd

from feast import Field, RequestSource
from feast.on_demand_feature_view import on_demand_feature_view
from feast.types import Float64, Int64

# Define a request data source for request-time features
input_request = RequestSource(
    name="vals_to_add",
    schema=[
        Field(name="val_to_add", dtype=Int64),
        Field(name="val_to_add_2", dtype=Int64),
    ],
)

# Use input data and feature view features to create new features in Pandas mode
@on_demand_feature_view(
    sources=[driver_hourly_stats_view, input_request],
    schema=[
        Field(name="conv_rate_plus_val1", dtype=Float64),
        Field(name="conv_rate_plus_val2", dtype=Float64),
    ],
    mode="pandas",
)
def transformed_conv_rate(features_df: pd.DataFrame) -> pd.DataFrame:
    df = pd.DataFrame()
    df["conv_rate_plus_val1"] = features_df["conv_rate"] + features_df["val_to_add"]
    df["conv_rate_plus_val2"] = features_df["conv_rate"] + features_df["val_to_add_2"]
    return df
```
```python
from typing import Any, Dict

from feast import Field, on_demand_feature_view
from feast.types import Float64

# Use input data and feature view features to create new features in Native Python mode
@on_demand_feature_view(
    sources=[driver_hourly_stats_view, input_request],
    schema=[
        Field(name="conv_rate_plus_val1_python", dtype=Float64),
        Field(name="conv_rate_plus_val2_python", dtype=Float64),
    ],
    mode="python",
)
def transformed_conv_rate_python(inputs: Dict[str, Any]) -> Dict[str, Any]:
    output = {
        "conv_rate_plus_val1_python": [
            conv_rate + val_to_add
            for conv_rate, val_to_add in zip(inputs["conv_rate"], inputs["val_to_add"])
        ],
        "conv_rate_plus_val2_python": [
            conv_rate + val_to_add
            for conv_rate, val_to_add in zip(
                inputs["conv_rate"], inputs["val_to_add_2"]
            )
        ],
    }
    return output
```
```python
from typing import Any, Dict

from feast import Field, on_demand_feature_view
from feast.types import Float64

# Use input data and feature view features to create new features in
# Native Python mode with singleton input
@on_demand_feature_view(
    sources=[driver_hourly_stats_view, input_request],
    schema=[
        Field(name="conv_rate_plus_acc_singleton", dtype=Float64),
    ],
    mode="python",
    singleton=True,
)
def transformed_conv_rate_singleton(inputs: Dict[str, Any]) -> Dict[str, Any]:
    output = {
        "conv_rate_plus_acc_singleton": inputs["conv_rate"] + inputs["acc_rate"]
    }
    return output
```
In this example, inputs is a dictionary representing a single row, and the transformation function returns a dictionary of transformed features for that single row. This approach is more intuitive and aligns with how data scientists typically process single data records.
```python
import pandas as pd

from feast import Field, on_demand_feature_view
from feast.types import Float64

# Existing Feature View
driver_hourly_stats_view = ...

# Define an ODFV applying transformation during write time
@on_demand_feature_view(
    sources=[driver_hourly_stats_view],
    schema=[
        Field(name="conv_rate_adjusted", dtype=Float64),
    ],
    mode="pandas",
    write_to_online_store=True,  # Apply transformation during write time
)
def transformed_conv_rate(features_df: pd.DataFrame) -> pd.DataFrame:
    df = pd.DataFrame()
    df["conv_rate_adjusted"] = features_df["conv_rate"] * 1.1  # Adjust conv_rate by 10%
    return df
```
To ingest data with the new feature view, include all input features required for the transformations:
```python
import pandas as pd

from feast import FeatureStore

store = FeatureStore(repo_path=".")

# Data to ingest
data = pd.DataFrame({
    "driver_id": [1001],
    "event_timestamp": [pd.Timestamp.now()],
    "conv_rate": [0.5],
    "acc_rate": [0.8],
    "avg_daily_trips": [10],
})

# Ingest data to the online store
store.push("driver_hourly_stats_view", data)
```
{% hint style="info" %}
Note: The name of the on demand feature view is the function name (e.g., transformed_conv_rate).
{% endhint %}
Retrieve historical features by referencing individual features or using a feature service:
```python
training_df = store.get_historical_features(
    entity_df=entity_df,
    features=[
        "driver_hourly_stats:conv_rate",
        "driver_hourly_stats:acc_rate",
        "driver_hourly_stats:avg_daily_trips",
        "transformed_conv_rate:conv_rate_plus_val1",
        "transformed_conv_rate:conv_rate_plus_val2",
        "transformed_conv_rate_singleton:conv_rate_plus_acc_singleton",
    ],
).to_df()
```
Retrieve online features by referencing individual features or using a feature service:
```python
entity_rows = [
    {
        "driver_id": 1001,
        "val_to_add": 1,
        "val_to_add_2": 2,
    }
]

online_response = store.get_online_features(
    entity_rows=entity_rows,
    features=[
        "driver_hourly_stats:conv_rate",
        "driver_hourly_stats:acc_rate",
        "transformed_conv_rate_python:conv_rate_plus_val1_python",
        "transformed_conv_rate_python:conv_rate_plus_val2_python",
        "transformed_conv_rate_singleton:conv_rate_plus_acc_singleton",
    ],
).to_dict()
```
In some scenarios, you may have already transformed your data in batch (e.g., using Spark or another batch processing framework) and want to directly materialize the pre-transformed features without applying transformations during ingestion. Feast supports this through the `transform_on_write` parameter.

When using `write_to_online_store=True` with On Demand Feature Views, you can set `transform_on_write=False` to skip transformations during the write operation. This is particularly useful for optimizing performance when working with large pre-transformed datasets.
```python
import pandas as pd

from feast import FeatureStore

store = FeatureStore(repo_path=".")

# Pre-transformed data (transformations already applied)
pre_transformed_data = pd.DataFrame({
    "driver_id": [1001],
    "event_timestamp": [pd.Timestamp.now()],
    "conv_rate": [0.5],
    # Pre-calculated values for the transformed features
    "conv_rate_adjusted": [0.55],  # Already contains the adjusted value
})

# Write to online store, skipping transformations
store.write_to_online_store(
    feature_view_name="transformed_conv_rate",
    df=pre_transformed_data,
    transform_on_write=False,  # Skip transformation during write
)
```
This approach enables a hybrid workflow: you can pre-compute features in batch and write them without re-running transformations, while still relying on on-demand transformation at read time where needed.
Even when features are materialized with transformations skipped (transform_on_write=False), the feature server can still apply transformations during API calls for any missing values or for features that require real-time computation.
There are new CLI commands to manage on demand feature views:

- `feast on-demand-feature-views list`: Lists all registered on demand feature views after `feast apply` is run.
- `feast on-demand-feature-views describe [NAME]`: Describes the definition of an on demand feature view.
When defining On Demand Feature Views with complex transformations, you may encounter validation errors during feast apply. Feast validates ODFVs by constructing random inputs and running the transformation function to infer the output schema. This validation can sometimes be overly strict or fail for complex transformation logic.
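Conceptually, the validation works like this plain-Python sketch (illustrative only, not Feast's actual implementation):

```python
import random

# Build dummy inputs, run the UDF once, and check that the columns it
# returns match the declared output schema.
def validate_udf(udf, input_columns, declared_output_columns, n_rows=3):
    dummy = {col: [random.random() for _ in range(n_rows)] for col in input_columns}
    output = udf(dummy)
    return set(output.keys()) == set(declared_output_columns)

def my_udf(inputs):
    return {"conv_rate_doubled": [v * 2 for v in inputs["conv_rate"]]}

print(validate_udf(my_udf, ["conv_rate"], ["conv_rate_doubled"]))  # True
```

Because the inputs are random, a UDF that assumes particular value ranges, types, or cross-column relationships can fail this check even though it is correct on real data, which is why an escape hatch exists.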
If you encounter validation errors that you believe are incorrect, you can skip feature view validation:
Python SDK:

```python
from feast import FeatureStore

store = FeatureStore(repo_path=".")
store.apply([my_odfv], skip_feature_view_validation=True)
```

CLI:

```shell
feast apply --skip-feature-view-validation
```
{% hint style="warning" %} Skipping validation bypasses important checks. Use this option only when the validation system is being overly strict. We encourage you to report validation issues on the Feast GitHub repository so the team can improve the validation system. {% endhint %}
When to use `skip_feature_view_validation`: when `feast apply` fails on transformation logic that you have verified produces correct output, and you believe the validation error is incorrect.

What validation is skipped: the dry-run check that constructs random inputs via `_construct_random_input()` and executes the transformation function to infer and verify the output schema.

What is NOT skipped: data source validation (use `--skip-source-validation` to skip that separately).