infra/website/docs/blog/feature-transformation-latency.md
Thank you to Shuchu Han and Francisco Javier Arceo for their contributions evaluating the latency of transformation for On Demand Feature Views and adding transformations on writes to On Demand Feature Views. Thank you also to Maks Stachowiak, Ross Briden, Ankit Nadig, and the folks at Affirm for inspiring this work and creating an initial proof of concept.
Feature engineering is at the core of building high-performance machine learning models. The Feast team has introduced two major enhancements to On Demand Feature Views (ODFVs), pushing the boundaries of efficiency and flexibility for data scientists and engineers. Here's a closer look at these exciting updates:
Traditionally, transformations in ODFVs were limited to Pandas-based operations. While powerful, Pandas transformations can be computationally expensive for certain use cases. Feast now introduces Native Python Mode, a feature that allows users to write transformations using pure Python.
Key benefits of Native Python Mode include:
Here's an example of using Native Python Mode for singleton transformations:
@on_demand_feature_view(
sources=[driver_hourly_stats_view, input_request],
schema=[Field(name="conv_rate_plus_acc_singleton", dtype=Float64)],
mode="python",
singleton=True,
)
def transformed_conv_rate_singleton(inputs: Dict[str, Any]) -> Dict[str, Any]:
return {"conv_rate_plus_acc_singleton": inputs["conv_rate"] + inputs["acc_rate"]}
This approach aligns with how many data scientists naturally process data, simplifying the implementation of feature engineering workflows.
Until now, ODFVs operated solely as transformations on reads, applying logic during online feature retrieval. While this ensured flexibility, it sometimes came at the cost of increased latency during retrieval. Feast now supports transformations on writes, enabling users to apply transformations during data ingestion and store the transformed features in the online store.
Why does this matter?
write_to_online_store parameter, users can choose whether transformations should occur at write time (to optimize reads) or at read time (to preserve data freshness).Here's an example of applying transformations during ingestion:
@on_demand_feature_view(
sources=[driver_hourly_stats_view],
schema=[Field(name="conv_rate_adjusted", dtype=Float64)],
mode="pandas",
write_to_online_store=True, # Apply transformation during write time
)
def transformed_conv_rate(features_df: pd.DataFrame) -> pd.DataFrame:
df = pd.DataFrame()
df["conv_rate_adjusted"] = features_df["conv_rate"] * 1.1
return df
With this new capability, data engineers can optimize online retrieval performance without sacrificing the flexibility of on-demand transformations.
These enhancements bring ODFVs closer to the goal of seamless feature engineering at scale. By combining high-speed Python-based transformations with the ability to optimize retrieval latency, Feast empowers teams to build more efficient, responsive, and production-ready feature pipelines.
For more detailed examples and use cases, check out the documentation for On Demand Feature Views. Whether you're a data scientist prototyping features or an engineer optimizing a production system, the new ODFV capabilities offer the tools you need to succeed.
The future of Feature Transformations in Feast will be to unify feature transformations and feature views to allow for a simpler API. If you have thoughts or interest in giving feedback to the maintainers, feel free to comment directly on the GitHub Issue or in the RFC.
Our goal is to make Feast blazing fast for feature transformation and serving.
Are you ready to unlock the full potential of On Demand Feature Views? Update your Feast repository and start building today!