docs/adr/ADR-0004-entity-join-key-mapping.md
Accepted
Multiple different entity keys in the source data may need to map onto the same entity from the feature data table during a join. For example, spammer_id and reporter_id may both need the years_on_platform feature from a table keyed by user_id.
Without entity join key mapping:
spammer_id and reporter_id in the same query).
Entity source data:
| spammer_id | reporter_id | timestamp |
|---|---|---|
| 2 | 8 | 1629909366 |
| 1 | 2 | 1629909323 |
Desired joined data should include spammer_feature_a and reporter_feature_a, both sourced from the same user feature view but joined on different keys.
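
To make the target shape concrete, here is a minimal sketch of that join in plain Python. Only the entity rows come from the table above; the user feature table, its feature_a values, and the helper join_on_key are hypothetical and exist purely for illustration. The sketch also ignores the timestamp column, so it does not model point-in-time correctness.

```python
# Entity source rows, copied from the table above.
entity_rows = [
    {"spammer_id": 2, "reporter_id": 8, "timestamp": 1629909366},
    {"spammer_id": 1, "reporter_id": 2, "timestamp": 1629909323},
]

# Hypothetical user feature table keyed by user_id (values are made up).
user_features = {
    1: {"feature_a": 10},
    2: {"feature_a": 20},
    8: {"feature_a": 80},
}


def join_on_key(rows, features, source_key, prefix):
    """Look up each row's `source_key` in `features` (keyed by user_id)
    and attach the matching feature values under prefixed column names."""
    joined = []
    for row in rows:
        out = dict(row)  # copy so the input rows are left untouched
        looked_up = features.get(row[source_key], {})
        for name, value in looked_up.items():
            out[f"{prefix}_{name}"] = value
        joined.append(out)
    return joined


# Join the same feature table twice, once per entity key.
result = join_on_key(entity_rows, user_features, "spammer_id", "spammer")
result = join_on_key(result, user_features, "reporter_id", "reporter")

for row in result:
    print(row)
```

Each output row carries both spammer_feature_a and reporter_feature_a, even though both columns are read from the same table keyed by user_id.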
Implement join key overrides using a with_join_key_map() method on feature views, combined with with_name() for disambiguation. This was Option 8b from the RFC.

    abuse_feature_service = FeatureService(
        name="my_abuse_model_v1",
        features=[
            user_features
            .with_name("reporter_features")
            .with_join_key_map({"user_id": "reporter_id"}),
            user_features
            .with_name("spammer_features")
            .with_join_key_map({"user_id": "spammer_id"}),
        ],
    )

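
How the two overrides compose can be sketched with a small stand-in class. This is not Feast's implementation: Projection, resolve_entity_keys, and check_name_collisions are hypothetical names chosen for this example. The sketch assumes only that each override returns a modified copy of the view, and that reusing a view without a distinct name is treated as an error.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, List, Tuple


@dataclass
class Projection:
    """Hypothetical stand-in for a feature view projection."""
    name: str
    join_keys: Tuple[str, ...]  # keys of the feature table, e.g. ("user_id",)
    join_key_map: Dict[str, str] = field(default_factory=dict)

    def with_name(self, name: str) -> "Projection":
        # Return a renamed copy; the original projection is not modified.
        return replace(self, name=name)

    def with_join_key_map(self, join_key_map: Dict[str, str]) -> "Projection":
        # Return a copy that maps feature-table keys to entity-data columns.
        return replace(self, join_key_map=dict(join_key_map))

    def resolve_entity_keys(self, entity_row: dict) -> dict:
        """Translate an entity row into the feature table's own join keys."""
        return {
            key: entity_row[self.join_key_map.get(key, key)]
            for key in self.join_keys
        }


def check_name_collisions(projections: List[Projection]) -> None:
    """Raise if two projections would produce the same output name."""
    seen = set()
    for projection in projections:
        if projection.name in seen:
            raise ValueError(f"Duplicate feature view name: {projection.name!r}")
        seen.add(projection.name)


user_features = Projection(name="user_features", join_keys=("user_id",))
reporter = user_features.with_name("reporter_features").with_join_key_map(
    {"user_id": "reporter_id"}
)
spammer = user_features.with_name("spammer_features").with_join_key_map(
    {"user_id": "spammer_id"}
)

check_name_collisions([reporter, spammer])  # distinct names, so no error

row = {"spammer_id": 2, "reporter_id": 8}
reporter_lookup = reporter.resolve_entity_keys(row)
spammer_lookup = spammer.resolve_entity_keys(row)
print(reporter_lookup, spammer_lookup)
```

Returning copies rather than mutating the view keeps the base user_features definition reusable, which is what allows the same view to appear twice in one feature list.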
with_name() is required when the same feature view is used multiple times, to avoid output column name collisions. If it is omitted, a name collision error is raised.
join_key_map is stored on FeatureViewProjection and flows through both the online and offline retrieval paths.
with_name() is needed to avoid collisions when joining the same feature view multiple times.
sdk/python/feast/feature_view.py (with_join_key_map method), sdk/python/feast/feature_view_projection.py (join_key_map field)