# Adding or reusing tests

This guide will go over:

1. how Feast tests are set up
2. how to extend the test suite to test new functionality
3. how to use the test suite to test a new custom offline / online store
Unit tests are contained in `sdk/python/tests/unit`. Integration tests are contained in `sdk/python/tests/integration`. Let's inspect the structure of `sdk/python/tests/integration`:
```
$ tree
.
├── e2e
│   ├── test_go_feature_server.py
│   ├── test_python_feature_server.py
│   ├── test_universal_e2e.py
│   └── test_validation.py
├── feature_repos
│   ├── integration_test_repo_config.py
│   ├── repo_configuration.py
│   └── universal
│       ├── catalog
│       ├── data_source_creator.py
│       ├── data_sources
│       │   ├── __init__.py
│       │   ├── bigquery.py
│       │   ├── file.py
│       │   ├── redshift.py
│       │   └── snowflake.py
│       ├── entities.py
│       ├── feature_views.py
│       ├── online_store
│       │   ├── __init__.py
│       │   ├── datastore.py
│       │   ├── dynamodb.py
│       │   ├── hbase.py
│       │   └── redis.py
│       └── online_store_creator.py
├── materialization
│   └── test_lambda.py
├── offline_store
│   ├── test_feature_logging.py
│   ├── test_offline_write.py
│   ├── test_push_features_to_offline_store.py
│   ├── test_s3_custom_endpoint.py
│   └── test_universal_historical_retrieval.py
├── online_store
│   ├── test_push_features_to_online_store.py
│   └── test_universal_online.py
└── registration
    ├── test_feature_store.py
    ├── test_inference.py
    ├── test_registry.py
    ├── test_universal_cli.py
    ├── test_universal_odfv_feature_inference.py
    └── test_universal_types.py
```
`feature_repos` has setup files for most tests in the test suite.

`conftest.py` (in the parent directory) contains the most common fixtures, which are designed as an abstraction on top of specific offline/online stores, so tests do not need to be rewritten for different stores. Individual test files also contain more specific fixtures.

The universal feature repo refers to a set of fixtures (e.g. `environment` and `universal_data_sources`) that can be parametrized to cover various combinations of offline stores, online stores, and providers.
This allows tests to run against all these various combinations without requiring excess code.
The universal feature repo is constructed by fixtures in `conftest.py` with help from the various files in `feature_repos`.
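Conceptually, this parametrization behaves like a cross product over offline and online stores. The store names below are illustrative only (the real configurations live in `feature_repos/repo_configuration.py`), but the sketch shows why a single test body can cover many environments:

```python
from itertools import product

# Illustrative names only; the real configurations are declared in
# feature_repos/repo_configuration.py as IntegrationTestRepoConfig objects.
offline_stores = ["file", "bigquery", "redshift", "snowflake"]
online_stores = ["sqlite", "redis", "datastore", "dynamodb"]

# Each combination corresponds to one parametrized "environment",
# so one test function runs once per combination.
environments = [
    {"offline": off, "online": on}
    for off, on in product(offline_stores, online_stores)
]

print(len(environments))  # 4 offline stores x 4 online stores = 16 runs
```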
Tests in Feast are split into integration and unit tests. If a test requires external resources (e.g. cloud resources on GCP or AWS), it is an integration test. If a test can be run purely locally (where locally includes Docker resources), it is a unit test.
The current integration tests include:

* E2E tests: `test_universal_e2e.py`, `test_go_feature_server.py`, `test_python_feature_server.py`, `test_validation.py`
* Offline and online store tests: `test_push_features_to_offline_store.py`, `test_push_features_to_online_store.py`, `test_offline_write.py`, `test_universal_historical_retrieval.py`, `test_universal_online.py`
* Feature logging tests: `test_feature_logging.py`
* Lambda materialization tests: `test_lambda.py`

Docstring tests are primarily smoke tests to make sure imports and setup functions can be executed without errors.
Let's look at a sample test using the universal repo:
{% tabs %}
{% tab title="sdk/python/tests/integration/offline_store/test_universal_historical_retrieval.py" %}
```python
@pytest.mark.integration
@pytest.mark.universal_offline_stores
@pytest.mark.parametrize("full_feature_names", [True, False], ids=lambda v: f"full:{v}")
def test_historical_features(environment, universal_data_sources, full_feature_names):
    store = environment.feature_store

    (entities, datasets, data_sources) = universal_data_sources
    feature_views = construct_universal_feature_views(data_sources)

    entity_df_with_request_data = datasets.entity_df.copy(deep=True)
    entity_df_with_request_data["val_to_add"] = [
        i for i in range(len(entity_df_with_request_data))
    ]
    entity_df_with_request_data["driver_age"] = [
        i + 100 for i in range(len(entity_df_with_request_data))
    ]

    feature_service = FeatureService(
        name="convrate_plus100",
        features=[feature_views.driver[["conv_rate"]], feature_views.driver_odfv],
    )
    feature_service_entity_mapping = FeatureService(
        name="entity_mapping",
        features=[
            feature_views.location.with_name("origin").with_join_key_map(
                {"location_id": "origin_id"}
            ),
            feature_views.location.with_name("destination").with_join_key_map(
                {"location_id": "destination_id"}
            ),
        ],
    )

    store.apply(
        [
            driver(),
            customer(),
            location(),
            feature_service,
            feature_service_entity_mapping,
            *feature_views.values(),
        ]
    )

    # ... more test code

    job_from_df = store.get_historical_features(
        entity_df=entity_df_with_request_data,
        features=[
            "driver_stats:conv_rate",
            "driver_stats:avg_daily_trips",
            "customer_profile:current_balance",
            "customer_profile:avg_passenger_count",
            "customer_profile:lifetime_trip_count",
            "conv_rate_plus_100:conv_rate_plus_100",
            "conv_rate_plus_100:conv_rate_plus_100_rounded",
            "conv_rate_plus_100:conv_rate_plus_val_to_add",
            "order:order_is_success",
            "global_stats:num_rides",
            "global_stats:avg_ride_length",
            "field_mapping:feature_name",
        ],
        full_feature_names=full_feature_names,
    )

    if job_from_df.supports_remote_storage_export():
        files = job_from_df.to_remote_storage()
        print(files)
        assert len(files) > 0  # This test should be way more detailed

    start_time = datetime.utcnow()
    actual_df_from_df_entities = job_from_df.to_df()

    # ... more test code

    validate_dataframes(
        expected_df,
        table_from_df_entities,
        sort_by=[event_timestamp, "order_id", "driver_id", "customer_id"],
        event_timestamp=event_timestamp,
    )
    # ... more test code
```
{% endtab %}
{% endtabs %}
The key fixtures are the `environment` and `universal_data_sources` fixtures, which are defined in the `feature_repos` directories and in `conftest.py`. By default, this pulls in a standard pre-defined dataset with `driver` and `customer` entities, certain feature views, and feature values.
The `environment` fixture sets up a feature store, parametrized by the provider and the online/offline store. It allows the test to query against that feature store without needing to worry about the underlying implementation or any setup that may be involved in creating instances of these datastores. Each parametrized combination is described by an `IntegrationTestRepoConfig`, which is used by pytest to generate a unique test for each of the environments that require testing.

Feast tests also use a variety of markers:
* The `@pytest.mark.integration` marker designates integration tests, which run when you call `make test-python-integration`.
* The `@pytest.mark.universal_offline_stores` marker will parametrize the test on all of the universal offline stores, including file, Redshift, BigQuery, and Snowflake.
* The `full_feature_names` parametrization defines whether the test should reference features by their full feature name (fully qualified path) or just the feature name itself.

To add a new test to an existing test file, use the same function signature as an existing test (e.g. take `environment` and `universal_data_sources` as arguments) to include the relevant test fixtures. Use the `universal_offline_stores` and `universal_online_store` markers to parametrize the test against different offline store and online store combinations. You can also designate specific online and offline stores to test by using the `only` parameter on the marker:

```python
@pytest.mark.universal_online_stores(only=["redis"])
```
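The effect of `only` can be pictured as a filter over the parametrized stores. This is a stdlib-only illustration of the behavior, not the marker's actual implementation:

```python
# Illustration of how `only` narrows parametrization; not Feast's real code.
UNIVERSAL_ONLINE_STORES = ["sqlite", "redis", "datastore", "dynamodb"]

def stores_to_parametrize(only=None):
    """Return the online stores a marked test would run against."""
    if only is None:
        return list(UNIVERSAL_ONLINE_STORES)
    return [s for s in UNIVERSAL_ONLINE_STORES if s in only]

print(stores_to_parametrize(only=["redis"]))  # ['redis']
print(len(stores_to_parametrize()))           # 4
```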
To test a new offline / online store from an external plugin repo:

1. Install Feast in editable mode with `pip install -e .`.
2. The core tests for offline / online store behavior are parametrized by the `FULL_REPO_CONFIGS` variable defined in `feature_repos/repo_configuration.py`. To overwrite this variable without modifying the Feast repo, create your own file that contains a `FULL_REPO_CONFIGS` (which will require adding a new `IntegrationTestRepoConfig` or two) and set the environment variable `FULL_REPO_CONFIGS_MODULE` to point to that file. Then the core offline / online store tests can be run with `make test-python-universal`.

Many problems arise when implementing your data store's type conversion to interface with Feast datatypes.
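Put together, the plugin-repo flow might look like the following; the module path `my_plugin.tests.repo_configs` is a hypothetical placeholder for wherever your own `FULL_REPO_CONFIGS` lives:

```shell
# From the root of the Feast repo (module name below is an example only):
pip install -e .

# Point Feast's test suite at your own FULL_REPO_CONFIGS definition.
export FULL_REPO_CONFIGS_MODULE=my_plugin.tests.repo_configs

# Run the core offline / online store tests against your configs.
make test-python-universal
```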
In particular, you will need to correctly update:

* `inference.py`, so that Feast can infer your data source schemas
* `type_map.py`, so that Feast knows how to convert your datastore's types to the Feast-recognized types in `feast/types.py`

The most important functionality in Feast is historical and online retrieval. Most of the e2e and universal integration tests test this functionality in some way. Making sure this functionality works also indirectly asserts that reading and writing from your datastore works as intended.
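As an illustration of the kind of mapping `type_map.py` needs (the store type names and Feast type names below are examples, not Feast's actual API):

```python
# Hypothetical native types for a new store, mapped to Feast-style value
# type names. The real conversions belong in type_map.py and must cover
# every type your store can return.
MYSTORE_TO_FEAST_TYPE = {
    "VARCHAR": "String",
    "BIGINT": "Int64",
    "DOUBLE": "Float64",
    "BOOLEAN": "Bool",
    "TIMESTAMP": "UnixTimestamp",
}

def mystore_type_to_feast_type(native_type: str) -> str:
    """Convert a store-native type name to a Feast type name."""
    try:
        return MYSTORE_TO_FEAST_TYPE[native_type.upper()]
    except KeyError:
        # Failing loudly beats silently mis-typing feature values.
        raise ValueError(f"Unsupported mystore type: {native_type}")

print(mystore_type_to_feast_type("bigint"))  # Int64
```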
To include a new offline / online store in the main Feast repo:

1. Extend `data_source_creator.py` for your offline store.
2. In `repo_configuration.py`, add a new `IntegrationTestRepoConfig` or two (depending on how many online stores you want to test).
3. Run the full test suite with `make test-python-integration`.

To include a new offline store in a contrib module:

1. In `feast/infra/offline_stores/contrib/`, create a new `data_source_creator.py` for your offline store and implement the required APIs.
2. In `contrib_repo_configuration.py`, add a new `IntegrationTestRepoConfig` (depending on how many online stores you want to test).
3. Run the contrib test suite with `make test-python-contrib-universal`.

To include a new online store:

1. In `repo_configuration.py`, add a new config that maps to a serialized version of the configuration you need in `feature_store.yaml` to set up the online store.
2. In `repo_configuration.py`, add a new `IntegrationTestRepoConfig` for the online stores you want to test.
3. Run the full test suite with `make test-python-integration`.

To use custom data in a new test, see `test_universal_types.py` for an example of how to do this:
```python
@pytest.mark.integration
def your_test(environment: Environment):
    df = ...  # generate a dataframe with your custom data here
    data_source = environment.data_source_creator.create_data_source(
        df,
        destination_name=environment.feature_store.project,
    )
    your_fv = driver_feature_view(data_source)
    entity = driver(value_type=ValueType.UNKNOWN)
    environment.feature_store.apply([your_fv, entity])
    # ... run test
```
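The `environment.data_source_creator` used above implements the `DataSourceCreator` interface from `data_source_creator.py`. Below is a simplified, stdlib-only sketch of what a creator for a new store might look like; the method names follow that interface, but everything else here is illustrative:

```python
from abc import ABC, abstractmethod

class DataSourceCreator(ABC):
    """Simplified stand-in for the interface in data_source_creator.py."""

    @abstractmethod
    def create_data_source(self, df, destination_name: str, **kwargs):
        """Upload df to the store and return a data source pointing at it."""

    @abstractmethod
    def teardown(self):
        """Clean up any tables/resources created during the test."""

class MyStoreDataSourceCreator(DataSourceCreator):
    def __init__(self, project: str):
        self.project = project
        self.created_tables = []

    def create_data_source(self, df, destination_name: str, **kwargs):
        # A real implementation would write df to the store here.
        table = f"{self.project}_{destination_name}"
        self.created_tables.append(table)
        return {"table": table}  # real code returns a feast DataSource

    def teardown(self):
        # A real implementation would drop the created tables.
        self.created_tables.clear()

creator = MyStoreDataSourceCreator(project="test_project")
source = creator.create_data_source(None, destination_name="drivers")
print(source)  # {'table': 'test_project_drivers'}
creator.teardown()
```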
To run tests against a local Redis instance, first install Redis (on macOS, `brew install redis`). `redis-server --help` and `redis-cli --help` should then show the corresponding help menus.

To start a Redis cluster locally, run `./infra/scripts/redis-cluster.sh start` and then `./infra/scripts/redis-cluster.sh create`. You should see output that looks like this:

```
Starting 6001
Starting 6002
Starting 6003
Starting 6004
Starting 6005
Starting 6006
```
To stop the cluster, run `./infra/scripts/redis-cluster.sh stop` and then `./infra/scripts/redis-cluster.sh clean`.
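If you want to sanity-check the cluster before running tests, `redis-cli` can query any of the ports shown in the startup output (6001 is one of the nodes the script starts):

```shell
# Should report cluster_state:ok once the cluster has been created.
redis-cli -p 6001 cluster info | grep cluster_state
```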