Back to Feast

Load feature values into the Feature store

docs/tutorials/azure/notebooks/part1-load-data.ipynb

0.63.05.6 KB
Original Source

Load feature values into the Feature store

The objective of this tutorial is to build a model that predicts if a driver will complete a trip based on a number of features ingested into Feast.

This notebook creates you will create and load data into the following 3 feature tables.

Customer Profile: This contains features related to a customer entity such as current balance, lifetime trip count, average number of passengers per trip. A snippet of data:

datetimecustomer_idcurrent_balanceavg_passenger_countlifetime_trip_countcreated
2021-03-17 19:3110100.8891880.0490574122021-03-24 19:38
2021-03-18 19:3110100.9792730.2126306392021-03-24 19:38
2021-03-19 19:3110100.9765490.176881702021-03-24 19:38
2021-03-20 19:3110100.2736970.325012682021-03-24 19:38

Driver table: This contains features related to a driver entity such as conversion rate, average number of daily trips. A snippet of data:

datetimedriver_idconv_rateacc_rateavg_daily_tripscreated
2021-03-17 19:3150100.2292970.6858438612021-03-24 19:34
2021-03-17 20:3150100.7816550.8612807692021-03-24 19:34
2021-03-17 21:3150100.1503330.5255817782021-03-24 19:34
2021-03-17 22:3150100.9517010.2288835702021-03-24 19:34

Orders table: This is a typical fact table that contains the order information such driver/customer id and whether the trip was completed. A snippet of data:

order_iddriver_idcustomer_idorder_is_successevent_timestamp
1005004100702021-03-10 19:31:15
1015003100602021-03-11 22:02:50
1025010100502021-03-13 00:34:24
1035010100112021-03-14 03:05:59

Imports

python
import pandas as pd
from sqlalchemy import create_engine
from sqlalchemy.sql import text
from azureml.core import Workspace

ws = Workspace.from_config()
kv = ws.get_default_keyvault()

engine = create_engine(kv.get_secret("FEAST-OFFLINE-STORE-CONN"))

Create Customer profile table and load data

The cell below will create the customer profile table and load the data into the Synapse table. Loading is achieved using the ``COPY INTO` bulk load available in Synapse (the CSV data is available on public blob):

sql
COPY INTO dbo.customer_profile
FROM 'https://feastonazuredatasamples.blob.core.windows.net/feastdatasamples/customer_profile.csv'
WITH
(
	FILE_TYPE = 'CSV'
	,FIRSTROW = 2
	,MAXERRORS = 0
)
python
with engine.connect() as con:
    # create and load customer profile table
    file = open("../sql/create_cx_profile_table.sql")
    query = text(file.read())
    print("creating customer profile table...", end="")
    con.execute(query)
    print("done")
    file = open("../sql/load_cx_profile_data.sql")
    query = text(file.read())
    print("loading customer profile data...", end="")
    con.execute(query)
    print("done")

pd.read_sql_query("select top 10 * from dbo.customer_profile", engine)

Create drivers table and load data

The cell below will create the drivers table and load the data into the Synapse table. Loading is achieved using the ``COPY INTO` bulk load available in Synapse (the CSV data is available on public blob):

sql
COPY INTO dbo.driver_hourly
FROM 'https://feastonazuredatasamples.blob.core.windows.net/feastdatasamples/driver_hourly.csv'
WITH
(
	FILE_TYPE = 'CSV'
	,FIRSTROW = 2
	,MAXERRORS = 0
)
python
with engine.connect() as con:
    file = open("../sql/create_drivers_table.sql")
    query = text(file.read())
    print("creating drivers table...", end="")
    con.execute(query)
    print("done")
    file = open("../sql/load_drivers_data.sql")
    query = text(file.read())
    print("loading drivers data...", end="")
    con.execute(query)
    print("done")

pd.read_sql_query("select top 10 * from dbo.driver_hourly", engine)

Create orders table and load data

The cell below will create the orders table and load the data into the Synapse table. Loading is achieved using the ``COPY INTO` bulk load available in Synapse (the CSV data is available on public blob):

sql
COPY INTO dbo.orders
FROM 'https://feastonazuredatasamples.blob.core.windows.net/feastdatasamples/orders.csv'
WITH
(
	FILE_TYPE = 'CSV'
	,FIRSTROW = 2
	,MAXERRORS = 0
)
python
with engine.connect() as con:
    file = open("../sql/create_orders_table.sql")
    query = text(file.read())
    print("creating orders table...", end="")
    con.execute(query)
    print("done")
    file = open("../sql/load_orders_data.sql")
    query = text(file.read())
    print("loading orders data...", end="")
    con.execute(query)
    print("done")

pd.read_sql_query("select top 10 * from dbo.orders", engine)

Next Steps

With the feature values loaded into the feature store, you will need to register the features in the feast central registry. Follow the Register Features part of the tutorial.