Getting Features for Training

In Getting Started, we showed you how to retrieve a training data sample from the Feature Store.

Below, we show this example again:

from qwak.feature_store.offline import OfflineFeatureStore

offline_feature_store = OfflineFeatureStore()
data = offline_feature_store.get_sample_data(feature_set_name='user_credit_risk_features_v2', number_of_rows=999)

Alternatively, we can retrieve data from the OfflineFeatureStore by entity id and the last modification timestamp.

In this case, we must define:

  • A population DataFrame that contains the entity id and a point-in-time timestamp column.
  • A list of features that we want to retrieve.

import pandas as pd
from qwak.feature_store.offline import OfflineFeatureStore

df = pd.DataFrame(columns=['user_id', 'timestamp'],
                  data=[['06cc255a-aa07-4ec9-ac69-b896ccf05322', '2021-01-01 00:00:00']])

key_to_features = {'user_id': ['user_credit_risk_features_v2.checking_account',
                               'user_credit_risk_features_v2.age',
                               'user_credit_risk_features_v2.job']}

offline_feature_store = OfflineFeatureStore()

train_df = offline_feature_store.get_feature_values(
    entity_key_to_features=key_to_features,
    population=df,
    point_in_time_column_name='timestamp')
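
To build intuition for what a point-in-time lookup means, the sketch below reproduces the idea with plain pandas. It is only an illustration and not how the OfflineFeatureStore is implemented: the feature_history table, its values, and the u1/u2 ids are hypothetical, and pd.merge_asof stands in for the store's lookup. For each row of the population, it picks the most recent feature value recorded at or before the requested timestamp.

```python
import pandas as pd

# Hypothetical feature history: one row per user and update time.
feature_history = pd.DataFrame({
    "user_id": ["u1", "u1", "u2"],
    "timestamp": pd.to_datetime(["2020-12-01", "2021-02-01", "2020-11-15"]),
    "age": [34, 35, 51],
})

# Population: the entities to look up and the point in time for each.
population = pd.DataFrame({
    "user_id": ["u1", "u2"],
    "timestamp": pd.to_datetime(["2021-01-01", "2021-01-01"]),
})

# merge_asof requires both frames to be sorted by the "on" column.
# direction="backward" selects the latest feature row whose timestamp
# is less than or equal to the population timestamp, per user_id.
point_in_time = pd.merge_asof(
    population.sort_values("timestamp"),
    feature_history.sort_values("timestamp"),
    on="timestamp",
    by="user_id",
    direction="backward",
)

print(point_in_time)
```

In this sketch, u1 receives the value recorded on 2020-12-01 rather than the later 2021-02-01 update, because that update did not exist yet at the requested point in time. This is the property that keeps future information out of a training set.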