Getting Features for Inference

This tutorial shows how to access data stored in the Qwak Feature Store during inference.

Using OnlineFeatureStore Explicitly

In the predict function, we create an instance of the OnlineFeatureStore.

After that, we create a ModelSchema containing all of the features we want to retrieve.

We create a DataFrame containing the entity identifiers and pass it to the Feature Store. In response, we get a Pandas DataFrame with the requested features.

Note

Extracted features are ordered by index. For example, the features for the entity value at index 1 are located at index 1 of the result.

import pandas as pd
from qwak.feature_store.online import OnlineFeatureStore
from qwak.model.schema import ModelSchema, FeatureStoreInput


model_schema = ModelSchema(
    features=[
        FeatureStoreInput(name='user_credit_risk_features.checking_account'),
        FeatureStoreInput(name='user_credit_risk_features.age'),
        FeatureStoreInput(name='user_credit_risk_features.job'),
        FeatureStoreInput(name='user_credit_risk_features.duration'),
        FeatureStoreInput(name='user_credit_risk_features.credit_amount'),
        FeatureStoreInput(name='user_credit_risk_features.housing'),
        FeatureStoreInput(name='user_credit_risk_features.purpose'),
        FeatureStoreInput(name='user_credit_risk_features.saving_account'),
        FeatureStoreInput(name='user_credit_risk_features.sex'),
        FeatureStoreInput(name='liked_posts.count')
    ])
    
online_feature_store = OnlineFeatureStore()

# One row per entity to look up; the row order determines the order of the result.
df = pd.DataFrame(
    columns=['user_id', 'post_id'],
    data=[['06cc255a-aa07-4ec9-ac69-b896ccf05322', '1234'],
          ['asdc255a-aa07-4ec9-ac69-b896c1231445', '7889']])

user_features = online_feature_store.get_feature_values(model_schema, df)

Resulting Extracted DataFrame

Note that the entity values and the extracted DataFrame are joined on the DataFrame index: the features for user_id 06cc255a-aa07-4ec9-ac69-b896ccf05322 are at index 0 because that entity value is at index 0. The same holds for "OnTheFly" or "RealTimeFeatures", so make sure your registered UDF always returns the same number of rows as the input DataFrame, in the same order as the entity values.


   checking_account  age  job  duration  credit_amount  housing  purpose  saving_account  sex  count
0                 1   20    A        50         20,000     True      any           False    M      1
1                 2   27    B        60         12,000     True      any           False    F      1
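
Because alignment relies on the index, the entity values can be re-attached to the extracted features with an index join. The following is a minimal sketch in plain pandas, not part of the Qwak SDK: `user_features` is a hand-built stand-in for the Feature Store response (two columns copied from the table above), and `attach_entities` is a hypothetical helper introduced only for illustration.

```python
import pandas as pd

# Entity values, in the same order they were sent to the Feature Store.
entities = pd.DataFrame(
    {
        "user_id": [
            "06cc255a-aa07-4ec9-ac69-b896ccf05322",
            "asdc255a-aa07-4ec9-ac69-b896c1231445",
        ],
        "post_id": ["1234", "7889"],
    }
)

# Stand-in for the extracted features (assumption: built by hand for this sketch).
user_features = pd.DataFrame({"age": [20, 27], "job": ["A", "B"]})


def attach_entities(entities: pd.DataFrame, features: pd.DataFrame) -> pd.DataFrame:
    """Join entities and features on the index after checking that they line up."""
    if len(entities) != len(features):
        raise ValueError("features must contain exactly one row per entity row")
    if not entities.index.equals(features.index):
        raise ValueError("entity and feature indexes differ")
    return entities.join(features)


combined = attach_entities(entities, user_features)
print(combined)
```

The length and index checks fail fast if a UDF drops or reorders rows, which would otherwise silently attach features to the wrong entity.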

Using the Decorator

Alternatively, we can use the feature extraction option of the qwak.api decorator and have the features extracted automatically. In this case, we have to implement the model's schema method and use it to define the features we want to extract:

def schema(self):
    from qwak.model.schema import ModelSchema, FeatureStoreInput, Prediction
    model_schema = ModelSchema(
        features=[
            FeatureStoreInput(name='user_credit_risk_features.checking_account'),
            FeatureStoreInput(name='user_credit_risk_features.age'),
            FeatureStoreInput(name='user_credit_risk_features.job'),
            FeatureStoreInput(name='user_credit_risk_features.duration'),
            FeatureStoreInput(name='user_credit_risk_features.credit_amount'),
            FeatureStoreInput(name='user_credit_risk_features.housing'),
            FeatureStoreInput(name='user_credit_risk_features.purpose'),
            FeatureStoreInput(name='user_credit_risk_features.saving_account'),
            FeatureStoreInput(name='user_credit_risk_features.sex'),
        ],
        predictions=[
            Prediction(name="duration", type=float)
        ])
    return model_schema

In the predict function, we add the decorator and an additional parameter, extracted_df. Note that extracted_df contains raw features from the Feature Store. If these values need preprocessing before being passed to the model, you still have to do it yourself.

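# Requires `import qwak` at module level.
@qwak.api(feature_extraction=True)
def predict(self, df, extracted_df):
    # extracted_df holds the features defined in schema(), fetched from the Feature Store.
    return self.model.predict(extracted_df)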
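
Since extracted_df holds raw values, a preprocessing step may be needed before calling the model. One possible shape for such a step is sketched below in plain pandas. The `preprocess` helper, the columns it touches, the fill value, and the category mapping are all assumptions made for illustration; in practice they should mirror whatever was done at training time.

```python
import pandas as pd


def preprocess(extracted_df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical preprocessing: fill gaps and encode one categorical column."""
    df = extracted_df.copy()  # leave the caller's DataFrame untouched
    # Missing numeric values are replaced with 0 (an arbitrary choice for this sketch).
    df["age"] = df["age"].fillna(0)
    # Map the categorical column to integers; unmapped values become -1.
    df["sex"] = df["sex"].map({"M": 0, "F": 1}).fillna(-1).astype(int)
    return df


raw = pd.DataFrame({"age": [20.0, None], "sex": ["M", "F"]})
clean = preprocess(raw)
print(clean)
```

Inside predict, such a helper would be applied as `self.model.predict(preprocess(extracted_df))`.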

REST Example

First, you need to generate a token (valid for 24 hours), using this command:

curl --request POST 'https://grpc.qwak.ai/api/v1/authentication/qwak-api-key' \
     --header 'Content-Type: application/json' \
     --data '{"qwakApiKey": "<API_Key>"}'
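
The same token request can also be issued from Python. The sketch below uses only the standard library and mirrors the curl command; `build_auth_request`, `extract_token`, and `fetch_token` are hypothetical helper names, and the sketch assumes that accessToken is a top-level field of the JSON response.

```python
import json
import urllib.request

AUTH_URL = "https://grpc.qwak.ai/api/v1/authentication/qwak-api-key"


def build_auth_request(api_key: str) -> urllib.request.Request:
    """Build the same POST request as the curl command."""
    body = json.dumps({"qwakApiKey": api_key}).encode("utf-8")
    return urllib.request.Request(
        AUTH_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_token(response_body: str) -> str:
    """Return the accessToken field (assumed top-level) from the JSON response body."""
    return json.loads(response_body)["accessToken"]


def fetch_token(api_key: str) -> str:
    """Send the request and return the token (performs a network call)."""
    with urllib.request.urlopen(build_auth_request(api_key)) as response:
        return extract_token(response.read().decode("utf-8"))
```

Because the token is valid for 24 hours, a long-running client would typically cache the result of `fetch_token` and refresh it before it expires.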

Replace <API_Key> in the curl command with your assigned API key. The response contains an accessToken field; in the following example its value is assumed to be available as QWAK_TOKEN. The feature extraction request is:

import json
import os

import requests

# Replace <Environment name> with the name of your environment.
url = "https://grpc.<Environment name>.qwak.ai/api/v1/rest-serving/multiFeatureValues/"

payload = json.dumps({
  "entitiesToFeatures": [
    {
      "features": [
        {
          "batchFeature": {
            "name": "user_credit_risk_features.checking_account"
          }
        },
        {
          "batchFeature": {
            "name": "user_credit_risk_features.age"
          }
        }
      ],
      "entityName": "user_id"
    }
  ],
  "entityValuesMatrix": {
    "header": {
      "entityNames": [
        "user_id"
      ]
    },
    "rows": [
      {
        "index": 0,
        "entityValues": [
          "06cc255a-aa07-4ec9-ac69-b896ccf05322"
        ]
      }
    ]
  }
})

# Python does not expand shell-style variables inside strings, so the token is
# read explicitly from the QWAK_TOKEN environment variable.
headers = {
  'Content-Type': 'application/json',
  'Authorization': f"Bearer {os.environ['QWAK_TOKEN']}"
}

response = requests.post(url, headers=headers, data=payload)

print(response.text)
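
The request body above follows a regular pattern, so it can be generated rather than written by hand. The sketch below uses a hypothetical `build_payload` helper for a single entity and a list of batch features. Numbering additional rows with increasing "index" values is an assumption extrapolated from the single-row example.

```python
import json


def build_payload(entity_name, feature_names, entity_values):
    """Build the request body for one entity and a list of batch features."""
    return {
        "entitiesToFeatures": [
            {
                "features": [{"batchFeature": {"name": n}} for n in feature_names],
                "entityName": entity_name,
            }
        ],
        "entityValuesMatrix": {
            "header": {"entityNames": [entity_name]},
            # Assumption: each row carries its own position as "index".
            "rows": [
                {"index": i, "entityValues": [value]}
                for i, value in enumerate(entity_values)
            ],
        },
    }


payload = json.dumps(
    build_payload(
        "user_id",
        [
            "user_credit_risk_features.checking_account",
            "user_credit_risk_features.age",
        ],
        ["06cc255a-aa07-4ec9-ac69-b896ccf05322"],
    )
)
```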