Inference Analytics

Automatic log and inference collections on the JFrog ML Lake

The model Analytics tab provides an interface to the JFrog ML Lake, which is an automated log collection system for models.

In addition to performance data, you can also find all the predictions that were made with models deployed via JFrog ML, including the input and output data of each function of your models.

The data is stored as Parquet files in your object storage, so you can also load it into your preferred BI tool and analyze the model data there.

Enabling JFrog ML Lake analytics

JFrog ML Analytics collection is enabled by default when using the api decorator.

@qwak.api()
def predict(self, df):
    return pd.DataFrame(self.catboost.predict(df[self.columns]), columns=['churn'])

Analytics collection can be turned off by passing analytics=False to the decorator:

@qwak.api(analytics=False)
def predict(self, df):
    return pd.DataFrame(self.catboost.predict(df[self.columns]), columns=['churn'])

📘

Analytics column names are derived from the names of the input parameters of the predict() method. With the default df parameter, the input columns start with input_. If you use a custom parameter name, the columns start with the name of that parameter instead.

For instance, if your predict signature reads as follows: def predict(self, request) -> String, then your analytics input columns will begin with request_.
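The naming rule above can be sketched in a few lines of plain Python. This is only an illustration of the convention as described here, not the platform's implementation: the helper analytics_input_columns is hypothetical, and it assumes the part after the prefix is simply the name of the input field.

```python
def analytics_input_columns(param_name, field_names):
    """Illustrative only: apply the prefix rule described in the text."""
    # The default `df` parameter maps to the `input_` prefix;
    # any other parameter name is used as the prefix itself.
    prefix = "input" if param_name == "df" else param_name
    return [f"{prefix}_{name}" for name in field_names]


default_cols = analytics_input_columns("df", ["age", "plan"])
custom_cols = analytics_input_columns("request", ["age", "plan"])

print(default_cols)  # ['input_age', 'input_plan']
print(custom_cols)   # ['request_age', 'request_plan']
```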

You can also exclude columns from analytics by passing their names to the decorator:

@qwak.api(analytics_exclude_columns=['col_1', 'col_2'])


Querying analytics in the UI

In the Analytics view, you can write SQL queries to analyze the model requests and predictions.

🚧

Leveraging Partitions in Queries

Model inference data is partitioned daily by the date column. To improve query performance and avoid scanning all the data, which can be significantly slower and costlier, filter on this partition column in your analytics queries.
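As a sketch of what a partition-aware query might look like, the snippet below builds a SQL string limited to a seven-day window. The helper build_partitioned_query is hypothetical, your_table is a placeholder, and the date literal syntax is an assumption: depending on how the date column is typed in your environment, you may need a plain string comparison instead.

```python
from datetime import date, timedelta


def build_partitioned_query(table, start, end, columns="*"):
    """Build a SQL string restricted to a half-open range of daily partitions."""
    # Assumption: the `date` partition column can be compared with SQL date literals.
    return (
        f"select {columns} from {table} "
        f"where date >= date '{start.isoformat()}' "
        f"and date < date '{end.isoformat()}'"
    )


end = date(2024, 5, 8)
start = end - timedelta(days=7)
query = build_partitioned_query("your_table", start, end)
print(query)
```

Filtering on the partition column first and adding other predicates afterwards keeps the scan limited to the days you actually need.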



Retrieving analytics programmatically

To retrieve data from the JFrog ML Analytics Engine into a Pandas DataFrame, use the run_analytics_query function of the QwakClient:

from qwak import QwakClient

client = QwakClient()
df = client.run_analytics_query("select * from your_table")

When you call the code as shown above, the function waits until the result is ready or until the query fails.

However, you can control how long to wait for the result by passing the timeout parameter to the run_analytics_query function. If the JFrog ML Analytics Engine does not return a response within the given time window, the client raises a TimeoutError.

from datetime import timedelta
from qwak import QwakClient

client = QwakClient()
df = client.run_analytics_query("select * from your_table", timeout=timedelta(seconds=123))
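If you prefer to handle the timeout rather than let it propagate, one option is a small wrapper like the sketch below. The function query_or_none and the _SlowClient stand-in are hypothetical names introduced only for illustration; the wrapper relies solely on the run_analytics_query call and the TimeoutError behavior described above.

```python
from datetime import timedelta


def query_or_none(client, sql, seconds):
    """Run an analytics query and return None if it times out."""
    try:
        return client.run_analytics_query(sql, timeout=timedelta(seconds=seconds))
    except TimeoutError:
        # Raised when no response arrives within the requested time window.
        return None


class _SlowClient:
    """Stand-in that always times out, so the sketch runs without a connection."""

    def run_analytics_query(self, sql, timeout=None):
        raise TimeoutError(f"no result within {timeout}")


result = query_or_none(_SlowClient(), "select * from your_table", seconds=30)
print(result)  # None
```

In real use you would pass QwakClient() instead of the stand-in and decide whether a None result should trigger a retry, a narrower query, or an error.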


Logging custom values

A model's predict function can log custom data during the inference request. To use the custom data logger, we need to add the analytics_logger parameter to the predict function. Important: The parameter MUST be called analytics_logger!


@qwak.api(analytics=True)
def predict(self, df, analytics_logger):
    ...

The feature works only when the analytics feature of the JFrog ML API is enabled (it's enabled by default, or we can explicitly specify the analytics=True parameter).

Now, in the predict function, we can log scalar values, lists, dictionaries, Pandas DataFrames, and any other JSON-serializable object. The analytics_logger supports two ways of logging values:

  1. One-by-one:
analytics_logger.log(column='my_column', value=the_value)
analytics_logger.log(column='some_other_column', value=yet_another_value)
  2. Multiple values at once:
analytics_logger.log_many(
    values={'another_column': 'some_value', 'something_else': 123}
)

Note that we use a different function when logging multiple values (log_many instead of log).

If you log different values under the same column name, only the last logged value is kept (it overwrites the previous ones).
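To exercise logging code locally without a deployed model, a small test double can mimic the two calls and the overwrite behavior described above. FakeAnalyticsLogger and the plain predict function below are hypothetical and written only for this sketch; the real analytics_logger is injected by the platform at inference time.

```python
class FakeAnalyticsLogger:
    """Hypothetical test double exposing log and log_many as described in the text."""

    def __init__(self):
        self.values = {}

    def log(self, column, value):
        # Logging the same column again replaces the earlier value.
        self.values[column] = value

    def log_many(self, values):
        self.values.update(values)


def predict(rows, analytics_logger):
    """Plain function standing in for a model's predict method."""
    analytics_logger.log(column="stage", value="started")
    scores = [0.5 * row["usage"] for row in rows]
    analytics_logger.log_many(values={"row_count": len(rows), "max_score": max(scores)})
    analytics_logger.log(column="stage", value="finished")  # overwrites "started"
    return scores


logger = FakeAnalyticsLogger()
scores = predict([{"usage": 2}, {"usage": 6}], logger)
print(scores)         # [1.0, 3.0]
print(logger.values)  # {'stage': 'finished', 'row_count': 2, 'max_score': 3.0}
```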



Retrieving custom values

The JFrog ML Analytics view in the JFrog ML UI will display all the logged values with the column prefix logger_.

If we log analytics_logger.log(column='my_column', value=the_value), JFrog ML Analytics displays a column logger_my_column containing the value of the variable the_value.