Models Build SDK
Build and deploy ML models effortlessly from your notebook or any Python environment
Overview
Data scientists often train models in Workspaces, Jupyter notebooks or locally, and require a seamless process to save, register and manage model versions for production use.
JFrog ML provides the Build SDK to address this need, simplifying the transition from model training to deployment, whether you work on your local machine or in a Jupyter notebook.
Key Features
1. Build models from Workspaces
The JFrog ML Build SDK simplifies registering models that were trained locally or in a Jupyter notebook. It versions and tracks each build and eases the transition from research to production deployment.
2. Python-driven model builds
Automate model builds using Python and seamlessly integrate with continuous integration/continuous deployment (CI/CD) pipelines and various automation workflows.
3. Streamlined versioning for pre-built models
Build and register pre-trained models from any Python environment by supplying existing trained model instances, skipping remote build phases.
Using Build SDK
This document explores the options available when using the Build SDK.
In general, there are two choices when working with the Build SDK:
- Providing a pre-trained model artifact to the build_model method.
- Omitting a pre-built model, in which case the SDK uploads the local model files and builds the model on JFrog ML.
from qwak import QwakClient
from qwak.model.tools import run_local

# Creating an instance of the Qwak client
client = QwakClient()

# Option 1: triggering a build with model files from the local `main` directory.
# No pre-built model is provided, so the model is built on Qwak.
client.build_model(
    model_id='my_example_model',
)

# Option 2: triggering a build with model files from the local `main` directory.
# A pre-built model is provided, so the build method is not called remotely.
# MyQwakModel stands for your own QwakModel subclass.
model = MyQwakModel()
model.build()
client.build_model(
    model_id='my_example_model',
    prebuilt_qwak_model=model
)
Folder Structure
Note: File and folder structure is important when using the Build SDK, as the files are uploaded to Qwak in that structure.
The Build SDK uploads local model files together with the trained model object. By default, the Build SDK uploads the main folder under the current file location.
Make sure to place your model files in the main directory.
-> your-model-directory
---- build_sdk_runner.py
---> main
------ model.py
It's possible to change the uploaded directory by providing an explicit path through the main_module_path parameter, described in the Build SDK configuration section of this document.
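Before triggering a build, it can help to confirm that the folder to be uploaded actually exists next to the runner script. The following is a minimal sketch using only the standard library. The helper name resolve_model_dir is hypothetical and not part of the Build SDK, and it assumes the default folder name main described above.

```python
from pathlib import Path


def resolve_model_dir(base_dir, folder_name="main"):
    """Return the model folder under base_dir, or raise if it looks wrong.

    Hypothetical helper (not part of the Build SDK): it only checks the
    local layout described above before a build is triggered.
    """
    model_dir = Path(base_dir) / folder_name
    if not model_dir.is_dir():
        raise FileNotFoundError(
            f"Expected a '{folder_name}' folder under {base_dir}"
        )
    # A model folder without any Python files is almost certainly a mistake.
    if not any(model_dir.glob("*.py")):
        raise ValueError(f"No Python files found in {model_dir}")
    return model_dir
```

In build_sdk_runner.py, this could be called as resolve_model_dir(Path(__file__).parent) before the call to client.build_model, so that a misplaced folder fails fast on the local machine.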
Building pre-trained models
Use the Build SDK with an existing instance of a trained model to upload the pre-trained model artifact. This lets data scientists train or fine-tune models within notebooks, add the trained versions to the model registry, and deploy them to production environments.
Creating a model instance
In this example, we'll use the Titanic model, which can be found in the Qwak Examples repository.
Our folder structure will look as follows:
titanic
-- run_build.py
-- main
---- __init__.py
---- model.py
---- requirements.txt
Note: Make sure to include the following import when using the Build SDK. The build command cannot complete without it.
from qwak.model.tools import run_local
# main/requirements.txt
pandas
scikit-learn
catboost
# main/__init__.py
from .model import TitanicSurvivalPrediction

def load_model():
    return TitanicSurvivalPrediction()
# main/model.py
import numpy as np
import pandas as pd
import qwak

# Important: import run_local when using the Build SDK
from qwak.model.tools import run_local
from catboost import CatBoostClassifier, Pool, cv
from catboost.datasets import titanic
from qwak.model.base import QwakModel
from sklearn.model_selection import train_test_split


class TitanicSurvivalPrediction(QwakModel):
    def __init__(self):
        self.model = CatBoostClassifier(
            iterations=1000,
            custom_loss=["Accuracy"],
            loss_function="Logloss",
            learning_rate=None,
        )

    def build(self):
        titanic_train, _ = titanic()
        titanic_train.fillna(-999, inplace=True)

        x = titanic_train.drop(["Survived", "PassengerId"], axis=1)
        y = titanic_train.Survived

        x_train, x_test, y_train, y_test = train_test_split(
            x, y, train_size=0.85, random_state=42
        )

        # mark categorical features
        cate_features_index = np.where(x_train.dtypes != float)[0]

        self.model.fit(
            x_train,
            y_train,
            cat_features=cate_features_index,
            eval_set=(x_test, y_test),
        )

        # Cross validating the model (5-fold)
        cv_data = cv(
            Pool(x, y, cat_features=cate_features_index),
            self.model.get_params(),
            fold_count=5,
        )

    @qwak.api()
    def predict(self, df: pd.DataFrame) -> pd.DataFrame:
        df = df.drop(["PassengerId"], axis=1)
        return pd.DataFrame(
            self.model.predict_proba(df[self.model.feature_names_])[:, 1],
            columns=['Survived_Probability']
        )
Training the model
Let's create a new model instance and run the build method to train it.
from titanic.main import TitanicSurvivalPrediction
# Create a new model instance
qwak_model_instance = TitanicSurvivalPrediction()
# Run the build function which trains the model
qwak_model_instance.build()
Learning rate set to 0.029583
0: learn: 0.6756870 test: 0.6751626 best: 0.6751626 (0) total: 66.5ms remaining: 1m 6s
1: learn: 0.6578988 test: 0.6579213 best: 0.6579213 (1) total: 69.8ms remaining: 34.8s
2: learn: 0.6427410 test: 0.6427901 best: 0.6427901 (2) total: 72.5ms remaining: 24.1s
Registering the trained model
Now that we have trained a model locally, we want to register this model version in the Qwak model registry as a new build, so we can later deploy it to production.
The code below registers a new build under the titanic_survival_prediction model, using the trained Titanic model we just created and the tag prebuilt.
from qwak import QwakClient
from qwak.model.tools import run_local

# Creating an instance of the Qwak client
client = QwakClient()

# Triggering a build with model files from the local `main` directory
client.build_model(
    model_id='titanic_survival_prediction',
    prebuilt_qwak_model=qwak_model_instance,  # Providing a trained instance to skip remote build
    tags=['prebuilt']
)
Fetching model code - Using given build ID - 248f3261-da2d-4fdc-8b7a-53f93f5c9909
Fetching model code - Found dependency type: PIP by file: main/requirements.txt
Fetching model code - Successfully fetched model code
Registering qwak build - 10%
Registering qwak build - 20%
Registering qwak build - 30%
Registering qwak build - 40%
Registering qwak build - 50%
Registering qwak build - 60%
Registering qwak build - 70%
Registering qwak build - 80%
Registering qwak build - 90%
Registering qwak build - 100%
Registering qwak build - Start remote build - 248f3261-da2d-4fdc-8b7a-53f93f5c9909
Registering qwak build - Remote build started successfully
Build ID 248f3261-da2d-4fdc-8b7a-53f93f5c9909 was triggered remotely
To follow build logs using Qwak platform:
https://app.qwak.ai/projects/176fc6e5-0725-42af-a574-415066c01a12/titanic_survival_prediction/build/248f3261-da2d-4fdc-8b7a-53f93f5c9909
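For automation, the build ID and the log URL can be pulled out of console output like the sample above. This is a minimal sketch that assumes the output was captured as text and that the line format matches the sample. The function name extract_build_info is hypothetical, and the SDK may expose the build ID in other ways that this document does not cover.

```python
import re

# Matches the "Build ID <uuid> was triggered remotely" line from the sample output.
BUILD_ID_PATTERN = re.compile(
    r"Build ID ([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}) was triggered remotely"
)
# Matches the first HTTPS URL found in the captured output.
URL_PATTERN = re.compile(r"https://\S+")


def extract_build_info(output):
    """Return (build_id, logs_url) found in captured build output.

    Either value is None when the corresponding line is absent.
    """
    id_match = BUILD_ID_PATTERN.search(output)
    url_match = URL_PATTERN.search(output)
    build_id = id_match.group(1) if id_match else None
    logs_url = url_match.group(0) if url_match else None
    return build_id, logs_url


# Sample text copied from the output shown above.
sample = """Build ID 248f3261-da2d-4fdc-8b7a-53f93f5c9909 was triggered remotely
To follow build logs using Qwak platform:
https://app.qwak.ai/projects/176fc6e5-0725-42af-a574-415066c01a12/titanic_survival_prediction/build/248f3261-da2d-4fdc-8b7a-53f93f5c9909
"""
build_id, logs_url = extract_build_info(sample)
print(build_id)
```

Parsing console text is fragile if the message wording changes, so treat this as a convenience for scripts rather than a stable interface.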
Build SDK configuration
The Build SDK supports the following configurable parameters:
Parameter | Required | Default Value | Description |
---|---|---|---|
model_id | Yes | | Model ID on the Qwak platform |
main_module_path | No | "main" | Path to the local folder where the model files exist |
dependencies_file | No | | Path to a Python dependencies file, in pip, poetry or conda format |
dependencies_list | No | | List of strict Python dependencies |
tags | No | | List of tags saved on the remote model build |
instance | No | "small" | Instance type used during the model build |
gpu_compatible | No | | Build the model using a GPU-compatible image |
run_tests | No | True | Run tests during the model build |
validate_build_artifact | No | True | Validate model deployment during the build phase |
validate_build_artifact_timeout | No | | Model validation timeout |
prebuilt_qwak_model | No | | Providing a prebuilt QwakModel instance skips the build phase and uses a pre-existing trained model version |
The following example uses some of the advanced features of the Build SDK.
The code snippet uses a medium instance to build the model, provides build tags and builds a GPU-compatible image.
from qwak import QwakClient
from qwak.model.tools import run_local
from titanic.main import TitanicSurvivalPrediction

# Train the model locally before registering it
qwak_model = TitanicSurvivalPrediction()
qwak_model.build()

# Creating an instance of the Qwak client
client = QwakClient()

client.build_model(
    model_id='titanic_survival_prediction',
    main_module_path='main',
    dependencies_file="requirements.txt",
    prebuilt_qwak_model=qwak_model,
    tags=['prebuilt', 'local'],
    instance="medium",
    gpu_compatible=True
)
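When several builds share the same settings, the parameters used in this document can be kept in one place and expanded into keyword arguments. The sketch below is an illustration only: BuildSettings is a hypothetical helper, not part of the Build SDK, and its defaults simply mirror the defaults listed in the table above. Parameters left as None are omitted so that the SDK's own defaults apply.

```python
from dataclasses import dataclass, fields
from typing import Any, List, Optional


@dataclass
class BuildSettings:
    """Hypothetical container for build_model parameters (not an SDK class)."""

    model_id: str
    main_module_path: str = "main"
    dependencies_file: Optional[str] = None
    tags: Optional[List[str]] = None
    instance: str = "small"
    gpu_compatible: Optional[bool] = None
    run_tests: bool = True
    validate_build_artifact: bool = True
    prebuilt_qwak_model: Any = None

    def to_kwargs(self):
        """Return all parameters whose value is not None."""
        return {
            f.name: getattr(self, f.name)
            for f in fields(self)
            if getattr(self, f.name) is not None
        }


settings = BuildSettings(
    model_id="titanic_survival_prediction",
    tags=["prebuilt", "local"],
    instance="medium",
)
kwargs = settings.to_kwargs()
# With a configured client, the build would then be triggered as:
# client.build_model(**kwargs)
```

Keeping the settings in a dataclass makes it easy to reuse one configuration across notebooks and CI jobs while overriding only the fields that differ.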
Unsupported parameters in Build SDK
The Build SDK supports most of the parameters that are supported in the Qwak CLI under qwak models build.
The following parameters are not supported:
Parameter | Description |
---|---|
environment | Qwak environment |
purchase-option | Receiving only the build id and any exception as return values (Depends on --programmatic in order to avoid UI output) |
deployment-instance | The instance size to automatically deploy the build after completion |
deploy | Automatically deploy build after completion |
json-logs | Return the live build logs as JSON |
param-list | Provide a list of parameters to the build |
main-dir | Change the name of the main model directory |
env-vars | Provide a list of environment variables |
base-image | Change the base image of the model build |
--cache / --no-cache | Use or disable docker cache |
git-credentials | Provide git credentials token |
git-credentials-secret | The git credentials secret |
git-branch | Use a different git branch |
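When moving an existing qwak models build invocation to the Build SDK, it can be useful to flag options that have no SDK equivalent. The following sketch checks a list of CLI-style arguments against the table above. The function name find_unsupported_options is hypothetical, and the set of names is copied from this document rather than read from the SDK.

```python
# Option names copied from the table of unsupported parameters above.
UNSUPPORTED_OPTIONS = {
    "environment", "purchase-option", "deployment-instance", "deploy",
    "json-logs", "param-list", "main-dir", "env-vars", "base-image",
    "cache", "no-cache", "git-credentials", "git-credentials-secret",
    "git-branch",
}


def find_unsupported_options(cli_args):
    """Return the option names in cli_args that the Build SDK does not support.

    Accepts arguments as they would appear on a command line, for example
    ["--instance", "medium", "--deploy"]. Only tokens starting with "-" are
    treated as options; values and positional arguments are ignored.
    """
    found = []
    for arg in cli_args:
        if not arg.startswith("-"):
            continue
        # Strip leading dashes and any "=value" suffix.
        name = arg.lstrip("-").split("=", 1)[0]
        if name in UNSUPPORTED_OPTIONS and name not in found:
            found.append(name)
    return found


print(find_unsupported_options(["--instance", "medium", "--deploy", "--git-branch=dev"]))
# prints ['deploy', 'git-branch']
```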