As an example, we will use the well-known Iris Classifier SVM model, which typically looks as follows:
    # train.py
    from sklearn import svm
    from sklearn import datasets

    # Load training data
    iris = datasets.load_iris()
    X, y = iris.data, iris.target

    # Model Training
    clf = svm.SVC(gamma='scale')
    clf.fit(X, y)
Below, we show the default (and the recommended) structure of a model project.
The Qwak platform offers multiple ways to customize the directory structure. We show them in other build tutorials.
    qwak_based_model/
    ├── main/
    │   ├── __init__.py       # Required for exporting model from main
    │   ├── model.py          # Qwak model definition.
    │   ├── util.py           # Represents any helper/util python modules.
    │   ├── <dependencies>    # Dependencies file - Poetry/PyPI/Conda
    ├── tests/
    │   ├── ut/               # Unit tests
    │   │   ├── sample_test.py
    │   ├── it/               # Integration tests
    │   │   ├── sample_test.py
First, we will generate the directory structure for our Qwak-based model as described above.
To do so, you can use the following command:
    qwak models init \
        --model-directory <model-dir-path> \
        --model-class-name <model-class> \
        <dest>
<model-dir-path>: Model directory name.
<model-class>: Qwak-based model class name (Camel case)
<dest>: Destination path on local host.
As an example, we will use the following command:
    qwak models init \
        --model-directory iris_model \
        --model-class-name IrisClassifier \
        ~/
That will create a new directory named iris_model in the user's home directory.
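To make the generated layout concrete, the following is a minimal Python sketch that recreates the same skeleton by hand in a temporary directory. It only illustrates the structure described above and is not what `qwak models init` actually runs. The helper name `create_skeleton` is hypothetical, and the dependency file is left out because its name depends on the package manager you choose.

```python
import tempfile
from pathlib import Path

# Relative paths of the default project structure described above.
SKELETON_FILES = [
    "main/__init__.py",
    "main/model.py",
    "main/util.py",
    "tests/ut/sample_test.py",
    "tests/it/sample_test.py",
]


def create_skeleton(dest: Path, model_dir: str) -> Path:
    """Create empty placeholder files mirroring the default layout (hypothetical helper)."""
    root = dest / model_dir
    for relative in SKELETON_FILES:
        path = root / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    return root


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = create_skeleton(Path(tmp), "iris_model")
        # List the created files relative to the model root.
        for path in sorted(root.rglob("*.py")):
            print(path.relative_to(root))
```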
main is the most important directory of a Qwak project.
Everything that is supposed to be part of the model artifact should be located in it.
The first step is creating a model class, which defines the two mandatory functions:
build - defines the model training / loading logic, invoked at build time.
predict - defines the serving logic, invoked on every inference request.
And two optional ones:
schema - defines the model interface: the input and the output of your model.
initialize_model - invoked when the model is loaded during the serving container initialization.
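The division of labor between these methods can be sketched with a plain Python stand-in. The class below does not use the Qwak SDK. `LifecycleDemo` and the `simulate` driver are hypothetical names, and the call order is an assumption based only on the descriptions above: `build` once at build time, `initialize_model` once when the serving container starts, and `predict` once per request.

```python
class LifecycleDemo:
    """Stand-in model that records the order in which its hooks are called."""

    def __init__(self):
        self.calls = []
        self._threshold = None

    def build(self):
        # Training / loading logic would go here (build time).
        self.calls.append("build")
        self._threshold = 2.5

    def initialize_model(self):
        # One-time setup when the serving container loads the model.
        self.calls.append("initialize_model")

    def predict(self, rows):
        # Serving logic, run for every inference request.
        self.calls.append("predict")
        return ["long" if value > self._threshold else "short" for value in rows]


def simulate(model, requests):
    """Hypothetical driver: build once, initialize once, predict per request."""
    model.build()
    model.initialize_model()
    return [model.predict(rows) for rows in requests]


if __name__ == "__main__":
    demo = LifecycleDemo()
    print(simulate(demo, [[1.4], [4.7, 5.1]]))
    print(demo.calls)
```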
Read more about Qwak's model class methods and how they can be used in the dedicated section.
For example, we can implement the Iris classifier in the following way:
    import pandas as pd
    from sklearn import svm, datasets
    from qwak import api, QwakModelInterface
    from qwak.model.schema import ModelSchema, Prediction, ExplicitFeature


    class IrisClassifier(QwakModelInterface):

        def __init__(self):
            self._gamma = 'scale'
            self._model = None

        def build(self):
            iris = datasets.load_iris()
            X, y = iris.data, iris.target
            clf = svm.SVC(gamma=self._gamma)
            self._model = clf.fit(X, y)

        @api()
        def predict(self, df: pd.DataFrame) -> pd.DataFrame:
            return pd.DataFrame(data=self._model.predict(df), columns=['species'])

        def schema(self):
            return ModelSchema(
                features=[
                    ExplicitFeature(name="sepal_length", type=float),
                    ExplicitFeature(name="sepal_width", type=float),
                    ExplicitFeature(name="petal_length", type=float),
                    ExplicitFeature(name="petal_width", type=float)
                ],
                predictions=[
                    Prediction(name="species", type=str)
                ])
The main directory should be a valid Python package, meaning it should include an __init__.py file. This file lets the Python interpreter know that the directory contains code for a Python module. The file must exist, but Python does not require it to have any content; it is typically used for initialization code. Here, it should set up the imports for the Qwak model class so that the model build process picks it up, in one of two ways:
from .model import IrisClassifier
    from .model import IrisClassifier


    def load_model():
        return IrisClassifier()
The load_model function gives more control over how the model class is initialized during the build process.
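As a sketch of the kind of control this gives, the `load_model` function below reads a setting from an environment variable before constructing the model. The `IrisClassifier` here is a simplified stand-in that accepts a `gamma` argument, and `IRIS_GAMMA` is a hypothetical variable name chosen for illustration; neither is part of the Qwak SDK.

```python
import os


class IrisClassifier:
    """Simplified stand-in for the model class (not the Qwak-based one)."""

    def __init__(self, gamma="scale"):
        self.gamma = gamma


def load_model():
    # IRIS_GAMMA is a hypothetical environment variable; default to 'scale'.
    raw = os.environ.get("IRIS_GAMMA", "scale")
    # Keep the named presets as strings, parse anything else as a number.
    gamma = raw if raw in ("scale", "auto") else float(raw)
    return IrisClassifier(gamma=gamma)


if __name__ == "__main__":
    os.environ["IRIS_GAMMA"] = "0.1"
    print(load_model().gamma)
```

A plain `from .model import IrisClassifier` leaves no room for this kind of setup, which is the main reason to prefer the function form when the constructor needs arguments.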
Most projects depend on external packages to build and run correctly. Qwak downloads and links the dependencies at build time, based on a Python virtual environment that is used for both the build and serving contexts.
Qwak supports the following types of dependency descriptors. Pick one! Do not include multiple dependency configuration files at once.
qwak-sdk does not need to be added as an explicit dependency. During build time, the version of the Qwak SDK used to build the model is automatically injected.
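The "pick one" rule can be checked locally before starting a build. The sketch below is a hypothetical pre-flight helper, not part of the Qwak tooling. It assumes the three descriptor file names are conda.yaml, requirements.txt, and pyproject.toml, and it raises an error unless exactly one of them is present.

```python
import tempfile
from pathlib import Path

# File names assumed for the three descriptor types (Conda, pip, Poetry).
DESCRIPTORS = ("conda.yaml", "requirements.txt", "pyproject.toml")


def find_descriptor(main_dir: Path) -> str:
    """Return the single dependency descriptor found in main_dir, or raise."""
    found = [name for name in DESCRIPTORS if (main_dir / name).is_file()]
    if len(found) != 1:
        raise ValueError(f"expected exactly one dependency file, found: {found}")
    return found[0]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        main_dir = Path(tmp)
        (main_dir / "requirements.txt").write_text("scikit-learn==1.0.1\n")
        print(find_descriptor(main_dir))
```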
conda.yaml should be stored in the main directory. For example:
    name: iris-classifier
    channels:
      - defaults
      - conda-forge
    dependencies:
      - python=3.8
      - pip=20.0.3
      - scikit-learn=1.0.1
requirements.txt should be stored in the main directory.
With Poetry, the main directory should contain the pyproject.toml file. For example:
    [tool.poetry]
    name = "iris-classifier"
    version = "0.1.0"
    description = ""
    authors = ["Your Name <you@example.com>"]

    [tool.poetry.dependencies]
    python = "^3.8"
    scikit-learn = "1.0.1"

    [tool.poetry.dev-dependencies]

    [build-system]
    requires = ["poetry-core>=1.0.0"]
    build-backend = "poetry.core.masonry.api"
It's also possible to store a Qwak-compatible model in a Python package and add it as a dependency to the Qwak model. We have to implement the model class and build it as a Python package. Note that our package needs qwak-sdk as its dependency.
In our example, we assume that we have created a package called importqwakmodelfrompackage that contains a submodule model with the TestModel class. The TestModel class implements QwakModelInterface.
Later, when we configure the Qwak model:
- Create a new Qwak model using the qwak models init command.
- Add the package to the dependencies.
For example, if we use pip to manage dependencies and the whl file is stored in a GitHub repository, we must add a line referencing that whl file to our requirements.txt file.
If we use a different package manager, we should follow its instructions for adding whl files or private package repositories as dependencies.
We must remove the content of the model.py file. Then we import the Qwak model class (the one added as a dependency in pip) and implement the load_model function to return the class (not an instance of the class!).
    from importqwakmodelfrompackage.model import TestModel


    def load_model():
        return TestModel
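The difference between returning the class and returning an instance is easy to get wrong, so here is a small, SDK-free sketch of it. `TestModel` below is an empty stand-in for the packaged class, and `returns_class` is a hypothetical helper that only inspects what a loader hands back. It makes no claim about how Qwak consumes the result.

```python
import inspect


class TestModel:
    """Empty stand-in for the class imported from the package."""


def load_model():
    # Return the class itself, without calling it.
    return TestModel


def load_model_wrong():
    # Returns an instance, which is not what this setup asks for.
    return TestModel()


def returns_class(loader) -> bool:
    """True if the loader returns a class rather than an instance."""
    return inspect.isclass(loader())


if __name__ == "__main__":
    print(returns_class(load_model), returns_class(load_model_wrong))
```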
The tests directory is where the tests of each model component reside.
The unit test files (for example, tests/ut/sample_test.py) define unit tests that run during the build process. For example, if we define a helper function in <model-dir>/main/util.py as follows:
    def add(x, y):
        return x + y
Then we can define the following test:
    from main.util import add


    # pytest only collects functions whose names start with "test"
    def test_add():
        assert add(3, 2) == 5
The integration test files (for example, tests/it/sample_test.py) define integration tests that run during the build process, after the serving container is built and initialized.
During the integration tests, a real model deployment is running, and you can perform inference using a pytest fixture that is auto-configured with a client that can be invoked against the model currently being built.
    import pandas as pd
    from qwak_mock import real_time_client


    def test_iris_classifier(real_time_client):
        feature_vector = [[5.1, 3.5, 1.4, 0.2]]
        iris_type = real_time_client.predict(feature_vector)
        assert iris_type == 1
- File operations - The current working directory during build execution is the root directory of the model. For example, if a file is located at ./main/sample.txt and you want to read it, simply open it at that path, where . represents the model root directory.
- Model fields - Model fields should be objects that can be pickled. An S3 client, for example, cannot be pickled because its session must remain active.
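One way to respect the pickling constraint is to check the model's fields up front and to create non-picklable resources late. The sketch below uses only the standard library: `threading.Lock` stands in for a live client such as an S3 session, and `is_picklable` and `ModelWithClient` are hypothetical names. Creating the resource in `initialize_model` follows from the description of that method earlier on this page; treat it as a pattern to verify rather than documented Qwak behavior.

```python
import pickle
import threading


def is_picklable(obj) -> bool:
    """Return True if obj can be serialized with pickle.dumps."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        return False


class ModelWithClient:
    def __init__(self):
        self._weights = None
        self._client = None  # stays None until the model is loaded for serving

    def build(self):
        # Only plain, picklable state is set at build time.
        self._weights = [0.1, 0.2, 0.3]

    def initialize_model(self):
        # The lock stands in for a live, non-picklable client.
        self._client = threading.Lock()


if __name__ == "__main__":
    model = ModelWithClient()
    model.build()
    print(is_picklable(vars(model)))  # fields as they are after build
    model.initialize_model()
    print(is_picklable(vars(model)))  # fields once the live resource is attached
```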
By default, the Qwak SDK copies only the main and tests directories from the build directory and no others.
However, it is possible to modify this behavior by adding the --dependency_required_folders parameter to the model build command.
    qwak models build --model-id your_model_id --dependency_required_folders additional_dir .

    qwak models build --model-id your_model_id --dependency_required_folders additional_dir --dependency_required_folders some_other_dir .
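When the list of extra folders is produced by a script, the repeated flag can be assembled programmatically. The helper below is a hypothetical convenience (`build_command` is not part of the Qwak CLI or SDK). It uses only the flags shown above and prints the resulting command instead of running it.

```python
import shlex


def build_command(model_id, extra_dirs, build_path="."):
    """Assemble a `qwak models build` command with one flag per extra folder."""
    command = ["qwak", "models", "build", "--model-id", model_id]
    for directory in extra_dirs:
        # The flag is repeated once for every additional directory.
        command += ["--dependency_required_folders", directory]
    command.append(build_path)
    return command


if __name__ == "__main__":
    args = build_command("your_model_id", ["additional_dir", "some_other_dir"])
    print(shlex.join(args))
```

Returning a list rather than a single string keeps the result safe to pass to `subprocess.run` without shell quoting issues.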