Automating Build & Deploy

Overview

Automating model build and deployment helps keep models accurate in production.

This action streamlines the build and deployment workflows. It keeps your model accurate by automatically retraining and deploying it based on a cron expression, a defined time interval, or metric-based triggers.

You can also define a deployment condition to verify that the new build meets your acceptance criteria before it replaces the currently deployed model.

Automation Example

❗️
Before configuring the automation, it is essential to store your model's code in a Git repository.

The automation fetches the model's code during the training process. If you use a private repository, generate a Git access token and store it securely in the Qwak Secret Service.

from qwak.automations import Automation, ScheduledTrigger, QwakBuildDeploy,\
    BuildSpecifications, BuildMetric, ThresholdDirection, DeploymentSpecifications

test_automation = Automation(
    name="retrain_my_model",
    model_id="my-model-id",
    trigger=ScheduledTrigger(cron="0 0 * * 0"),
    action=QwakBuildDeploy(
        build_spec=BuildSpecifications(git_uri="https://github.com/org_id/repository_name.git#dir_1/dir_2",
                                       git_access_token_secret="token_secret_name",
                                       git_branch="main",
                                       main_dir="main",
                                       tags=["prod"],
                                       env_vars=["key1=val1", "key2=val2", "key3=val3"]),
        deployment_condition=BuildMetric(metric_name="f1_score",
                                         direction=ThresholdDirection.ABOVE,
                                         threshold="0.65"),
        deployment_spec=DeploymentSpecifications(number_of_pods=1,
                                                 cpu_fraction=2.0,
                                                 memory="2Gi",
                                                 variation_name="B")
    )
)
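The cron value passed to ScheduledTrigger above has five space-separated fields. As a reading aid, the sketch below labels them, assuming the standard five-field cron layout (minute, hour, day of month, month, day of week). describe_cron is a hypothetical helper written for this page and is not part of the Qwak SDK.

```python
# Hypothetical helper for illustration only; not part of the Qwak SDK.
# Assumes the standard five-field cron layout.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression: str) -> dict:
    """Map each field of a five-field cron expression to its name."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))


fields = describe_cron("0 0 * * 0")
print(fields)
```

Under that layout, "0 0 * * 0" reads as minute 0 of hour 0 on day-of-week 0, that is, midnight every Sunday.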

Build & deploy configuration

The QwakBuildDeploy action has three configuration parameters:

build_spec defines the location of the model code that we will build in the Qwak platform.
deployment_condition defines the metrics used to determine when to deploy the model after the training.
deployment_spec specifies the runtime environment parameters for model deployment.

❗️
Metrics used to trigger build or deploy automations must be logged during the model build phase.

`BuildSpecifications`

To configure the automation build specification, we need a link to the git repository.

Note that the link consists of two parts separated by a hash sign (#):

The repository URL
The path within the repository

For example, when we use this link: https://github.com/org_id/repository_name.git#dir_1/dir_2

The platform will clone the https://github.com/org_id/repository_name.git repository and change the working directory to dir_1/dir_2 before starting the build.

In this example, dir_1/dir_2 should be the directory containing the main and tests folders.
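To make the two-part link format concrete, here is a minimal sketch of how such a link can be split into the repository URL and the path inside the repository. split_git_uri is a hypothetical helper used only for illustration; the platform's own parsing may differ.

```python
# Illustrative sketch; the platform's own parsing may differ.
def split_git_uri(git_uri: str) -> tuple:
    """Split 'repo_url#sub/dir' into (repo_url, sub_dir).

    If there is no '#', sub_dir is an empty string.
    """
    repo_url, _, sub_dir = git_uri.partition("#")
    return repo_url, sub_dir


repo_url, sub_dir = split_git_uri(
    "https://github.com/org_id/repository_name.git#dir_1/dir_2"
)
print(repo_url)  # https://github.com/org_id/repository_name.git
print(sub_dir)   # dir_1/dir_2
```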

Using private repositories

When using private repositories, we must also specify the access token.

Because the Qwak platform doesn't allow plain-text tokens, we must store the access token in the Qwak Secret Manager and specify only the secret name.

When not using the default folder structure, in which main is the model's folder, we must also specify the git branch and the directory containing the ML model.

Custom resources

In the build specification, you may control the number of CPUs and the amount of memory, or use GPUs (Remote build Resources).

Defining CPU resources:

resources=CpuResources(cpu_fraction=2, memory="2Gi")

Defining GPU resources:

resources=GpuResources(gpu_type="NVIDIA_K80", gpu_amount=1)

It is also possible to specify the IAM role used in production (assumed_iam_role) or a custom Docker image (base_image).

Environment variables

Additionally, we can specify the environment variables to configure in the build environment.

The environment variables should be specified in the env_vars field as a list of strings, each in the following form:
key=value.
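As a quick way to check an env_vars list before putting it in the build specification, the sketch below parses each key=value entry into a dictionary. parse_env_vars is a hypothetical helper, and splitting on the first "=" only is an assumption about how values that contain "=" should be treated.

```python
# Hypothetical validation helper; not part of the Qwak SDK.
def parse_env_vars(env_vars: list) -> dict:
    """Turn ['key=value', ...] into {'key': 'value', ...}."""
    parsed = {}
    for item in env_vars:
        # Split on the first '=' only, so values may themselves contain '='.
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {item!r}")
        parsed[key] = value
    return parsed


print(parse_env_vars(["key1=val1", "key2=val2", "key3=val3"]))
```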

The model's code must log the metric that describes the model's performance. We will use the metric in the deployment condition. If you don't know how to do it, look at our Logging and Monitoring Guide.

Disable Push Image

It is possible to disable the push image phase when you don't want the final build saved to the Docker repository. To do that, add push_image=False to the BuildSpecifications.

`BuildMetric`

During the build process, it is common to log metrics such as accuracy, F1 score, or loss. When executing the automation, these logged values may be compared against a specified threshold.

For each metric, it is possible to define whether the value should be above or below the threshold. Once this condition is met, the Qwak platform will proceed to deploy the model.

The BuildMetric object has three parameters:

metric_name: The metric name we logged during the build phase
direction: Whether the value should be above or below the threshold. The valid values are ThresholdDirection.ABOVE and ThresholdDirection.BELOW
threshold: The threshold used for comparison

❗️
The threshold must always be a string, where threshold="0.65" is a valid threshold and threshold=0.65 is invalid!
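The comparison described above can be sketched in a few lines. This is not the platform's implementation: Direction is a hypothetical stand-in for ThresholdDirection, condition_met is a made-up name, and treating the comparison as strict (rather than inclusive) is an assumption.

```python
from enum import Enum


class Direction(Enum):
    """Hypothetical stand-in for ThresholdDirection."""
    ABOVE = "above"
    BELOW = "below"


def condition_met(metric_value: float, direction: Direction, threshold: str) -> bool:
    """Compare a logged metric against a string threshold."""
    if not isinstance(threshold, str):
        raise TypeError("threshold must be a string, e.g. '0.65'")
    limit = float(threshold)
    # Assumption: the comparison is strict in both directions.
    if direction is Direction.ABOVE:
        return metric_value > limit
    return metric_value < limit


print(condition_met(0.71, Direction.ABOVE, "0.65"))  # True
print(condition_met(0.60, Direction.ABOVE, "0.65"))  # False
```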

Dynamic threshold

To use a dynamic threshold, we can use a SQL expression as the threshold value.

In this case, the Qwak platform will run the SQL query in Qwak Model Analytics and compare the model's metric with the threshold produced by the SQL query.

The query must return a single row containing only one column!
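The single-row, single-column requirement can be checked locally before using a query as a threshold. The sketch below uses an in-memory SQLite database purely as a stand-in for Qwak Model Analytics; the past_builds table, its f1_score column, and the single_value helper are all hypothetical.

```python
import sqlite3


def single_value(rows: list):
    """Return the only cell of a result set, or raise if the shape is wrong."""
    if len(rows) != 1 or len(rows[0]) != 1:
        raise ValueError("query must return exactly one row with one column")
    return rows[0][0]


# SQLite stands in for the analytics store; table and data are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE past_builds (f1_score REAL)")
conn.executemany("INSERT INTO past_builds VALUES (?)", [(0.5,), (0.75,)])

# An aggregate without GROUP BY yields one row with one column.
rows = conn.execute("SELECT AVG(f1_score) FROM past_builds").fetchall()
threshold = str(single_value(rows))
print(threshold)
conn.close()
```

A query such as SELECT f1_score FROM past_builds would return two rows here, so single_value would reject it.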

`DeploymentSpecifications`

After we have built the model, compared its performance with the threshold, and concluded that the model is ready to be deployed, the platform uses the deployment specification to configure the model's runtime environment.

We may specify:

Parameter	Details
number_of_http_server_workers	The number of threads used by the HTTP server.
http_request_timeout_ms	The request timeout (in milliseconds).
daemon_mode	Whether the gunicorn process should be daemonized, so that the workers run in the background.
custom_iam_role_arn	The IAM role used in production.
max_batch_size	The maximum batch size of records.
deployment_process_timeout_limit	The timeout for the deployment (in seconds).
number_of_pods	The number of instances to be deployed.
cpu_fraction	The number of CPU cores for Kubernetes.
memory	The amount of RAM.
variation_name	The variation name if we run an A/B test.
auto_scale_config	The autoscaling configuration for Kubernetes.
min_replica_count	The minimum number of replicas the resource will be scaled down to.
max_replica_count	The maximum number of replicas of the target resource.
polling_interval	The interval at which each trigger is checked. The default is every 30 seconds.
cool_down_period	The period to wait after the last trigger reported active before scaling the resource back to 0. The default is 5 minutes (300 seconds).
prometheus_trigger	The scaling trigger. metric_type: the type of the metric (cpu/gpu/memory/latency); aggregation_type: the type of the aggregation (min/max/avg/sum); time_period: the period the query covers; threshold: the value at which scaling starts.
environments	The list of environment names to deploy to.

Defining auto scaling

When we want to define an auto-scaling policy for our deployment, we have to use the following pattern:

auto_scale_config = AutoScalingConfig(min_replica_count=1,
                                      max_replica_count=10,
                                      polling_interval=30,
                                      cool_down_period=300,
                                      triggers=[
                                          AutoScalingPrometheusTrigger(
                                              query_spec=AutoScaleQuerySpec(
                                                  aggregation_type="max",
                                                  metric_type="latency",
                                                  time_period=4),
                                              threshold=60
                                          )
                                      ]
                                      )
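To illustrate how aggregation_type and threshold interact, the sketch below aggregates a list of metric samples and compares the result with the threshold. It is not how the platform evaluates triggers: trigger_fires and the sample values are hypothetical, and the strict "greater than" comparison is an assumption.

```python
from statistics import mean

# The four aggregation types listed in the parameter table.
AGGREGATIONS = {"min": min, "max": max, "avg": mean, "sum": sum}


def trigger_fires(samples: list, aggregation_type: str, threshold: float) -> bool:
    """Aggregate the samples and check whether they exceed the threshold."""
    aggregated = AGGREGATIONS[aggregation_type](samples)
    # Assumption: scaling starts when the aggregate is strictly above the threshold.
    return aggregated > threshold


latencies = [42.0, 58.0, 73.0]  # made-up latency samples
print(trigger_fires(latencies, "max", 60))  # True
print(trigger_fires(latencies, "avg", 60))  # False
```

With these samples, the "max" aggregation reacts to the single slow request, while "avg" does not, which is one reason to choose the aggregation type deliberately.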