Automating Build & Deploy
Overview
Automating model build and deployment helps keep models accurate in production.
This action streamlines the build and deployment workflows. It keeps your model accurate by automatically retraining and deploying it based on a cron expression, a defined time interval, or metric-based triggers.
You can also define deployment conditions to verify that the new build meets your acceptance criteria before it replaces a currently deployed model.
Automation Example
Before Setting Up Automation
Before configuring automation, your model's code must be stored in a Git repository. We recommend first running a model build from the CLI to confirm that all necessary Git repository access is configured correctly. Make sure the JFrog ML model builds successfully from Git before proceeding with automation.
For additional details on building models from Git, refer to our Build Configurations page.
The automation fetches the model's code during the training process. When using a private repository, you must generate a Git access token and store it securely in the Secret Manager.
from qwak.automations import Automation, ScheduledTrigger, QwakBuildDeploy,\
    BuildSpecifications, BuildMetric, ThresholdDirection, DeploymentSpecifications

test_automation = Automation(
    name="retrain_my_model",
    model_id="my-model-id",
    # Weekly schedule: 00:00 every Sunday (UTC by default)
    trigger=ScheduledTrigger(cron="0 0 * * 0"),
    action=QwakBuildDeploy(
        build_spec=BuildSpecifications(git_uri="https://github.com/org_id/repository_name.git#dir_1/dir_2",
                                       # Name of the secret that holds the Git access token
                                       git_access_token_secret="token_secret_name",
                                       git_branch="main",
                                       main_dir="main",
                                       tags=["prod"],
                                       env_vars=["key1=val1", "key2=val2", "key3=val3"]),
        # Deploy only if the logged f1_score is above the threshold (a string)
        deployment_condition=BuildMetric(metric_name="f1_score",
                                         direction=ThresholdDirection.ABOVE,
                                         threshold="0.65"),
        deployment_spec=DeploymentSpecifications(number_of_pods=1,
                                                 cpu_fraction=2.0,
                                                 memory="2Gi",
                                                 variation_name="B")
    )
)
Scheduler Timezone
The default timezone for the cron scheduler is UTC.
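To make the UTC default concrete, the sketch below computes when the schedule from the example above, cron "0 0 * * 0", would next fire. It assumes standard five-field cron semantics, in which the expression means 00:00 every Sunday. The helper next_weekly_run is hypothetical and handles only this one expression; it is not a general cron parser and not part of the qwak SDK.

```python
from datetime import datetime, timedelta, timezone


def next_weekly_run(now: datetime) -> datetime:
    """Next Sunday 00:00 UTC strictly after `now` (cron "0 0 * * 0")."""
    # Convert to UTC first, because the scheduler evaluates cron in UTC.
    now = now.astimezone(timezone.utc)
    # datetime.weekday(): Monday is 0, Sunday is 6.
    days_ahead = (6 - now.weekday()) % 7
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    # If that midnight is not in the future, move to the following Sunday.
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate


# 10:00 in a UTC+2 zone is 08:00 UTC on Wednesday, 3 January 2024.
local = datetime(2024, 1, 3, 10, 0, tzinfo=timezone(timedelta(hours=2)))
print(next_weekly_run(local))  # 2024-01-07 00:00:00+00:00
```

If the retraining should start at local midnight instead, shift the hour field of the cron expression by your UTC offset.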
Build & deploy configuration
The QwakBuildDeploy action has three configuration parameters:
- build_spec: defines the location of the model code that we will build in the JFrog ML platform.
- deployment_condition: defines the metrics used to determine when to deploy the model after training.
- deployment_spec: specifies the runtime environment parameters for model deployment.
Metrics used to trigger build or deploy automations must be logged during the model build phase.
BuildSpecifications
To configure the automation build specification, we need a link to the Git repository.
Note that the link consists of two parts delimited by a hash sign (#):
- The repository URL
- The path within the repository
For example, when we use the link https://github.com/org_id/repository_name.git#dir_1/dir_2, the platform will clone the https://github.com/org_id/repository_name.git repository and change the working directory to dir_1/dir_2 before starting the build.
In this example, dir_1/dir_2 should be the directory containing the main and tests folders.
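The two-part link convention described above can be sketched in a few lines of Python. This only illustrates how the URL is read; it is not the platform's own parser, and the helper name split_git_uri is hypothetical.

```python
from typing import Tuple


def split_git_uri(git_uri: str) -> Tuple[str, str]:
    """Split a git_uri into (repository URL, path inside the repository).

    Hypothetical helper for illustration; not part of the qwak SDK.
    """
    # Everything before the first '#' is the repository URL;
    # everything after it is the working directory for the build.
    repo_url, _, sub_path = git_uri.partition("#")
    return repo_url, sub_path


repo_url, sub_path = split_git_uri(
    "https://github.com/org_id/repository_name.git#dir_1/dir_2"
)
print(repo_url)  # https://github.com/org_id/repository_name.git
print(sub_path)  # dir_1/dir_2
```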
Using private repositories
When using private repositories, we must also specify the access token or private key.
Because the JFrog ML platform doesn't allow plain-text tokens, we must store the access token in the JFrog ML Secret Manager and specify only the secret name.
When not using the default folder structure, in which main is the model's folder, we must also specify the Git branch and the directory containing the ML model.
As a best practice,
Custom resources
In the build specification, you may control the number of CPUs and the amount of memory, or use GPUs. See Instance Sizes.
Defining CPU resources:
resources=CpuResources(cpu_fraction=2, memory="2Gi")
Defining GPU resources:
resources=GpuResources(gpu_type="NVIDIA_K80", gpu_amount=1)
Alternatively, you can specify the instance type as opposed to fractions of resources. For example:
resources=ClientResources(instance='gpu.a10.8xl') #GPU
#OR
resources=ClientResources(instance='medium') #CPU
It is also possible to specify the IAM role used in production (assumed_iam_role) or a custom Docker image (base_image).
Environment variables
Additionally, we can specify the environment variables to configure in the build environment.
The environment variables should be specified in the env_vars field (a list), with each entry in the form key=value.
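Before passing a list to env_vars, it can help to check that every entry really has the key=value shape. The sketch below uses a hypothetical helper, parse_env_vars, which is not part of the qwak SDK. It assumes that only the first "=" separates key from value; how the platform treats values that contain "=" is not documented here.

```python
from typing import Dict, List


def parse_env_vars(env_vars: List[str]) -> Dict[str, str]:
    """Turn ["key=value", ...] into a dict, rejecting malformed entries.

    Hypothetical helper for validating a list before using it as env_vars.
    """
    parsed = {}
    for entry in env_vars:
        # Split on the first '=' only (an assumption), so values may contain '='.
        key, sep, value = entry.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {entry!r}")
        parsed[key] = value
    return parsed


print(parse_env_vars(["key1=val1", "key2=val2", "key3=val3"]))
# {'key1': 'val1', 'key2': 'val2', 'key3': 'val3'}
```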
The model's code must log the metric that describes the model's performance. We will use the metric in the deployment condition. If you don't know how to do it, look at our Logging and Monitoring Guide.
Disable Push Image
It is possible to disable the push image phase in cases where you don't want the final build saved to the Docker repository. You can do that by adding push_image=False to the BuildSpecifications.
BuildMetric
During the build process, it is common to log metrics such as accuracy, F1 score, or loss. When executing the automation, these logged values may be compared against a specified threshold.
For each metric, it is possible to define whether the value should be above or below the threshold. Once this condition is met, the JFrog ML platform will proceed to deploy the model.
The BuildMetric object has three parameters:
- metric_name: The name of the metric we logged during the build phase.
- direction: Whether the value should be above or below the threshold. The valid values are ThresholdDirection.ABOVE and ThresholdDirection.BELOW.
- threshold: The threshold used for comparison.
The threshold must always be a string: threshold="0.65" is valid, whereas threshold=0.65 is invalid.
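The comparison a deployment condition expresses can be sketched as follows. This is a minimal illustration under stated assumptions, not the platform's implementation: Direction is a local stand-in for ThresholdDirection, passes_condition is a hypothetical helper, and the comparison is assumed to be strict (how the platform treats a metric exactly equal to the threshold is not stated here).

```python
from enum import Enum


class Direction(Enum):
    """Local stand-in for ThresholdDirection, for illustration only."""
    ABOVE = "above"
    BELOW = "below"


def passes_condition(metric_value: float, direction: Direction, threshold: str) -> bool:
    """Return True when the logged metric is on the required side of the threshold."""
    # Mirror the documented rule that the threshold is passed as a string.
    if not isinstance(threshold, str):
        raise TypeError("threshold must be a string, e.g. '0.65'")
    limit = float(threshold)
    # Assumption: strict comparison; the platform may treat equality differently.
    if direction is Direction.ABOVE:
        return metric_value > limit
    return metric_value < limit


print(passes_condition(0.71, Direction.ABOVE, "0.65"))  # True
print(passes_condition(0.60, Direction.ABOVE, "0.65"))  # False
```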
Dynamic threshold
To use a dynamic threshold, we can use a SQL expression as the threshold value.
In this case, the JFrog ML platform will run the SQL query in JFrog ML Model Analytics and compare the model's metric with the threshold produced by the SQL query.
The query must return a single row containing only one column!
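The single-row, single-column requirement can be checked locally before using a query as a threshold. The sketch below uses an in-memory SQLite database purely as a stand-in for JFrog ML Model Analytics. The table past_builds, its column f1_score, and the helper scalar_threshold are all hypothetical, and the SQL dialect of the real analytics store may differ.

```python
import sqlite3


def scalar_threshold(conn: sqlite3.Connection, query: str) -> str:
    """Run `query` and return its single value as a string threshold."""
    rows = conn.execute(query).fetchall()
    # The query must produce exactly one row with exactly one column.
    if len(rows) != 1 or len(rows[0]) != 1:
        raise ValueError("query must return a single row with one column")
    return str(rows[0][0])


# In-memory SQLite stands in for the analytics store; the table and
# column names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE past_builds (f1_score REAL)")
conn.executemany("INSERT INTO past_builds VALUES (?)", [(0.61,), (0.68,), (0.65,)])

print(scalar_threshold(conn, "SELECT MAX(f1_score) FROM past_builds"))  # 0.68
```

A query such as SELECT f1_score FROM past_builds would be rejected by this check because it returns three rows.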
DeploymentSpecifications
After we build the model, compare its performance with the threshold, and conclude that the model is ready to be deployed, the platform uses the deployment specification to configure the model's runtime environment.
We may specify:
Parameter | Details |
---|---|
number_of_http_server_workers | The number of threads used by the HTTP server. |
http_request_timeout_ms | The request timeout. |
daemon_mode | Should gunicorn process be daemonized, which makes the workers work in the background. |
custom_iam_role_arn | The IAM role used in production |
max_batch_size | Max batch size of record |
deployment_process_timeout_limit | The timeout for the deployment (in seconds) |
number_of_pods | The number of instances to be deployed |
cpu_fraction | The CPU cores for Kubernetes. |
memory | The amount of RAM |
variation_name | The variant name if we run an A/B test. |
auto_scale_config | The autoscaling configuration for Kubernetes |
min_replica_count | The minimum number of replicas the resource will be scaled down to |
max_replica_count | The maximum number of replicas of the target resource |
polling_interval | This is the interval to check each trigger on. By default it's every 30 seconds |
cool_down_period | The period to wait after the last trigger reported active before scaling the resource back to 0. By default it's 5 minutes (300 seconds). |
prometheus_trigger | metric_type: the type of the metric (cpu/gpu/memory/latency); aggregation_type: the type of the aggregation (min/max/avg/sum); time_period: the period the query runs over; threshold: the value at which scaling starts |
Environments | List of environment names to deploy to |
Defining auto scaling
When we want to define an auto-scaling policy for our deployment, we have to use the following pattern:
auto_scale_config = AutoScalingConfig(
    min_replica_count=1,
    max_replica_count=10,
    polling_interval=30,   # seconds between trigger checks
    cool_down_period=300,  # seconds to wait after the last active trigger
    triggers=[
        AutoScalingPrometheusTrigger(
            query_spec=AutoScaleQuerySpec(
                aggregation_type="max",
                metric_type="latency",
                time_period=4),
            threshold=60
        )
    ]
)