Storage-Based Execution

The low-level API of the batch execution allows you to start an execution from a specific remote path and write the results to another path.

Each file in the input path is translated to a single unit of processing (a Qwak task). Each input file is read and passed as a single chunk to the model predict function.

🚧

Task Processing

Because each file is converted into a single prediction request, you must ensure that the model is deployed on an instance with sufficient resources to handle the batch request.

Where batch predictions are resource-heavy, consider splitting the input dataset into multiple smaller files.
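
One way to split a large CSV input before uploading it is sketched below, using only the Python standard library. The function name `split_csv`, the part-file naming scheme, and the `rows_per_file` parameter are illustrative assumptions, not part of the Qwak SDK. Each part repeats the header row so that it can be processed as an independent task.

```python
import csv
from pathlib import Path


def split_csv(input_path, output_dir, rows_per_file):
    """Split one CSV into numbered part files, repeating the header in each.

    Hypothetical helper (not part of the Qwak SDK). Returns the part paths.
    """
    if rows_per_file < 1:
        raise ValueError("rows_per_file must be at least 1")
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(input_path).stem
    written = []

    with open(input_path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader, None)
        if header is None:
            return written  # empty input: nothing to split

        out, writer, rows_in_part = None, None, 0
        for row in reader:
            # Start a new part file when none is open or the current one is full.
            if out is None or rows_in_part == rows_per_file:
                if out is not None:
                    out.close()
                part_path = output_dir / f"{stem}_part{len(written):04d}.csv"
                out = open(part_path, "w", newline="")
                writer = csv.writer(out)
                writer.writerow(header)
                written.append(part_path)
                rows_in_part = 0
            writer.writerow(row)
            rows_in_part += 1
        if out is not None:
            out.close()
    return written


# Example call (paths are placeholders):
# split_csv("inference_input.csv", "inference_input_parts", rows_per_file=50_000)
```

The resulting part files can then be uploaded to the source folder, where each one becomes a separate task.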

Execution Configuration

| Parameter | Description | Default Value |
| --- | --- | --- |
| Model ID [Required] | The Model ID, as displayed on the model header. | |
| Build ID | The Qwak-assigned build ID. You can optionally add this to use a different build to perform the execution. | |
| Bucket | The source and destination bucket. If you read and write to the same bucket, you can specify this, but it is not mandatory. Note: This parameter is required if 'Source Bucket' and 'Destination Bucket' are not set. | |
| Source bucket | The bucket from which the input files are read to start an execution. Note: This parameter is required if 'Bucket' is not set. | |
| Destination bucket | The bucket into which the execution output files are written. Note: This parameter is required if 'Bucket' is not set. | |
| Source folder [Required] | The path in the source bucket where all the inference files are located. | |
| Destination Folder [Required] | The path in the destination bucket where the result files are stored. | |
| Input File Type | The type of the input files. The supported formats are: CSV, Parquet and Feather. | CSV |
| Output File Type | The type of the files stored by Qwak. The supported formats are: CSV, Parquet and Feather. | CSV |
| Access Token Name | The name of the secret (created using our Secret Service) that contains the Access Token with permission to the source and destination buckets. | |
| Access Secret Name | The name of the secret (created using our Secret Service) that contains the Access Secret with permission to the source and destination buckets. | |
| Job Timeout | The job timeout, in seconds. Setting the job timeout limits the execution time; the execution fails if it does not complete in time. | 0 - No Timeout |
| File Timeout | A single file timeout, in seconds. Setting this limits the processing time for a single file, and fails the entire execution if one of the files does not finish in time. | 0 - No Timeout |
| IAM role ARN | The user-provided AWS custom IAM role. | None |
| Pods | The number of k8s pods used at batch inference time. The number of pods sets the maximum parallelism for an inference job. Each pod handles one or more files/tasks. This configuration takes precedence over the deployed model configuration. | The number of executors/pods defined in the deployment. |
| CPU fraction | The CPU fraction that is allocated to the pod. The CPU resource is measured in CPU units. One CPU, in Qwak, is equivalent to 1 AWS vCPU, 1 GCP Core, 1 Azure vCore, or 1 Hyperthread on a bare-metal Intel processor with Hyperthreading. This configuration takes precedence over the deployed model configuration. | The number of CPUs defined in the deployment. |
| Memory | The amount of RAM (MB) to allocate to each pod. This configuration takes precedence over the deployed model configuration. | The amount of memory defined in the deployment. |
| GPU Type | The GPU type to use in batch execution. Supported options are NVIDIA K80, NVIDIA Tesla V100, and NVIDIA T4. | None |
| GPU Amount | The number of GPUs available for the batch execution. Varies based on the selected GPU type. | Based on GPU type |
| Parameters | A list of parameters expressed as key-value pairs, which are passed to the execution request. The parameters are passed to the inference container as environment variables. | |
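
Because the parameters reach the inference container as environment variables, model code can read them with `os.environ`. The sketch below assumes that each key is exposed under its own name, which the text above does not confirm. The helper `read_execution_parameter` and the `SCORE_THRESHOLD` key are hypothetical.

```python
import os


def read_execution_parameter(name, default=None, cast=str):
    """Return an execution parameter from the environment, or `default` if unset.

    Hypothetical helper: assumes the parameter key is used as the variable name.
    """
    raw = os.environ.get(name)
    if raw is None:
        return default
    # Environment variables are always strings, so convert explicitly.
    return cast(raw)


# Example: a numeric threshold passed as {"SCORE_THRESHOLD": "0.8"} (hypothetical key).
threshold = read_execution_parameter("SCORE_THRESHOLD", default=0.5, cast=float)
```

Environment variables are always strings, so numeric or boolean parameters need an explicit conversion such as the `cast` argument shown here.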
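
Example trust policy for a custom IAM role. It allows the Qwak service account (system:serviceaccount:qwak:kube-deployment-captain-access) to assume the role through the EKS OIDC provider:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::<AWS_ACCOUNT_ID>:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/<OIDC_PROVIDER_EKS_CLUSTER_ID>"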
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    "oidc.eks.us-east-1.amazonaws.com/id/<OIDC_PROVIDER_EKS_CLUSTER_ID>:aud": "sts.amazonaws.com"
                },
                "ForAnyValue:StringEquals": {
                    "oidc.eks.us-east-1.amazonaws.com/id/<OIDC_PROVIDER_EKS_CLUSTER_ID>:sub": [
                        "system:serviceaccount:qwak:kube-deployment-captain-access"
                    ]
                }
            }
        }
    ]
}
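
The bucket rules from the configuration parameters above (either 'Bucket', or both 'Source bucket' and 'Destination bucket', must be set) can be checked on the client side before starting an execution. The following is a minimal sketch. The function `resolve_buckets` is hypothetical and not part of the Qwak SDK, and letting a specific source or destination bucket override the shared one when both are given is an assumption.

```python
def resolve_buckets(bucket=None, source_bucket=None, destination_bucket=None):
    """Return the (source, destination) bucket pair, or raise if one is missing.

    Hypothetical helper. Assumption: a specific source/destination bucket
    overrides the shared `bucket` value when both are provided.
    """
    source = source_bucket or bucket
    destination = destination_bucket or bucket
    if not source:
        raise ValueError("Set either 'Bucket' or 'Source bucket'.")
    if not destination:
        raise ValueError("Set either 'Bucket' or 'Destination bucket'.")
    return source, destination
```

For example, `resolve_buckets(bucket="my-bucket")` returns the same bucket for both roles, while omitting all three arguments raises a `ValueError`.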

Batch Execution

To start an execution from the SDK, use the following command:

from qwak.models.executions.logic.client import BatchJobManagerClient
from qwak.models.executions.logic.results import StartExecutionResult
from qwak.models.executions.logic.executions_config import ExecutionConfig

# The execution configuration
execution_spec = ExecutionConfig.Execution(
    model_id=<model-id>,
    source_bucket=<source-bucket-name>,
    destination_bucket=<destination-bucket-name>,
    source_folder=<source-folder-path>,
    destination_folder=<destination-folder-path>,
    input_file_type=<input-file-type>,
    output_file_type=<output-file-type>,
    access_token_name=<access_token_name>,
    access_secret_name=<access-secret-name>,
    job_timeout=<job-timeout>,
    file_timeout=<file-timeout>,
    parameters=<dictionary-of-user-provided-parameters>
)

resources_config = ExecutionConfig.Resources(
    pods=<number-of-pods>,
    cpus=<number-of-cpus>,
    memory=<memory-amount>,
    gpu_type=<gpu_type>,
    gpus=<number-of-gpus>,
)

execution_config = ExecutionConfig(execution=execution_spec, resources=resources_config)
batch_job_manager_client = BatchJobManagerClient()

execution_result: StartExecutionResult = batch_job_manager_client.start_execution(execution_config)
execution_id = execution_result.execution_id

To start an execution from the CLI, use the following command:

qwak models execution start \
    --model-id <model-id> \
    --source-bucket <source-bucket-name> \
    --source-folder <source-folder-path> \
    --destination-bucket <destination-bucket-name> \
    --destination-folder <destination-folder-path> \
    --input-file-type <input-file-type> \
    --output-file-type <output-file-type> \
    --access-token-name <buckets-access-token-secret-name> \
    --access-secret-name <buckets-access-secret-secret-name> \
    --job-timeout <entire-job-timeout-in-seconds> \
    --file-timeout <single-file-timeout-in-seconds> \
    --pods <pods-count> \
    --cpus <cpus-fraction> \
    --memory <memory-size> \
    --build-id <alternate-build-id>

Here is a simplified version that relies on the default values:

from qwak.models.executions.logic.client import BatchJobManagerClient
from qwak.models.executions.logic.results import StartExecutionResult
from qwak.models.executions.logic.executions_config import ExecutionConfig

# The execution configuration
execution_spec = ExecutionConfig.Execution(
    model_id=<model-id>,
    bucket=<bucket-name>,  # used for both source and destination
    source_folder=<source-folder-path>,
    destination_folder=<destination-folder-path>,
    access_token_name=<access_token_name>,
    access_secret_name=<access-secret-name>
)

execution_config = ExecutionConfig(execution=execution_spec)
batch_job_manager_client = BatchJobManagerClient()

execution_result: StartExecutionResult = batch_job_manager_client.start_execution(execution_config)
execution_id = execution_result.execution_id

The equivalent CLI command:

qwak models execution start \
    --model-id <model-id> \
    --bucket <bucket-name> \
    --source-folder <source-folder-path> \
    --destination-folder <destination-folder-path> \
    --access-token-name <buckets-access-token-secret-name> \
    --access-secret-name <buckets-access-secret-secret-name>

Batch Job Parallelism

Note that every file in the given input path is considered a task. A task is the main unit of parallelism for a batch execution job.

For example, if five pods are requested for the batch execution (whether specified by the deployment or in the batch execution job itself) and 10 files need to be processed, five of the 10 files (or tasks) that comprise the batch job are executed in parallel.

For this reason, there is no point in requesting more pods than the number of files which need to be processed.
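
The arithmetic behind this can be sketched as follows. Both function names are illustrative. The round count assumes that each pod works on one task at a time and that all tasks take roughly the same time, which real workloads may not satisfy.

```python
import math


def effective_parallelism(pods, files):
    """Number of tasks that can run at the same time: never more than the file count."""
    if pods < 1 or files < 0:
        raise ValueError("pods must be >= 1 and files must be >= 0")
    return min(pods, files)


def minimum_rounds(pods, files):
    """Rough number of sequential rounds, assuming equally sized tasks."""
    if pods < 1 or files < 0:
        raise ValueError("pods must be >= 1 and files must be >= 0")
    return math.ceil(files / pods)


print(effective_parallelism(5, 10), minimum_rounds(5, 10))    # 5 2
print(effective_parallelism(20, 10), minimum_rounds(20, 10))  # 10 1
```

In the second call, 10 of the 20 requested pods have no task to process, which is why requesting more pods than files brings no benefit.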

📘

Concurrent Executions

You can run multiple executions concurrently. The only limitation is that you cannot run executions with identical values for the following parameters:

  1. Model ID
  2. Build ID
  3. Source Bucket
  4. Source Folder
  5. Destination Bucket
  6. Destination Folder

The assumption is that running two executions with the same parameters is redundant.
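
A client-side pre-check for this rule could look like the sketch below. The `ExecutionKey` class and `conflicts_with_running` function are hypothetical and not part of the Qwak SDK. They only compare the six listed values, and how the platform itself detects duplicates is not described here.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ExecutionKey:
    """The six values that must not all match between concurrent executions."""
    model_id: str
    build_id: Optional[str]
    source_bucket: str
    source_folder: str
    destination_bucket: str
    destination_folder: str


def conflicts_with_running(candidate: ExecutionKey, running: Iterable[ExecutionKey]) -> bool:
    """Return True if a running execution has identical values for all six fields."""
    return candidate in set(running)
```

Changing any one of the six values, for example the destination folder, is enough for two executions to be treated as distinct under this check.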


What's Next

Next, learn about the options for managing and monitoring an execution.