Traffic Splitting with Audiences and Variations

Deploy models with traffic splitting and A/B testing using Audiences and Variations

JFrog ML lets you split traffic and run A/B tests using Audiences and Variations. You can segment incoming requests, compare different model variations, and use the results to make data-driven decisions that improve model performance.

Segmenting traffic with Audiences

Audiences categorize traffic based on predefined conditions on client-side metadata. They are configured at the JFrog ML Account level and apply to all real-time models within the Account.

By default, requests that do not meet any specific Audience criteria are routed to the fallback Audience.

🚧

Note: Audience details must be included in the metadata or header fields of every inference request.

Creating an Audience

Audiences are created using YAML files and managed via the Qwak CLI.

Below is an example of an Audience that categorizes users from New York aged between 10 and 30.

The first condition requires the location field to equal new-york. The match must be exact; the operand is not treated as a regular expression.

The second condition is binary: the value of the age key must fall within the range defined by first_operand and second_operand.

api_version: v1
spec:
  audiences:
    - name: New-York
      description: Users from New York aged 10-30
      conditions:
        unary:
          - key: location
            operator: UNARY_OPERATOR_TYPE_EXACT_MATCH
            operand: new-york
        binary:
          - key: age
            operator: BINARY_OPERATOR_TYPE_RANGE_MATCH
            first_operand: 10
            second_operand: 30

Each audience definition must include:

  • Name: A display name of the audience used on the JFrog ML platform UI.
  • Description: A general description of the audience.
  • Conditions: A list of conditions combined with a logical AND, so a request must satisfy all of them to match.

Defining Conditions

The following condition types are available when creating an audience:

Unary operators

  1. UNARY_OPERATOR_TYPE_EXACT_MATCH: Matches an exact value.
  2. UNARY_OPERATOR_TYPE_SAFE_REGEX_MATCH: Matches using a regular expression.
  3. UNARY_OPERATOR_TYPE_PRESENT_MATCH: Checks for the presence of a key.
  4. UNARY_OPERATOR_TYPE_PREFIX_MATCH: Matches values that start with a specified prefix.
  5. UNARY_OPERATOR_TYPE_SUFFIX_MATCH: Matches values that end with a specified suffix.
  6. UNARY_OPERATOR_TYPE_CONTAINS_MATCH: Matches values containing a specified substring.

Binary operator

  1. BINARY_OPERATOR_TYPE_RANGE_MATCH: Checks if a value falls within a specified range.
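
To make the matching rules concrete, the following is a minimal Python sketch of how an Audience definition could be evaluated against request metadata. It is an illustration only, not JFrog ML's implementation. The function names matches_audience and resolve_audience are hypothetical, and several behaviors are assumptions: range bounds are treated as inclusive, the regex must match the whole value, and audiences are checked in list order.

```python
import re

# Illustrative versions of the operators listed above; not JFrog ML's code.
UNARY_OPERATORS = {
    "UNARY_OPERATOR_TYPE_EXACT_MATCH": lambda value, operand: value == operand,
    # Assumption: the pattern must match the entire value.
    "UNARY_OPERATOR_TYPE_SAFE_REGEX_MATCH": lambda value, operand: re.fullmatch(operand, value) is not None,
    # Presence is checked before the operator runs, so this always passes.
    "UNARY_OPERATOR_TYPE_PRESENT_MATCH": lambda value, operand: True,
    "UNARY_OPERATOR_TYPE_PREFIX_MATCH": lambda value, operand: value.startswith(operand),
    "UNARY_OPERATOR_TYPE_SUFFIX_MATCH": lambda value, operand: value.endswith(operand),
    "UNARY_OPERATOR_TYPE_CONTAINS_MATCH": lambda value, operand: operand in value,
}


def matches_audience(conditions, metadata):
    """Return True only if every condition holds (conditions are ANDed)."""
    for cond in conditions.get("unary", []):
        if cond["key"] not in metadata:
            return False
        value = str(metadata[cond["key"]])
        check = UNARY_OPERATORS[cond["operator"]]
        if not check(value, str(cond.get("operand", ""))):
            return False
    for cond in conditions.get("binary", []):
        if cond["key"] not in metadata:
            return False
        try:
            value = float(metadata[cond["key"]])
        except (TypeError, ValueError):
            return False
        # Assumption: both range bounds are inclusive.
        if not cond["first_operand"] <= value <= cond["second_operand"]:
            return False
    return True


def resolve_audience(audiences, metadata, fallback="fallback"):
    """Return the first matching audience name, or the fallback name."""
    for audience in audiences:
        if matches_audience(audience.get("conditions", {}), metadata):
            return audience["name"]
    return fallback


new_york = {
    "name": "New-York",
    "conditions": {
        "unary": [{"key": "location",
                   "operator": "UNARY_OPERATOR_TYPE_EXACT_MATCH",
                   "operand": "new-york"}],
        "binary": [{"key": "age",
                    "operator": "BINARY_OPERATOR_TYPE_RANGE_MATCH",
                    "first_operand": 10,
                    "second_operand": 30}],
    },
}

print(resolve_audience([new_york], {"location": "new-york", "age": 25}))
print(resolve_audience([new_york], {"location": "boston", "age": 25}))
```

The first call matches both conditions and returns the audience name; the second fails the location condition and falls through to the fallback. An audience with no conditions matches every request in this sketch.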

Registering and managing Audiences

Register an audience from a config file

Apply an audience configuration using a CLI command:

qwak audiences create -f <path_of_audience_config.yaml>

List all existing audiences

Returns a list of audience ids and names.

qwak audiences list

Retrieve the details of a specific audience

To find the audience ID, first run the qwak audiences list command, which returns all audiences with their IDs.

qwak audiences get --audience-id <audience_id>

📘

Tip

For a full list of available options and parameters, use the command qwak audiences --help.


Directing traffic to Audiences

In the following examples, we demonstrate how to route requests to specific Audiences.

Using the Python Client

from qwak_inference import RealTimeClient

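model_id = "<your_model_id>"
feature_vector = <...>  # replace with your model's input
# Audience metadata: keys must match the keys used in the Audience conditions
metadata = {"location": "new-york", "age": 25}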

client = RealTimeClient(model_id=model_id)

client.predict(feature_vector, metadata=metadata)

Using REST calls

curl --location --request POST 'https://models.<your_env>.qwak.ai/v1/1_hour_model/predict' \
--header 'Content-Type: application/json' \
--header 'location: new-york' \
--header 'age: 25' \
--header 'Authorization: Bearer <Auth Token>' \
--data '{
  "columns": ["feature_1", "feature_2"],
  "index": [0],
  "data": [[0.0, 0.0]]
}'
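
The same REST call can be sketched in Python with only the standard library. This is a hedged illustration, not an official client: the helper build_predict_request and the placeholder host are hypothetical, and the request is built but not sent. It shows that each audience key travels as its own HTTP header next to the usual content-type and authorization headers.

```python
import json
import urllib.request


def build_predict_request(base_url, model_id, token, audience_metadata, payload):
    """Build (but do not send) a POST request carrying audience metadata as headers."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {token}",
    }
    # HTTP header values are strings, so numeric metadata such as age is converted.
    headers.update({key: str(value) for key, value in audience_metadata.items()})
    return urllib.request.Request(
        url=f"{base_url}/v1/{model_id}/predict",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_predict_request(
    base_url="https://models.example.invalid",  # placeholder host, replace with your environment
    model_id="1_hour_model",
    token="<Auth Token>",
    audience_metadata={"location": "new-york", "age": 25},
    payload={"columns": ["feature_1", "feature_2"], "index": [0], "data": [[0.0, 0.0]]},
)
# Sending is left to the caller, for example urllib.request.urlopen(request).
```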

🚧

Note: Audience information is not stored in the JFrog ML Analytics Lake, so you cannot track which Audience a request matched.


Splitting traffic with Variations

Variations are used for traffic splitting, operating above the Audience level. They have the following key characteristics:

  • Each Variation is linked to a specific model Build.
  • Variations always draw traffic from an Audience, so creating an Audience is required before using Variations.
  • They can allocate traffic by percentage from various audiences or between different Builds, enabling effective A/B testing.
  • Variations can be designated as Shadow Variations, which replicate a percentage of live traffic to other Builds for testing purposes.
  • Each Variation can direct traffic to only one deployed Build at a time.
  • The Variation assigned to a request is recorded in Analytics under the column variation_name.

Assigning traffic to Variations

When deploying a model with multiple variations, each audience, including the fallback audience, is assigned to specific variations, and one variation is designated as the fallback variation.

When distributing traffic from an audience to multiple variations, the total percentage allocated must equal 100% of that audience's traffic.

The only exception is the Shadow Variation, which can receive less than 100% of live traffic.
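
The percentage rule above can be checked before deploying. The sketch below is a hypothetical helper, not part of the Qwak CLI or SDK. It assumes each route is a dictionary with the variation_name, weight, and shadow keys used in the undeploy configuration file later on this page, and that a shadow weight may be any value from 0 to 100.

```python
def validate_routes(routes):
    """Raise ValueError if an audience's routes break the percentage rule."""
    # Only non-shadow (live) variations count toward the 100% total.
    live_total = sum(r["weight"] for r in routes if not r.get("shadow", False))
    if live_total != 100:
        raise ValueError(f"live variation weights sum to {live_total}, expected 100")
    for r in routes:
        # Assumption: a shadow variation may replicate between 0% and 100% of traffic.
        if r.get("shadow", False) and not 0 <= r["weight"] <= 100:
            raise ValueError(f"shadow weight for {r['variation_name']} must be between 0 and 100")


routes = [
    {"variation_name": "variation-a", "weight": 80, "shadow": False},
    {"variation_name": "variation-b", "weight": 20, "shadow": False},
    {"variation_name": "variation-shadow", "weight": 10, "shadow": True},
]
validate_routes(routes)  # passes: 80 + 20 = 100, shadow traffic is counted separately
```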

The currently deployed Variations appear under the Traffic Control section in the model overview.

Fallback Variation: Receives traffic that doesn't match any audience.

Connecting Audiences to Variations: Audiences can be linked to one or more variations, with traffic between variations distributed randomly based on defined percentages.
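
As a rough illustration of percentage-based random distribution, the following sketch picks a variation for each request with probability proportional to its weight. It is an assumption about the general technique (weighted random choice), not a description of how the JFrog ML platform routes traffic internally. The function pick_variation and the variation names are hypothetical.

```python
import random


def pick_variation(routes, rng=random):
    """Pick one live variation at random, weighted by its traffic percentage."""
    # Shadow variations receive replicated traffic, so they are excluded here.
    live = [r for r in routes if not r.get("shadow", False)]
    names = [r["variation_name"] for r in live]
    weights = [r["weight"] for r in live]
    return rng.choices(names, weights=weights, k=1)[0]


routes = [
    {"variation_name": "variation-a", "weight": 80, "shadow": False},
    {"variation_name": "variation-b", "weight": 20, "shadow": False},
]

# Simulate 10,000 requests with a seeded generator for repeatability.
rng = random.Random(0)
counts = {"variation-a": 0, "variation-b": 0}
for _ in range(10_000):
    counts[pick_variation(routes, rng)] += 1
print(counts)
```

Over many requests the observed split approaches the configured percentages, while any single request may land on either variation.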

To modify traffic configuration:

  1. Edit the deployment of the desired build.
  2. Adjust the percentage of traffic for each variation.

Enabling Variations with the default Audience

In some scenarios, you do not need the traffic categorization that Audience conditions provide. For these cases, JFrog ML supports a default Audience, which has no conditions and makes Variations available for all requests.

Below is an example of configuring the default Audience to enable Variations:

api_version: v1
spec:
  audiences:
    - name: default
      description: All traffic

To register this Audience configuration, please refer to the Audiences section above.

Undeploying a multi-Variation Realtime Model

When more than one build is deployed, undeploying a build requires you to specify how traffic is split among the remaining builds.

Undeploying models using the UI

To undeploy a variation:

  1. In the build view, click the options icon next to a deployed build and select Undeploy.
  2. Redistribute the traffic between the remaining variations and then click Undeploy.

Undeploying models using the CLI

To undeploy a model with variations from the CLI, run the following command:

qwak models undeploy \
    --model-id <model-id> \
    --variation-name <variation-name> \
    --from-file <config-file-path>

With a configuration file as follows:

realtime:
  variation_name: <The variation name being undeployed>
  audiences:
    - id: <remaining audience id>
      routes:
        - variation_name: <other existing variation>
          weight: 80
          shadow: false
        - variation_name: <other existing variation 2>
          weight: 20
          shadow: false
  fallback_variation: <other existing variation>

If you use --variation-name in the CLI command, you don't have to pass the variation_name in the configuration file.

When undeploying from two variations down to one, you don't have to pass any variation-related data; all traffic is routed to the remaining variation.

📘

Note

By default the undeploy command is executed asynchronously, which means that the command does not wait for the undeployment to complete.

To execute the command synchronously and wait for the undeployment to complete, add the --sync flag.
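
If you script undeployments, the command above can be assembled in Python. The helper below is a hypothetical convenience wrapper, not part of the Qwak SDK. It only combines the flags shown on this page (--model-id, --variation-name, --from-file, --sync) and does not run anything itself.

```python
import shlex


def build_undeploy_command(model_id, variation_name=None, config_path=None, sync=False):
    """Assemble the argument list for `qwak models undeploy`."""
    command = ["qwak", "models", "undeploy", "--model-id", model_id]
    if variation_name is not None:
        command += ["--variation-name", variation_name]
    if config_path is not None:
        command += ["--from-file", config_path]
    if sync:
        # Wait for the undeployment to complete instead of returning immediately.
        command.append("--sync")
    return command


command = build_undeploy_command(
    model_id="my_model",            # placeholder model ID
    variation_name="variation-b",   # placeholder variation name
    config_path="undeploy.yaml",    # placeholder config file path
    sync=True,
)
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run(command, check=True) on a machine where the Qwak CLI is installed and configured.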