Managing Dependencies

JFrog ML supports a variety of Python frameworks to manage model dependencies.

Supported Python Versions

When building and managing your Python projects, different tools have varying levels of support for Python versions. Below is a summary of the supported Python versions for each tool:

  1. Poetry supports Python versions: 3.8 - 3.11
  2. Conda supports Python versions: 3.8 - 3.11
  3. requirements.txt (pip) supports only Python 3.9
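
As a quick local sanity check, the ranges listed above can be encoded in a small lookup table. This is only a sketch: the `SUPPORTED_PYTHON` table and the `is_supported` helper are hypothetical names made up for illustration, not part of the JFrog ML SDK, and the ranges are simply copied from the list above.

```python
import sys
from typing import Dict, Tuple

Version = Tuple[int, int]

# Ranges copied from the list above, inclusive on both ends (hypothetical table).
SUPPORTED_PYTHON: Dict[str, Tuple[Version, Version]] = {
    "poetry": ((3, 8), (3, 11)),
    "conda": ((3, 8), (3, 11)),
    "pip": ((3, 9), (3, 9)),
}


def is_supported(tool: str, version: Version) -> bool:
    """Return True if (major, minor) falls inside the tool's listed range."""
    lowest, highest = SUPPORTED_PYTHON[tool]
    return lowest <= version <= highest


if __name__ == "__main__":
    current = (sys.version_info.major, sys.version_info.minor)
    for tool in SUPPORTED_PYTHON:
        status = "supported" if is_supported(tool, current) else "not supported"
        print(f"{tool}: Python {current[0]}.{current[1]} is {status}")
```

Running the script prints one line per tool for the interpreter you are currently using.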

Using Poetry with JFrogML

🚧

JFrogML uses Poetry version 1.8.3.

JFrog ML supports poetry.lock files as long as they're under the same scope as the pyproject.toml file.


Model Directory Structure:

qwak_based_model/
├── main/
│   ├── pyproject.toml
│   └── poetry.lock
└── tests/

Poetry uses both pyproject.toml and poetry.lock when it runs the poetry install command.
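
Before triggering a build, it can help to confirm that every poetry.lock sits next to a pyproject.toml. The sketch below assumes that "same scope" means "same directory", as in the structure shown above; `misplaced_lock_files` is a hypothetical helper, not a JFrog ML API.

```python
from pathlib import Path
from typing import List


def misplaced_lock_files(root: Path) -> List[Path]:
    """Return poetry.lock files whose directory has no pyproject.toml beside them."""
    return sorted(
        lock
        for lock in root.rglob("poetry.lock")
        if not (lock.parent / "pyproject.toml").is_file()
    )


if __name__ == "__main__":
    # Check the current working directory; an empty result means nothing is misplaced.
    for lock in misplaced_lock_files(Path(".")):
        print(f"No pyproject.toml next to {lock}")
```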


Example Project Setup

[tool.poetry]
name = "example-project"
version = "0.1.0"
description = "Example project for production and development"
authors = ["Your Name <[email protected]>"]

[tool.poetry.dependencies]
python = ">=3.11,<3.12"
scipy = "^1.7"
scikit-learn = "^0.24"
catboost = "^1.0"

[tool.poetry.dev-dependencies]
qwak-sdk = "*"

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"


When specifying dependencies in Poetry, using * as the version for qwak-sdk instructs Poetry to install the latest available version of the qwak-sdk package. This approach ensures that your project always utilizes the most recent features and fixes. However, it's important to consider the implications of automatically adopting new versions, as they may introduce breaking changes or compatibility issues. For more controlled dependency management, consider the following alternatives:

  • qwak-sdk = "^0.5.61": This specifies that Poetry should install a version of qwak-sdk that is at least 0.5.61 but lower than 0.6.0. Because the major version is 0, the caret operator holds the minor version fixed, so only patch-level updates are accepted (for a 1.x release such as ^1.2.3, anything below 2.0.0 would be allowed). This approach balances the benefits of receiving fixes with the safety of avoiding larger changes that could break your project.
  • qwak-sdk = "0.5.61": This pins qwak-sdk to a specific version, ensuring that your project will always use version 0.5.61 of the SDK. This is the safest option if your project depends on the specific behavior of this version, as it eliminates the risk of unexpected changes due to updates. However, it also means that you will not automatically benefit from new features or fixes introduced in later versions.
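
Poetry's caret operator allows updates that leave the leftmost non-zero version component unchanged. The sketch below expands a caret constraint into an explicit range to make that rule concrete. `caret_range` is a hypothetical helper written for illustration; it handles only plain MAJOR.MINOR.PATCH versions and is not how Poetry itself is implemented.

```python
def caret_range(constraint: str) -> str:
    """Expand a caret constraint such as ^0.5.61 into an explicit version range."""
    parts = constraint.lstrip("^").split(".")
    if len(parts) != 3:
        raise ValueError("this sketch only handles MAJOR.MINOR.PATCH versions")
    major, minor, patch = (int(p) for p in parts)

    # The leftmost non-zero component is held fixed; the upper bound bumps it by one.
    if major > 0:
        upper = f"{major + 1}.0.0"
    elif minor > 0:
        upper = f"0.{minor + 1}.0"
    else:
        upper = f"0.0.{patch + 1}"
    return f">={major}.{minor}.{patch},<{upper}"


if __name__ == "__main__":
    for example in ("^0.5.61", "^1.2.3", "^0.0.3"):
        print(example, "->", caret_range(example))
```

For a 0.x package such as qwak-sdk, this means a caret constraint only admits patch releases within the same minor version.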

📘

The qwak-sdk dependency is included only in the dev section, as it's needed for local development but not for remote builds. When you run the qwak models build command, the SDK version you used locally will be automatically included in the remote environment.



Using Conda with JFrogML

🚧

JFrogML uses Conda version 24.7.1.


Model Directory Structure

qwak_based_model/
├── main/
│   ├── conda.yml
│   └── ...
└── tests/

The conda.yml file can be placed at the root level alongside main or within it; both structures work equally well.

Example Project Setup

To get started, here's a basic conda.yml setup:

name: example_conda_model
channels:
  - defaults
  - conda-forge
dependencies:
  - python=3.11
  - scipy
  - scikit-learn
  - catboost
  - pip:
      - # additional pip dependencies

📘

There's no need to manually add qwak-sdk to the environment. JFrogML's build process includes it automatically based on your local version.


.qwakignore file

Occasionally, we may want to exclude a file from the JFrog ML build but keep it in the repository with the model code. In such cases, we should add the .qwakignore file to the root directory of our project.

In the file, we define the patterns to match files to exclude from the model build.

For example, suppose we have the following file structure:

.qwakignore
main/
    __init__.py
    model.py
    README.md
tests/
    test_model.py
research/
    paper_a.pdf
    paper_b.pdf

If we want to exclude the entire research directory and the README.md file from the build, our .qwakignore file may contain:

research
README.md
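
To preview which files such patterns would leave in the build, the matching can be approximated locally. The sketch below assumes a pattern excludes a path when it matches any single path component using shell-style wildcards; the exact matching rules JFrog ML applies may differ, and `is_excluded` is a hypothetical helper.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import List


def is_excluded(path: str, patterns: List[str]) -> bool:
    """Assumed rule: exclude a path if any of its components matches any pattern."""
    parts = PurePosixPath(path).parts
    return any(fnmatchcase(part, pattern) for part in parts for pattern in patterns)


# Patterns and files taken from the example above.
patterns = ["research", "README.md"]
files = [
    "main/__init__.py",
    "main/model.py",
    "main/README.md",
    "tests/test_model.py",
    "research/paper_a.pdf",
    "research/paper_b.pdf",
]

kept = [f for f in files if not is_excluded(f, patterns)]
print(kept)
```

Under this assumed rule, only main/__init__.py, main/model.py, and tests/test_model.py remain.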

📘

Hidden files

By default, JFrog ML disregards hidden files. Hidden files are files or directories whose names start with a dot (.) in Unix-like operating systems, or they may have the "Hidden" attribute set in Windows. These files are typically used to store configuration data or hold temporary information.

Suppose you have a directory with files and subdirectories, including a hidden file named .config_file. JFrog ML, following its default behavior, will exclude this file from processing when triggering a remote build.
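
A rough way to see which paths fall under the Unix-style definition of "hidden" is to look for a leading dot in any path component. This is a sketch of that rule only: it does not cover the Windows "Hidden" attribute, and `is_hidden` is a hypothetical helper rather than the check JFrog ML performs.

```python
from pathlib import PurePosixPath


def is_hidden(path: str) -> bool:
    """True if any path component starts with a dot (ignoring '.' and '..')."""
    return any(
        part.startswith(".") and part not in (".", "..")
        for part in PurePosixPath(path).parts
    )


if __name__ == "__main__":
    for candidate in (".config_file", ".git/config", "main/model.py"):
        print(candidate, "hidden" if is_hidden(candidate) else "visible")
```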



Incorporating Python Dependencies from .whl Files

Qwak facilitates the use of Python dependencies packaged as .whl files through requirements.txt and conda.yaml for managing dependencies. It's important to note that Poetry does not support dependencies from .whl files.

  1. Preparing Your .whl Files:

First, make sure your .whl files are either uploaded with your model code or fetched from external storage. For instructions on uploading additional dependencies, refer to the Qwak CLI documentation (qwak models build --help). In the example directory structure below, main is uploaded by default, and the dep directory, which contains the pandas dependency as a .whl file, is included by passing the --dependency-required-folders dep option to the Qwak command.

The wheel file must be uploaded as part of an additional dependencies folder, not as part of the main model folder.

/qwak/model_dir/
.
├── main                   # Main directory containing core code
│   ├── __init__.py        # An empty file that indicates this directory is a Python package
│   ├── model.py           # Defines the Credit Risk Model
│   └── conda.yaml         # Conda environment configuration
│
├── dep                    # Additional dependency directory added with --dependency-required-folders
│   └── pandas-2.2.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
│
└── tests                  # Empty directory reserved for future tests
    └── ...                # Future tests

  2. Configuring Dependency Management Files:

Conda: Include the .whl file in your conda.yaml as follows:

name: test_model
channels:
  - defaults
  - conda-forge
dependencies:
  - python=3.9
  - pip:
    - "/qwak/model_dir/dep/pandas-2.2.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"

Poetry: Reference the .whl file as a path dependency in your pyproject.toml:

[tool.poetry]
name = "example-project"
version = "0.1.0"
description = "Example project for production and development"
authors = ["Your Name <[email protected]>"]

[tool.poetry.dependencies]
python = "^3.9"
scipy = "^1.7"
scikit-learn = "^0.24"
catboost = "^1.0"
pandas = { path = 'dep/pandas-2.2.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl' }

[tool.poetry.dev-dependencies]
qwak-sdk = "*"

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"


Requirements.txt: Directly reference the .whl file path relative to the requirements file location:

# requirements file located in main model folder
./../deps/wheel_test-0.1-py3-none-any.whl

# requirements file located in model dir
./deps/wheel_test-0.1-py3-none-any.whl
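
Wheel filenames encode the Python, ABI, and platform tags the wheel was built for, so a cp39 wheel pairs naturally with a python=3.9 environment. The sketch below splits a wheel filename into its standard fields and does a simplified check of the Python tag only. `parse_wheel_filename` and `matches_python` are hypothetical helpers, and the check ignores the ABI and platform tags, so treat it as a rough guard rather than a full compatibility test.

```python
from typing import NamedTuple, Optional


class WheelInfo(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelInfo:
    """Split name-version[-build]-python-abi-platform.whl into its fields."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, py, abi, plat = parts
        build = None
    elif len(parts) == 6:
        name, version, build, py, abi, plat = parts
    else:
        raise ValueError(f"unexpected wheel filename layout: {filename}")
    return WheelInfo(name, version, build, py, abi, plat)


def matches_python(info: WheelInfo, major: int, minor: int) -> bool:
    """Simplified check: does the Python tag name this interpreter version?"""
    tags = info.python_tag.split(".")  # compressed tag sets look like "py2.py3"
    wanted = {f"cp{major}{minor}", f"py{major}{minor}", f"py{major}"}
    return any(tag in wanted for tag in tags)


if __name__ == "__main__":
    name = "pandas-2.2.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
    info = parse_wheel_filename(name)
    print(info.distribution, info.version, info.python_tag)
    print("Usable on Python 3.9:", matches_python(info, 3, 9))
```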

  3. Using the Dependency in Your Code:

Once the dependency is properly configured, you can import and use it in your Python code as usual:

import pandas as pd