Using PyTorch with CUDA
When working with PyTorch on GPU instances, make sure the PyTorch build you install is compatible with the CUDA drivers installed on Qwak instances. Otherwise PyTorch may not be able to use the GPU resources.
Qwak GPU instances are currently provisioned with CUDA version 12.1. The instructions below show how to install the latest versions of Torch compatible with that CUDA version.
Installing Compatible PyTorch
To align PyTorch with the CUDA version on your instance, use the following index URL when adding the PyTorch libraries to your dependencies configuration file, whether it's Conda, Pip (requirements.txt), or Poetry.
In Workspaces
Use this command in your workspace environment:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
In Model Builds
For requirements.txt, your file should look like this:
scipy
scikit-learn
pandas
--extra-index-url https://download.pytorch.org/whl/cu121
torch
torchvision
torchaudio
For Conda environments, here's an example configuration:
name: your-conda-environment
channels:
  - defaults
  - conda-forge
  - huggingface
dependencies:
  - python=3.9
  - pip:
    - --extra-index-url https://download.pytorch.org/whl/cu121
    - torch
    - torchvision
    - torchaudio
    - transformers
    - accelerate
    - scikit-learn
    - pandas
Please note that the conda.yaml above is just an example; not all of the dependencies are required.
Verifying the installation
After installation, confirm that PyTorch is using the GPU. Add the following code snippet to your QwakModel. For training models, insert it at the start of the build() method. If loading a pre-trained model, place it in the initialize_model() method.
import torch
print("Torch version:", torch.__version__)
# Automatically use CUDA if available, else use CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"The PyTorch device used by the model is {device}\n")
This should output cuda as the device in your Qwak model build logs, indicating that PyTorch is correctly set up to use the GPU.
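If the device line alone is not enough to go on, a slightly fuller report can help. The following is a minimal sketch, assuming you want the build logs to also show the CUDA version PyTorch was built against and the name of the first visible GPU. The helper name describe_torch_device is hypothetical and not part of Qwak or PyTorch; it relies only on torch.version.cuda, torch.cuda.is_available(), torch.cuda.device_count() and torch.cuda.get_device_name().

```python
def describe_torch_device():
    """Return a dict describing the PyTorch build and the device it would use.

    Hypothetical helper for illustration; not part of Qwak or PyTorch.
    """
    try:
        import torch
    except ImportError:
        # torch is missing from the environment entirely
        return {"torch_installed": False}

    cuda_ok = torch.cuda.is_available()
    info = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version the wheel was built with; None for CPU-only builds
        "built_with_cuda": torch.version.cuda,
        "device": "cuda" if cuda_ok else "cpu",
        "gpu_count": torch.cuda.device_count() if cuda_ok else 0,
    }
    if cuda_ok:
        info["gpu_name"] = torch.cuda.get_device_name(0)
    return info


# Print one "key: value" line per entry so it is easy to find in the logs
for key, value in describe_torch_device().items():
    print(f"{key}: {value}")
```

A built_with_cuda value of None together with device: cpu usually points to a CPU-only wheel, which suggests the index URL was not picked up during dependency resolution.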
Troubleshooting
If your logs show cpu instead of cuda as the device, check the Code tab within your Build page. Ensure that the dependency file is correctly recognised by the model and that the requirements.lock file reflects the appropriate versions for the Torch libraries.
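As a rough aid for reviewing the lock file, the sketch below scans lock-file text for Torch pins that do not carry the cu121 tag. It rests on two assumptions that may not hold for your build: that requirements.lock uses pip-style name==version lines, and that wheels resolved from the PyTorch index are recorded with a +cu121 local version suffix. The function name find_mismatched_torch_pins and the sample contents are hypothetical.

```python
import re

TORCH_PACKAGES = {"torch", "torchvision", "torchaudio"}
EXPECTED_TAG = "cu121"  # assumed to match the index URL used in this guide


def find_mismatched_torch_pins(lock_text, expected_tag=EXPECTED_TAG):
    """Return (package, version) pairs whose version lacks the expected CUDA tag."""
    mismatched = []
    for line in lock_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        # Match "name==version", stopping at whitespace or an environment marker
        match = re.match(r"^([A-Za-z0-9_.\-]+)==([^\s;]+)", line)
        if not match:
            continue
        name, version = match.group(1).lower(), match.group(2)
        if name in TORCH_PACKAGES and not version.endswith("+" + expected_tag):
            mismatched.append((name, version))
    return mismatched


# Hypothetical lock-file contents, for illustration only
sample_lock = """\
pandas==2.1.4
torch==2.1.2+cu121
torchvision==0.16.2
torchaudio==2.1.2+cpu
"""
print(find_mismatched_torch_pins(sample_lock))
# → [('torchvision', '0.16.2'), ('torchaudio', '2.1.2+cpu')]
```

Any package reported here was probably resolved from an index other than the cu121 one, so the placement of the --extra-index-url line in the dependency file is a reasonable first thing to check.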