The ultimate guide on installing PyTorch with CUDA support in all possible ways
→ Using Pip, Conda, Poetry, Docker, or directly on the system
We all know that one of the most annoying things in Deep Learning is installing PyTorch with CUDA support.
Nowadays, installing PyTorch & CUDA using pip or conda is relatively easy. Unfortunately, many production projects require you to use Poetry or Docker. That is where things get more complicated.
That is why I am writing this article as a practical living document showing how to install these 2 beasts in all possible ways.
This tutorial is a living document that I plan to use to install PyTorch & CUDA myself. Thus, I will update this doc whenever I test something I like. Also, in the comments section, feel free to add any other methods you use to install torch & CUDA or troubleshoot potential issues. Let’s create the go-to document that makes installing PyTorch & CUDA a piece of cake!
Important observation: I am mainly using Ubuntu. Thus, I will use concrete examples based on it. But this article can easily be extrapolated to other operating systems.
Another important observation: I have used Python 3.10, torch 2.0.1 and CUDA 11.8 in most examples. Feel free to change it with your required versions. You can find them on PyTorch's main page.
Table of Contents
- Useful Concepts
- System
- Pip
- Conda (or Mamba)
- Poetry
- Docker
- Test out the installation
- Troubleshooting
#1. Useful Concepts
CUDA
- CUDA (Compute Unified Device Architecture) is a parallel computing platform and application programming interface (API) created by NVIDIA. It allows developers to use a CUDA-enabled graphics processing unit (GPU) for general-purpose processing.
- The CUDA Toolkit includes libraries, debugging and optimization tools, a compiler, and a runtime library to deploy your applications on GPU-accelerated platforms.
cuDNN
- cuDNN is a GPU-accelerated library for deep neural networks. It provides highly tuned implementations for standard routines such as forward and backward convolution, pooling, normalization, and activation layers.
- Deep learning frameworks like TensorFlow and PyTorch use cuDNN under the hood to accelerate deep learning computations.
PATH Environment Variable
- Purpose: It is used to specify a set of directories where executable programs are located. When you type a command in the terminal, the shell searches these directories for an executable file with the command's name.
- Usage: It’s common to modify the PATH variable when you install new software and want to make its executables available from the command line without specifying the full path, for example, adding the directory of a newly installed compiler or a script.
- Format: It’s a list of directories separated by a colon (:) on Unix-like systems. For example: “/usr/local/bin:/usr/bin:/bin”.
LD_LIBRARY_PATH Environment Variable
- Purpose: It is used by the dynamic linker/loader to find shared libraries (.so files in Linux). When an executable starts, it might depend on shared libraries for certain functionalities, and LD_LIBRARY_PATH specifies where to look for these libraries.
- Usage: This is particularly useful when you have multiple versions of a library and you want to specify which one should be used by applications or when libraries are installed in non-standard locations.
#2. System
I don’t like this approach, as it is similar to installing global Python dependencies. Eventually, your CUDA versions between multiple projects will clash, and hell will unleash.
In that scenario, you will need to install & manage multiple CUDA versions yourself.
You have to download CUDA version x.x.x from here. Afterward, move all the files to your local directory (e.g., “/usr/local” on Linux-based systems).
Then, within your project (or “.bashrc”, “.zshrc” files), you have to attach to the PATH environment variable the paths to your CUDA binaries (e.g., “/usr/local/cuda/bin”) and to LD_LIBRARY_PATH the paths to your CUDA libraries (e.g., “/usr/local/cuda/lib”).
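For example, here is a minimal sketch of what you would append to your “.bashrc” (assuming the default “/usr/local/cuda” install location; depending on your setup, the library folder may be “lib64” instead of “lib”):
# Make the CUDA tools (e.g., the nvcc compiler) discoverable.
export PATH=/usr/local/cuda/bin:$PATH
# Make the CUDA shared libraries discoverable by the dynamic linker.
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH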
Note: This section is treated more superficially, as I don’t like it and haven’t used it in the past 5 years.
#3. Pip
The prettiest scenario is when you can use pip to install PyTorch. In the latest PyTorch versions, pip will install all necessary CUDA libraries and make them visible to your other Python packages.
pip install torch==2.0.1
…and that’s it!
More concretely, along with torch, it will install all the “nvidia-*” Python packages that contain the necessary CUDA libraries, such as:
nvidia-cuda-runtime-cu11==11.7.99
nvidia-cudnn-cu11==8.5.0.96
nvidia-cufft-cu11==10.9.0.58
...
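Note: The default PyPI wheels of a given torch release are built against one specific CUDA version (for torch 2.0.1, that is CUDA 11.7, as the versions above suggest). If you need a different CUDA build, such as 11.8, you can point pip to PyTorch’s own wheel index:
pip install torch==2.0.1 --index-url https://download.pytorch.org/whl/cu118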
If you are using older PyTorch versions or can’t use pip, check out the Poetry “Manually install all CUDA dependencies” section, where you will see how to install & expose all CUDA dependencies manually (just ignore the Poetry-specific parts).
#4. Conda (or Mamba)
Some people prefer Mamba over Conda. The good news is that Mamba kept the same interface as Conda. Thus, the commands below apply to both — just replace conda with mamba.
First, create a new virtual environment and activate it:
conda create --name py310-cu118 python=3.10
conda activate py310-cu118
Now install PyTorch together with its CUDA dependencies (the command below matches torch 2.0.1 with CUDA 11.8; older PyTorch releases used a “cudatoolkit=x.x” argument instead of “pytorch-cuda”):
conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.8 -c pytorch -c nvidia
…and that’s it!
In many cases, installing PyTorch this way is sufficient, as conda resolves all the necessary CUDA components, including the cuDNN libraries. You might only need to install “cudnn” explicitly for specific version requirements or advanced use cases. You can do it as follows:
conda install cudnn=8.6.0
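As a quick sanity check that the libraries actually landed inside the environment, you can list them (the exact file names may differ between versions):
ls $CONDA_PREFIX/lib | grep -i cudnn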
Update LD_LIBRARY_PATH manually [optional]
If, for any reason, you are a masochist like me and you must update the LD_LIBRARY_PATH environment variable manually with the paths pointing to CUDA, this is your lucky day.
First, install a Python helper package:
pip install nvidia-cudnn-cu11==8.6.0.163
IMPORTANT: Run the commands below while your shell has your required conda virtual environment activated: conda activate ...
Then, to update the LD_LIBRARY_PATH, run the following:
mkdir -p $CONDA_PREFIX/etc/conda/activate.d
echo 'CUDNN_PATH=$(dirname $(python -c "import nvidia.cudnn;print(nvidia.cudnn.__file__)"))' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
echo 'export OLD_LD_LIBRARY_PATH=$LD_LIBRARY_PATH' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
echo 'export LD_LIBRARY_PATH=$CONDA_PREFIX/lib/:$CUDNN_PATH/lib:$LD_LIBRARY_PATH' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
Note: The single quotes matter. They prevent your current shell from expanding the variables, so they are evaluated fresh each time the environment is activated.
Finally, to restore your previous LD_LIBRARY_PATH when you deactivate the conda environment, run the following:
mkdir -p $CONDA_PREFIX/etc/conda/deactivate.d
echo 'export LD_LIBRARY_PATH=$OLD_LD_LIBRARY_PATH' >> $CONDA_PREFIX/etc/conda/deactivate.d/env_vars.sh
echo 'unset OLD_LD_LIBRARY_PATH' >> $CONDA_PREFIX/etc/conda/deactivate.d/env_vars.sh
Note: If you need access to other CUDA libraries than CUDNN, check out Poetry‘s “Manually install all CUDA dependencies” section.
#5. Poetry
Poetry is a spectacular dependency manager for Python. Unfortunately, things get a little trickier when installing PyTorch with CUDA support through Poetry.
The good news is that once you get it, it is straightforward.
The first step is to point Poetry to the right version of Python (e.g., Python 3.10):
poetry env use $(which python3.10)
Otherwise, Poetry will use, by default, your system’s Python version.
Note: You can have multiple Python versions directly on your system or use pyenv or conda to manage multiple Python versions on your system.
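You can double-check which interpreter the Poetry environment ended up using with:
poetry env info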
As a “supplemental” Poetry source
First, we explicitly add PyPi as the default repository:
poetry source add --priority=default PyPi
Then, we add “https://download.pytorch.org/whl/cu118” as an optional repository under the name of “torch” (it can have any other name):
poetry source add --priority=supplemental torch https://download.pytorch.org/whl/cu118
By doing so, we can explicitly install torch with CUDA support from the “torch” repository:
poetry add torch==2.0.1+cu118 --source torch
Note: PyPi will be used every time you install a Python package with Poetry unless you specify a different source — as we did when installing “torch”.
Now, your “pyproject.toml” file will look as follows:
[tool.poetry.dependencies]
python = ">=3.10,<3.12"
...
torch = {version = "2.0.1+cu118", source = "torch"}
...
[[tool.poetry.source]]
name = "PyPi"
priority = "default"
[[tool.poetry.source]]
name = "torch"
url = "https://download.pytorch.org/whl/cu118"
priority = "supplemental"
Install directly from a wheel
You can install the suitable torch Python package directly from its wheel to avoid adding supplemental Poetry sources.
poetry add "https://download.pytorch.org/whl/cu118/torch-2.0.0%2Bcu118-cp310-cp310-linux_x86_64.whl"
Now, your “pyproject.toml” file should contain something like this:
torch = {url = "https://download.pytorch.org/whl/cu118/torch-2.0.0%2Bcu118-cp310-cp310-linux_x86_64.whl", python = ">=3.10 <3.11"}
Note: You can also directly add the line above in your “pyproject.toml” file and recreate the .lock file as follows: poetry lock
You can navigate here to find other PyTorch wheels.
Different platforms
Leveraging the strategy from above, you can install different torch builds for different platforms, such as a CUDA-enabled torch on Linux and a CPU-based torch on macOS (darwin):
poetry add "https://download.pytorch.org/whl/cu118/torch-2.0.0%2Bcu118-cp310-cp310-linux_x86_64.whl" --platform linux
poetry add "https://download.pytorch.org/whl/cu118/torch-2.0.0%2Bcu118-cp310-cp310-linux_x86_64.whl" --platform darwin
Now, your “pyproject.toml” file should contain something like this:
torch = [
{url = "https://download.pytorch.org/whl/cu118/torch-2.0.0%2Bcu118-cp310-cp310-linux_x86_64.whl", platform = "linux", python = ">=3.10 <3.11"},
{url = "https://download.pytorch.org/whl/cpu/torch-2.0.0-cp310-none-macosx_11_0_arm64.whl", platform = "darwin", python = ">=3.10 <3.11"},
]
Manually install all CUDA dependencies
Note: This method can be extrapolated to any other installation method of PyTorch & CUDA.
Unfortunately, when installing torch with CUDA support through Poetry, it installs only the cuDNN & runtime libraries by default.
That will do the job most of the time, but some packages, such as “bitsandbytes”, need access to other CUDA libraries, such as “nvidia-cublas-cu11”.
You can install the whole CUDA stack inside a Poetry environment by manually installing all the “nvidia-*” Python packages — similar to what pip is doing under the hood. Run the code below in a terminal at the same level as your “pyproject.toml” file:
# Define the dependencies
dependencies=(
"nvidia-cuda-nvrtc-cu11==11.7.99"
"nvidia-cuda-runtime-cu11==11.7.99"
"nvidia-cuda-cupti-cu11==11.7.101"
"nvidia-cudnn-cu11==8.5.0.96"
"nvidia-cublas-cu11==11.10.3.66"
"nvidia-cufft-cu11==10.9.0.58"
"nvidia-curand-cu11==10.2.10.91"
"nvidia-cusolver-cu11==11.4.0.1"
"nvidia-cusparse-cu11==11.7.4.91"
"nvidia-nccl-cu11==2.14.3"
"nvidia-nvtx-cu11==11.7.91"
)
# Loop through each dependency and install it using Poetry
for dep in "${dependencies[@]}"
do
poetry add "$dep"
done
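Note: “poetry add” also accepts multiple packages in a single call, so the loop above can be collapsed into poetry add "${dependencies[@]}" if you prefer resolving the lock file once instead of eleven times.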
Note: To find the right “nvidia-*” versions for a given torch version, you can install your required torch version with pip in a dummy environment; pip will automatically pull in all the matching “nvidia-*” packages, and you can copy the versions from there.
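For example, here is a quick throwaway check (assuming a Unix shell; “/tmp/torch-probe” is just an arbitrary path):
# Create a disposable virtual environment and install the target torch version.
python -m venv /tmp/torch-probe
/tmp/torch-probe/bin/pip install torch==2.0.1
# List the exact "nvidia-*" versions pip resolved.
/tmp/torch-probe/bin/pip freeze | grep nvidia-
rm -rf /tmp/torch-probe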
After, you can expose only a single nvidia package (e.g., “nvidia-cudnn-cu11”) with the following code:
# Export CUDNN libraries.
export CUDNN_PATH=$(dirname $(poetry run python -c "import nvidia.cudnn;print(nvidia.cudnn.__file__)"))
export CUDA_RUNTIME_PATH=$(dirname $(poetry run python -c "import nvidia.cuda_runtime;print(nvidia.cuda_runtime.__file__)"))
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${CUDNN_PATH}/lib:${CUDA_RUNTIME_PATH}/lib
Or expose all of them using the following bash function:
#!/bin/bash
update_ld_library_path() {
    # Locate the site-packages "nvidia" directory by resolving one of its subpackages.
    local NVIDIA_DIR=$(dirname $(dirname $(poetry run python -c "import nvidia.cudnn;print(nvidia.cudnn.__file__)")))
    local updated_ld_library_path="${LD_LIBRARY_PATH}"

    # Append the "lib" folder of every installed "nvidia-*" package.
    for item in "$NVIDIA_DIR"/*
    do
        if [ -d "$item" ]; then
            updated_ld_library_path="${updated_ld_library_path}:${item}/lib"
        fi
    done

    echo "${updated_ld_library_path}"
}
…which can be called in your Makefile as follows (note that in doing so, we inject the specific updated LD_LIBRARY_PATH only to the training process, avoiding altering it globally):
train:
	@echo "Training..."
	$(eval NEW_LD_LIBRARY_PATH := $(shell . ./init.sh; update_ld_library_path))
	@echo Updated LD_LIBRARY_PATH: $(NEW_LD_LIBRARY_PATH)
	LD_LIBRARY_PATH=${NEW_LD_LIBRARY_PATH} poetry run python -m tools.train ...
or in a standard bash script:
#!/bin/bash
# Source the script to make the function available
. /path/to/your/script.sh
# Call the function and capture its output
NEW_LD_LIBRARY_PATH=$(update_ld_library_path)
echo "Updated LD_LIBRARY_PATH: $NEW_LD_LIBRARY_PATH"
LD_LIBRARY_PATH=${NEW_LD_LIBRARY_PATH} poetry run python -m tools.train ...
Note: I would avoid adding the code above in your “.bashrc” or “.zshrc” files, as these dependencies are coupled to your project, not your system. Thus, you don’t want to make them globally available.
#6. Docker
Installing your project inside a Docker container is kind of orthogonal to the methods mentioned above… emphasizing the word “kind of”.
You can still use pip, conda, or poetry to manage your Python dependencies, but installing torch with CUDA support inside a Docker container requires 2 additional steps.
Step 1: Use a CUDA image from Nvidia:
FROM nvidia/cuda:11.8.0-runtime-ubuntu22.04
...
COPY . /app
RUN chmod +x /app/deploy/entrypoint.sh
CMD ["bash", "/app/deploy/entrypoint.sh"]
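Note: For the container to actually see the GPU at runtime, the host must have the NVIDIA Container Toolkit installed, and the container must be started with the “--gpus” flag, e.g. (the “ml-pipeline” tag is just a placeholder name):
docker build -t ml-pipeline .
docker run --gpus all ml-pipeline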
Step 2 [Optional]: Now, depending on the method you use to install your dependencies inside the Docker image, you might need to point the Python process to the necessary CUDA libraries.
You can use the update_ld_library_path() bash function (described in the “Manually install all CUDA dependencies” Poetry section) inside your Docker “entrypoint.sh” script:
#!/bin/bash
# Source the script to make the function available
. /path/to/your/script.sh
# Call the function and capture its output
NEW_LD_LIBRARY_PATH=$(update_ld_library_path)
# Echo the updated LD_LIBRARY_PATH
echo "Updated LD_LIBRARY_PATH: $NEW_LD_LIBRARY_PATH"
# Check if ML_PIPELINE_PYTHON_MODULE is set and not null
if [ -z "$ML_PIPELINE_PYTHON_MODULE" ]; then
echo "The ML_PIPELINE_PYTHON_MODULE environment variable is not set or is null. Exiting."
exit 1
fi
echo "Running 'poetry run python -m ${ML_PIPELINE_PYTHON_MODULE} ${PYTHON_ARGS}'"
LD_LIBRARY_PATH=${NEW_LD_LIBRARY_PATH} poetry run python -m ${ML_PIPELINE_PYTHON_MODULE} ${PYTHON_ARGS}
#7. Test out the installation
Finally, here are some helpful utility functions to check that everything is installed correctly.
Check if CUDA is available:
python -c "import torch; print(torch.cuda.is_available())"
# e.g., True/False
See how many CUDA devices you have available:
python -c "import torch; print(torch.cuda.device_count())"
# e.g., 2
Get the CUDA device name based on its index:
python -c "import torch; print(torch.cuda.get_device_name(0))"
# e.g., NVIDIA GeForce RTX 3060
Get the current CUDA device index:
python -c "import torch; print(torch.cuda.current_device())"
# 0
Get a reference to an abstraction over a CUDA device using its index:
python -c "import torch; print(torch.cuda.device(0))"
# <torch.cuda.device object at 0x7f199b566140>
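Print the CUDA version your torch build was compiled against:
python -c "import torch; print(torch.version.cuda)"
# e.g., 11.8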
#8. Troubleshooting
Below are some issues that frequently come up when something goes wrong (which it usually does).
Check the compatibility between the versions of the following components:
- PyTorch <-> CUDA
- PyTorch <-> cudnn
- CUDA <-> your GPU
Other possible issues:
- Outdated GPU drivers: ensure your NVIDIA drivers are up to date.
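A quick way to inspect the driver side is “nvidia-smi”, which prints your driver version and the highest CUDA version that driver supports:
nvidia-smi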
Conclusion
There are many ways to install PyTorch & CUDA successfully.
Nowadays, installing PyTorch with CUDA support with pip or conda is relatively straightforward.
Unfortunately, many projects require you to use Poetry and Docker to install PyTorch & CUDA, where things get more complicated.
I hope this tutorial will bring some light at the end of the tunnel when installing various Deep Learning projects.
Note: If you know any other methods that worked well for you or wish to highlight other troubleshooting techniques, please explain them in the comments, and I will add them to this tutorial (with links to your socials to give you the proper credit). Let’s make installing PyTorch & CUDA easy!
💡 My goal is to help machine learning engineers level up in designing and productionizing ML systems. Follow me on LinkedIn or subscribe to my weekly newsletter for more insights!
🔥 If you enjoy reading articles like this and wish to support my writing, consider becoming a Medium member. By using my referral link, you can support me without any extra cost while enjoying limitless access to Medium’s rich collection of stories.
Thanks ✌️