Astral uv sped up my HPC workflows

Author

Aron van de Pol

Published

December 6, 2024

Back in 2020, during the pandemic, I dove into Python while taking online Digital Humanities courses in the Netherlands. Those late nights of coding were simpler times. I was blissfully naive about package management, installing everything through Jupyter notebooks into my base environment. If it worked on my machine, I assumed it would work anywhere. Surely the experienced programmers, developers, and maintainers had thought carefully about all this. Just run those !pip install ... cells, and you’re good to go!

The Evolution of My Python Needs

As my work grew more demanding, I faced new challenges:

  • My laptop couldn’t handle the computational load
  • I shifted from notebooks to scripts
  • Code needed to run across multiple machines (laptop, desktop, and Leiden University’s ALICE HPC Cluster)

I still use notebooks a LOT for testing or teaching. They are great for that.

This led to the classic Python environment nightmare, well captured by this XKCD comic:

The problem with Python… - xkcd.com

The situation was already complex on my laptop alone. Add in my desktop and ALICE HPC, and it became a management nightmare. Sure, there were tools like pip, conda with requirements.txt, and Poetry, but they either did not work as smoothly as I wanted or required significant setup time, time I’d rather spend coding.

uv: Lightning-Fast Package Management

Recently, I discovered uv by Astral (the same people behind Ruff, which I already use and love). Not only did uv solve my environment management headaches, it is also fast. Where pip or conda might take minutes to resolve and install dependencies, uv has, in my experience, finished the same task in seconds.

uv solved two major pain points for me:

  1. Running code quickly on any device without venv hassles through its Scripts concept
  2. Managing larger projects efficiently through its Project concept

The Scripts Concept

Sometimes you just need to run a quick script with a few dependencies. No need for another venv cluttering your system. Here’s how uv’s scripts concept transforms this:

Traditional Approach (requiring pandas in your environment):

import pandas as pd
df = pd.read_csv('test.csv')

uv Scripts Approach:

# /// script
# dependencies = [
#   "pandas",
# ]
# ///

import pandas as pd
df = pd.read_csv('test.csv')

Just run uv run test.py, and uv handles everything - downloading and installing dependencies in mere seconds before executing your code. This has been particularly valuable for running scripts on the HPC cluster. In SLURM scripts, I can simply use uv run {script name} instead of python {script name}.
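To make the SLURM part concrete, here is a minimal sketch in Python that generates a batch script calling uv run. The job name, time limit, and memory request are hypothetical placeholders rather than ALICE-specific settings, and the sketch assumes uv is available on the compute nodes.

```python
def render_slurm_script(script_name: str, job_name: str = "uv-job",
                        time_limit: str = "00:10:00", memory: str = "4G") -> str:
    """Return the text of a SLURM batch script that runs a Python file via uv.

    All default values are hypothetical; adjust them to your own cluster.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --time={time_limit}",
        f"#SBATCH --mem={memory}",
        "",
        "# Assumes uv is on PATH on the compute node.",
        f"uv run {script_name}",
    ]
    return "\n".join(lines) + "\n"


# Print the batch script; save it to a file and submit that file with sbatch.
print(render_slurm_script("test.py"))
```

The only change compared to a classic batch script is the last line: uv run test.py takes the place of python test.py, so no environment has to be activated first.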

This is a brief example; the uv scripts documentation covers much more.
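The comment block at the top of such a script follows the inline script metadata format standardized in PEP 723. As a rough illustration of how a tool can read it (this is not uv's own implementation), the sketch below uses the reference regular expression from PEP 723 to pull the TOML text out of a script. The function name extract_script_metadata and the sample source are made up for illustration.

```python
import re
from typing import Optional

# Reference pattern from PEP 723 for locating "# /// <type>" comment blocks.
METADATA_BLOCK = re.compile(
    r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$"
)


def extract_script_metadata(source: str) -> Optional[str]:
    """Return the TOML text of the 'script' block, or None if there is none."""
    matches = [
        m for m in METADATA_BLOCK.finditer(source) if m.group("type") == "script"
    ]
    if len(matches) > 1:
        raise ValueError("Multiple script blocks found")
    if not matches:
        return None
    # Strip the leading "# " (or a bare "#") from every line of the block.
    return "".join(
        line[2:] if line.startswith("# ") else line[1:]
        for line in matches[0].group("content").splitlines(keepends=True)
    )


# Built line by line so this file does not itself contain a metadata block.
SAMPLE = "\n".join([
    "# /// script",
    "# dependencies = [",
    '#   "pandas",',
    "# ]",
    "# ///",
    "",
    "import pandas as pd",
    "",
])

print(extract_script_metadata(SAMPLE))
```

On Python 3.11 or newer, passing the returned text to tomllib.loads gives a dictionary with a dependencies list.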

The Project Concept

For larger projects, uv’s project management works through pyproject.toml. Here’s a basic example:

[project]
name = "printshop-attrimil"
version = "0.1.0"
description = "project for printshop"
readme = "README.md"
requires-python = ">=3.8,<3.10"
dependencies = []

Managing dependencies is straightforward and lightning-fast:

  • Add packages with uv add ...
  • Remove them with uv remove ...

After adding dependencies, your pyproject.toml might look like this:

[project]
name = "printshop-attrimil"
version = "0.1.0"
description = "project for printshop"
readme = "README.md"
requires-python = ">=3.8,<3.10"
dependencies = [
    "torch>=2.5.1",
    "torchvision",
    "pytorch-lightning",
    "tensorboard",
    "pillow",
    "numpy<2.0",
    "tqdm",
    "matplotlib",
    "scikit-learn",
    "albumentations",
    "opencv-python",
    "timm",
    "transformers",
    "einops",
    "umap-learn",
    "rich",
    "requests",
    "comet_ml",
    "llvmlite==0.36.0",
    "tensorflow>=2.7.4",
]

uv automatically handles compatibility checks between packages and your Python version, making dependency management much more reliable. The speed difference becomes noticeable with complex dependency trees like this. What might take pip several minutes to resolve and install, uv accomplishes in a fraction of the time.
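To give a feel for the Python-version side of those checks, here is a deliberately simplified sketch that tests a version against a requires-python range such as ">=3.8,<3.10". It is not uv's resolver: it only handles plain numeric versions with the basic comparison operators, and the helper names are hypothetical.

```python
import operator

# Two-character operators come first so ">=" is not mistaken for ">".
OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_version(text: str) -> tuple:
    """Turn '3.9' into (3, 9, 0) so versions compare numerically, not as strings."""
    parts = [int(part) for part in text.strip().split(".")]
    parts += [0] * (3 - len(parts))  # pad to major.minor.patch
    return tuple(parts)


def satisfies(version: str, specifiers: str) -> bool:
    """Check a version against a comma-separated range like '>=3.8,<3.10'."""
    candidate = parse_version(version)
    for clause in specifiers.split(","):
        clause = clause.strip()
        for symbol, compare in OPERATORS.items():
            if clause.startswith(symbol):
                if not compare(candidate, parse_version(clause[len(symbol):])):
                    return False
                break
        else:
            raise ValueError(f"Unsupported specifier: {clause!r}")
    return True


for candidate in ("3.8", "3.9.7", "3.10", "3.12"):
    print(candidate, satisfies(candidate, ">=3.8,<3.10"))
```

Note that 3.10 falls outside the range: versions compare component by component, so 3.10 is newer than 3.9 even though it looks smaller as a decimal number.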

For more, see the uv documentation on the Project concept.

Python Versions

One more powerful feature of uv is its seamless Python version management. Unlike traditional approaches that require separate tools like pyenv or conda, uv handles Python versions automatically. You don’t even need Python installed on your system to get started: uv will fetch and manage Python versions as needed.

The basics are incredibly simple:

# Install the latest Python version
uv python install

# Install specific Python version(s)
uv python install 3.12

# Or multiple versions at once
uv python install 3.11 3.12

When you use uv’s scripts or projects, uv automatically downloads the required Python version based on your specifications. This is particularly valuable when you frequently move between devices.
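For scripts, the Python requirement can sit in the same inline metadata block as the dependencies, using the requires-python field of the inline script metadata format. A small sketch, with a hypothetical file name check_python.py and an arbitrarily chosen version bound:

```python
# /// script
# requires-python = ">=3.12"
# dependencies = []
# ///

import sys


def interpreter_version() -> str:
    """Return the running interpreter's version as 'major.minor'."""
    return f"{sys.version_info.major}.{sys.version_info.minor}"


print(f"Running on Python {interpreter_version()}")
```

Running uv run check_python.py should select an interpreter that satisfies the bound, downloading one if needed, before executing the script.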

Again, for more, see the uv Python documentation.

The Impact on My Workflow

Moving to uv has transformed how I manage Python environments across my machines. Whether I’m prototyping on my laptop, running computations on my desktop, or scaling up to the ALICE HPC cluster, I spend less time wrestling with conflicting dependencies or waiting minutes for package installations.

For anyone working across multiple machines or dealing with HPC environments, consider uv!

There is a lot more to uv, such as tools or publishing packages, but for my workflow, this has already made a difference.