Remo Hoeppli

Summary

The article discusses automating Python package versioning using pyproject.toml and Git, with insights into Semantic Versioning and Python-specific versioning practices.

Abstract

The article delves into the significance of software versioning, emphasizing the importance of a robust system to distinguish between different codebase iterations. It explores the adoption of Semantic Versioning (SemVer) in Python, highlighting the compatibility and discrepancies between SemVer and Python's versioning specifications. The author provides detailed examples to illustrate valid version identifiers for both systems and introduces a method for handling development versions effectively. The core of the article focuses on leveraging setuptools and setuptools-git-versioning to automate version management in Python projects, demonstrating how to configure pyproject.toml for seamless versioning. This approach is particularly beneficial for continuous integration and deployment (CI/CD) workflows, ensuring consistent and informative version tags derived from Git repository data.

Opinions

  • The author advocates for the use of Semantic Versioning in Python projects due to its widespread adoption and compatibility with other programming languages, facilitating consistent versioning across diverse codebases.
  • Manual versioning is seen as cumbersome, especially when dealing with multiple feature branches, reinforcing the need for an automated solution that can handle complex development cycles.
  • The author suggests that the flexibility of the metadata string in version tags can compensate for the lack of full SemVer compliance in Python, allowing developers to convey pre-release and development stage information.
  • The use of setuptools-git-versioning is presented as a practical solution to automate versioning, with the added benefit of integrating with CI/CD pipelines for smoother package management and deployment processes.
  • The author acknowledges potential pitfalls with the commit counter behavior in setuptools-git-versioning, suggesting that future enhancements could address these issues to avoid inconsistent version numbering.

Put Your Python Package Versions on Autopilot with Pyproject.toml and Git

The Python Project — Put Your Python Package Versions on Autopilot With Pyproject.toml And Git

TL;DR

Versioning your code can help you to distinguish between different versions and introduce meta information to better understand the evolution of your application. This article covers the compatibility between Semantic Versioning and version specifiers for Python as documented in the PyPA specs, and shows how the versioning of your Python code can be automated using setuptools and setuptools-git-versioning.

Table of Contents

· TL;DR
· Table of Contents
· The Hustle with Manual Software Versioning
· What About Semantic Versioning?
· Python and Semantic Versioning
  · Python Version Specifiers
  · Python and Semantic Versioning Summarized
· Making Use of the Metadata String
· Configuring the Pyproject.toml for Automated Versioning
  · How setuptools-git-versioning works
· Final Words
· About the Author

The Hustle with Manual Software Versioning

Every software developer understands the importance of implementing a robust versioning system for their code. In the context of this article, versioning primarily refers to assigning a version number or tag to your codebase, distinct from version control within a Git repository. The main goal of good versioning is to know whether you are running a production or a development version, and to tell two versions apart to see which one is more advanced. It helps you understand the evolution of your application.

Handling release versions, in general, isn’t such a big deal because a release version is in essence a counter that is increased to describe newer versions. The higher the version number becomes, the more evolved the application gets. At least in theory, that should be the case. Occasionally, we might come across a release that is buggy and seems worse than the previous one, but that should be rare if good project management is in place and the code is properly tested.

What is a bit more tricky is the handling of development versions. One reason is that, during development, there might be different feature branches that are developed in parallel, so increasing a single version counter just doesn’t cut it. There is also a need to tell the versions apart by feature and, in the best case, to have another counter that indicates how far the code of a certain feature has evolved. To briefly summarize, we need an easy-to-understand yet powerful way to put version labels on our code. It should take all these special cases into account and help us compare two or more different states of our application.

This all might sound a bit abstract, but I will try to further elaborate on the topic and bring in some examples for better clarification.

What About Semantic Versioning?

One of the most widely adopted versioning systems is Semantic Versioning, or SemVer for short. The most recent version of the Semantic Versioning specification is currently 2.0.0, which is already a good example of what the Semantic Versioning scheme looks like. In general, a version number in Semantic Versioning consists of three integers separated by dots.

The summary of the Semantic Versioning documentation describes it simply and straightforwardly:

Given a version number MAJOR.MINOR.PATCH, increment the:

1. MAJOR version when you make incompatible API changes
2. MINOR version when you add functionality in a backward compatible manner
3. PATCH version when you make backward compatible bug fixes

Additional labels for pre-release and build metadata are available as extensions to the MAJOR.MINOR.PATCH format.
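To make the three rules more tangible, here is a minimal Python sketch of how a plain MAJOR.MINOR.PATCH version could be incremented. The names SemVer, parse and bump are hypothetical helpers made up for this illustration, and the sketch deliberately ignores pre-release and build metadata labels.

```python
from typing import NamedTuple


class SemVer(NamedTuple):
    """A plain MAJOR.MINOR.PATCH version (hypothetical helper type)."""

    major: int
    minor: int
    patch: int

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


def parse(version: str) -> SemVer:
    """Parse a plain MAJOR.MINOR.PATCH string without any extra labels."""
    major, minor, patch = (int(part) for part in version.split("."))
    return SemVer(major, minor, patch)


def bump(version: SemVer, change: str) -> SemVer:
    """Return the next version for a 'major', 'minor' or 'patch' change."""
    if change == "major":  # incompatible API change
        return SemVer(version.major + 1, 0, 0)
    if change == "minor":  # backward compatible functionality
        return SemVer(version.major, version.minor + 1, 0)
    if change == "patch":  # backward compatible bug fix
        return SemVer(version.major, version.minor, version.patch + 1)
    raise ValueError(f"unknown change type: {change!r}")


print(bump(parse("1.2.3"), "minor"))  # 1.3.0
```

Note that the lower-order numbers are reset to 0 whenever a higher-order number is increased, as the Semantic Versioning specification requires.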

To gain a comprehensive understanding of Semantic Versioning in detail, one of the easiest methods is to explore the examples provided on regex101. These examples offer practical illustrations that can help clarify the nuances and conventions of Semantic Versioning. Included are valid as well as invalid examples that point out the dos and don'ts.
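If you prefer to try this locally rather than in the browser, the following sketch applies a regular expression modeled on the one suggested in the Semantic Versioning FAQ. The function name is_semver is an assumption made for this example, and I wrote the digit classes as [0-9] so that only ASCII digits match in Python.

```python
import re

# Pattern modeled on the regular expression suggested in the SemVer FAQ.
SEMVER = re.compile(
    r"^(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)"  # MAJOR.MINOR.PATCH
    r"(?:-((?:0|[1-9][0-9]*|[0-9]*[a-zA-Z-][0-9a-zA-Z-]*)"  # optional pre-release
    r"(?:\.(?:0|[1-9][0-9]*|[0-9]*[a-zA-Z-][0-9a-zA-Z-]*))*))?"
    r"(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$"  # optional build metadata
)


def is_semver(version: str) -> bool:
    """Return True if the string is a valid SemVer 2.0.0 version."""
    return SEMVER.match(version) is not None


for candidate in ["1.2.3", "1.2.3-alpha.1", "1.2.3+build.5", "01.2.3", "1.2"]:
    print(candidate, is_semver(candidate))
```

The first three candidates are accepted, while 01.2.3 (leading zero) and 1.2 (missing PATCH) are rejected.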

Python and Semantic Versioning

Why don’t we adopt Semantic Versioning for Python if it’s so effective? The straightforward answer is that we can indeed use it for Python. However, the more nuanced explanation is that while parts of Semantic Versioning are compatible with the Python versioning definition, not every type of version string falls into this compatibility.

The Python packaging specifications include detailed documentation on how versioning should be done.

In these guidelines, we also find one part that discusses compatibility with Semantic Versioning.

The “Major.Minor.Patch” (described in this specification as “major.minor.micro”) aspects of semantic versioning (clauses 1–8 in the 2.0.0 specification) are fully compatible with the version scheme defined in this specification, and abiding by these aspects is encouraged.

Semantic versions containing a hyphen (pre-releases — clause 10) or a plus sign (builds — clause 11) are not compatible with this specification and are not permitted in the public version field.

One possible mechanism to translate such semantic versioning based source labels to compatible public versions is to use the .devN suffix to specify the appropriate version order.

Specific build information may also be included in local version labels.

Again, this sounds a bit abstract, so let’s further clarify the compatibility between the Python versioning and SemVer in more detail using some examples.

Python Version Specifiers

There are two kinds of version identifiers in Python that partially work with the Semantic Versioning specification: public version identifiers and local version identifiers.

The public version identifier is supposed to look like this:

[N!]N(.N)*[{a|b|rc}N][.postN][.devN]

The local version identifier looks like this:

<public version identifier>[+<local version label>]

So let’s check some examples of version tags that are valid in both systems (Python public version identifiers and SemVer 2.0.0) or in only one of them:

# Public version identifiers compatible with SemVer 2.0.0 and Python

1.2.3


# Public version identifiers compatible with SemVer 2.0.0 only
# check with https://regex101.com/r/Ly7O1x/3/

1.2.3-alpha.1
1.2.3-beta.1
1.2.3-rc.1
1.1.2-prerelease


# Public version identifiers compatible with Python only
# check with https://regex101.com/r/SbZb6I/1

2024.1
1.2.3a1
1.2.3b1
1.2.3rc1
1.2.3.post1
1.2.3a1.post1
1.2.3.dev1
1.2.3a1.dev1
1.2.3.post1.dev1
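The Python examples above can also be checked with a few lines of Python. The sketch below is a strict check for the canonical form [N!]N(.N)*[{a|b|rc}N][.postN][.devN] only; the name is_public_version is an assumption for this example. Packaging tools may accept and normalize further spellings, which this pattern deliberately rejects.

```python
import re

# Strict, canonical public version identifiers only.
PUBLIC_VERSION = re.compile(
    r"^([1-9][0-9]*!)?"  # optional epoch, e.g. 1!
    r"(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))*"  # release segment N(.N)*
    r"((a|b|rc)(0|[1-9][0-9]*))?"  # optional pre-release aN, bN, rcN
    r"(\.post(0|[1-9][0-9]*))?"  # optional post-release .postN
    r"(\.dev(0|[1-9][0-9]*))?$"  # optional development release .devN
)


def is_public_version(version: str) -> bool:
    """Return True for a canonical Python public version identifier."""
    return PUBLIC_VERSION.match(version) is not None


for candidate in ["1.2.3", "2024.1", "1.2.3a1.post1", "1.2.3-alpha.1"]:
    print(candidate, is_public_version(candidate))
```

Only the last candidate, the SemVer-style pre-release with a hyphen, is rejected by this strict check.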

This means that the public version identifiers don’t give us many options that work in both systems besides plain release versions. Luckily, we get some more options with the local version identifiers, which go well with the Semantic Versioning build metadata notation. This comes in especially handy during the development cycle, since it allows us to craft the following version strings.

# Local version identifiers compatible with SemVer 2.0.0 and Python

1.2.3+3.some.text.goes.here.1


# Version strings with build metadata compatible with SemVer 2.0.0 only

1.0.0-alpha.1+2.some.text.goes.here.1



# Local version identifiers compatible with Python only

0.1.3a1+1.some.text.goes.here.1

Python and Semantic Versioning Summarized

To summarize the whole story: In situations where compatibility with the Semantic Versioning and Python versioning specification is needed, we do have the following options of valid version strings:

1.2.3
1.2.3+metadatastring

# where metadatastring can include
# - ASCII letters ([a-zA-Z])
# - ASCII digits ([0-9])
# - periods (.)
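As a quick way to test whether a version string falls into this common subset, here is a small sketch. The name is_cross_compatible and the exact pattern are my own assumptions derived from the rules listed above, not something taken from either specification verbatim.

```python
import re

# MAJOR.MINOR.PATCH plus an optional +metadata part that consists of
# period-separated groups of ASCII letters and digits.
COMMON_VERSION = re.compile(
    r"^(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)"
    r"(\+[0-9A-Za-z]+(\.[0-9A-Za-z]+)*)?$"
)


def is_cross_compatible(version: str) -> bool:
    """Return True if the string fits both SemVer and Python versioning."""
    return COMMON_VERSION.match(version) is not None


print(is_cross_compatible("1.2.3+3.some.text.goes.here.1"))  # True
print(is_cross_compatible("1.2.3-alpha.1"))  # False
print(is_cross_compatible("1.2.3a1"))  # False
```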

To be clear, we could also go with the normal Python version specification, but given that I work on projects that combine Python with other programming languages, I find it good to stick with Semantic Versioning. It helps to specify a version number that is consistent across all parts of the code.

Making Use of the Metadata String

Since pre- and post-releases aren’t cross-compatible between Semantic Versioning and Python versioning, we need to get creative with the metadata string. The good thing is that it offers quite a bit of flexibility. In my case, I aim for the following versioning and naming concept:

# Final release versions

1.2.3


# Pre-Release versions

1.2.3+alpha.1
1.2.3+beta.1
1.2.3+rc.1


# Dev-Release versions (feature independent)

1.2.3+dev.1


# Dev-Release versions (feature dependent)

1.2.3+3.some.feature.name.1

# where the metadata string is built in this way
# <branch_name>.<version_counter>
# in the case above this would mean
# branch_name = 3-some-feature-name (hyphens become periods)
# counter = 1
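The way such a feature-dependent version string is assembled can be sketched in a few lines of Python. The function build_dev_version is a hypothetical helper, and the rule of replacing every run of characters other than ASCII letters and digits with a period is my assumption for this illustration, not necessarily what a particular tool does internally.

```python
import re


def build_dev_version(tag: str, branch: str, counter: int) -> str:
    """Combine a release tag, a branch name and a counter into one version."""
    # Assumption: anything that is not an ASCII letter or digit becomes a
    # period, so the label stays valid for both SemVer and Python.
    label = re.sub(r"[^0-9A-Za-z]+", ".", branch).strip(".")
    return f"{tag}+{label}.{counter}"


print(build_dev_version("1.2.3", "3-some-feature-name", 1))
# 1.2.3+3.some.feature.name.1
```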

While this concept doesn’t make proper use of the pre-release notation of Semantic Versioning or the pre- and post-release segments of Python versioning, it doesn’t conflict with either the Semantic Versioning or the Python versioning guidelines. Moreover, it gives us valid version tags that contain all the information needed to distinguish between different versions and identify the more advanced one.

So far, so good. But how can we automate this process to make versioning management seamless?

As always, we achieve this using the appropriate tools, namely setuptools and setuptools-git-versioning.

Configuring the Pyproject.toml for Automated Versioning

For the scope of this article, I set up a git repository with a demo project. My repository looks as follows:

.
├── README.md
├── VERSION
├── my_demo_app
│   ├── __init__.py
│   ├── some code files
│   └── main.py
└── pyproject.toml

Important for us are the two files VERSION and pyproject.toml with the following content:

# pyproject.toml

[build-system]
requires = [
    "setuptools",
    "setuptools-git-versioning<2"
]
build-backend = "setuptools.build_meta"

[project]
name = "my_demo_app"
description = "A demo app to show automated versioning."
authors = [
    {name = "Remo Hoeppli", email = "[email protected]"},
]
readme = "README.md"
requires-python = ">=3.10"
license = {text = "GPL-3.0-or-later"}
dependencies = [
]
dynamic = ["version"]

[tool.setuptools.packages]
find = {namespaces = false}

[tool.setuptools-git-versioning]
enabled = true
version_file = "VERSION"
count_commits_from_version_file = true
template = "{tag}" # default setting
dev_template = "{tag}+{branch}.{ccount}"
dirty_template = "{tag}+{branch}.{ccount}"

The important steps are to add setuptools-git-versioning<2 to the requires list in the [build-system] section, to add version to the dynamic list in the [project] section so that setuptools knows that the version should be computed dynamically, and to add the [tool.setuptools-git-versioning] section.

In the [tool.setuptools-git-versioning] section, we define the name of our version file, which is, you guessed it, VERSION. With count_commits_from_version_file, we enable the commit counter (ccount), which counts the commits since the last change to our VERSION file or since the last tag it can find in the Git graph. Finally, we define what our version strings should look like. Further information on how setuptools-git-versioning works and how it is configured can be found on the official documentation page or in the source code repository on GitHub.

The second file of importance is the VERSION file, which contains only a version tag.

# VERSION

0.1.1

How setuptools-git-versioning works

To best describe how the configuration works, let’s assume a standard development process.

We are working on our project, and the next release version will be 1.0.0. So we change our VERSION file to contain this new version string and commit the change to our dev branch.

After this is done, we start to work on our first feature for the new release and create a new branch called 10-my-new-feature. If we wanted to install the version from our new branch locally, we could run pip install . from the root of our Git repository. This creates the following version for our package: 1.0.0+10.my.new.feature.0. Let’s take a look at this step by step. Given our template {tag}+{branch}.{ccount}, the string was constructed from the tag, which is the version from the VERSION file (because there is no Git tag to consider), the branch, which was derived from 10-my-new-feature, and the number of commits since the last commit that changed the VERSION file. If we make some code changes, then commit and push them to the Git repository, the next version string would be 1.0.0+10.my.new.feature.1.
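To confirm which version actually ended up in your environment after pip install ., you can ask the standard library for the installed package metadata. This is a minimal sketch: installed_version is a hypothetical helper, and the fallback value "unknown" is an arbitrary choice for the case that the package is not installed.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution: str, fallback: str = "unknown") -> str:
    """Return the installed version of a distribution, or a fallback."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        # The package is not installed in the current environment.
        return fallback


print(installed_version("my_demo_app"))
```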

One important thing to mention is the commit counter, which has two modes that could lead to a bit of confusion in certain scenarios. The first is to count the commits in your Git graph since the last commit that changed the VERSION file. The second, as mentioned above, is to count the commits since the last Git tag in the Git graph.

So assume you have already merged two feature branches into your dev branch. This means the commit counter in a newer feature branch originating from dev would not start from 0, because the last commit to the VERSION file was already multiple commits in the past. Furthermore, if we then set a development tag like 1.0.0+alpha.1 on the latest commit in our dev branch, this would reset the counter in our feature branch if it originated from the tagged commit. In certain situations, this could lead to inconsistent numbering in your development versions.
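The counter behavior described here can be modeled with a small simulation. Please treat it as a simplified sketch of my description and not as the actual implementation of setuptools-git-versioning: the Commit class and the commit_count function are hypothetical, the history is assumed to be linear, and a tag is assumed to take precedence over the VERSION file as soon as one exists.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Commit:
    """A commit in a simplified, linear history (hypothetical model)."""

    message: str
    changes_version_file: bool = False
    tag: Optional[str] = None


def commit_count(history: list[Commit]) -> int:
    """Count the commits made after the reference commit.

    The reference is the newest tagged commit if any tag exists,
    otherwise the newest commit that changed the VERSION file.
    """
    tagged = [i for i, c in enumerate(history) if c.tag is not None]
    versioned = [i for i, c in enumerate(history) if c.changes_version_file]
    if tagged:
        reference = tagged[-1]
    elif versioned:
        reference = versioned[-1]
    else:
        return len(history)
    return len(history) - 1 - reference


dev = [
    Commit("set VERSION to 1.0.0", changes_version_file=True),
    Commit("merge feature A"),
    Commit("merge feature B"),
]
print(commit_count(dev))  # 2, a new feature branch does not start at 0

dev[-1].tag = "1.0.0+alpha.1"
print(commit_count(dev))  # 0, the tag resets the counter
```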

[Figure: Behavior of the commit counter]
[Figure: Behavior of the commit counter with a Git tag]

If we checked out the tag 1.0.0+alpha.1 and installed this version using pip install ., it would install with the version 1.0.0+alpha.1, given that it is constructed from the template {tag}, which directly uses the Git tag. In the same way, we could finish our development on the current release, merge our dev branch into our main branch, and create a tag 1.0.0. Installing this tag would then result in the version 1.0.0. After this, we would start all over again, changing our VERSION file to 1.0.1 and continuing to work on our application.

Final Words

In this article, I explained how you can put your Python package versioning on autopilot to create automated version numbers for your application builds. This way of versioning your code can be especially useful when working with CI/CD and automated code builds, since it can automatically create version tags for your application from information available in your Git repository. I hope I was able to explain the configuration and behavior in a way that lets you apply it to your repository as well. As mentioned, there are some pitfalls to be aware of to avoid inconsistent numbering. At the time of writing, there is no way to configure the commit counter to not reset when a Git tag is added. However, this could be a nice addition to the setuptools-git-versioning package, since it would enhance the configuration possibilities and could avoid the issues mentioned.

In a future blog article, I will explain how you can use this automated versioning in a CI/CD scenario to automatically build and upload your package to a GitLab Python repository. So make sure to follow me and get informed on new articles on Medium. Furthermore, please let me know if you have any questions and thoughts on this topic, or if you find that I missed any important point. If you know people who are interested in these kinds of articles, sharing it with them would mean the world to me. Anyway, thank you so much for reading and see you next time.

About the Author

Remo Höppli is Co-Founder and Software Engineer at Earlybyte.

Earlybyte is an IT consultancy company specialized in developing digital solutions. The main focus of Earlybyte lies in the field of robotics backend systems and IoT.

Follow me on X/Twitter to get informed on new blog posts. Furthermore, add me on LinkedIn if you’d like to interact.
