Shashank Iyer

Summary

The web content provides a guide on how to compile Python applications using Nuitka, with a focus on managing editable installs and protecting proprietary code during the compilation process within a Docker environment.

Abstract

The article titled "Software Engineering | Python Compilation" offers a concise three-step approach to compiling Python applications using Nuitka, emphasizing the protection of proprietary code. It begins by acknowledging the popularity of Pipenv for managing dependencies and then transitions into the rationale for compiling Python code to safeguard intellectual property. The author recommends installing Nuitka with the Python interpreter to avoid conflicts with other dependencies managed by Pipenv. The article outlines an example project structure that includes both open-source and internal dependencies, and it advises on using Pipfile for managing these dependencies effectively. The author advocates for the use of Docker to compile the source code selectively, excluding open-source libraries to streamline the process and avoid potential errors. The compilation is divided into three Docker stages: compiling source code and private modules, extracting open-source dependencies, and creating the runtime environment. This approach aims to balance the need for code security with efficient development practices.

Opinions

  • The author expresses a preference for using Nuitka to compile Python code as a means to hide proprietary source code from potential reverse engineering.
  • It is suggested that Nuitka should be installed at build time with pip and not included in the Pipfile to prevent interference with other dependencies installed by Pipenv.
  • The author's approach to project structure involves using git submodules for internal dependencies rather than specifying URLs in the Pipfile to avoid permission issues during Docker builds.
  • The use of a Docker multi-stage build process is recommended for separating the compilation of proprietary code from the installation of open-source dependencies, which is seen as beneficial for faster iterations and error avoidance.
  • The author values the community and encourages readers to engage by clapping, following, and providing feedback or suggestions for improving the compilation process.

Software Engineering | Python Compilation

3 Simple Steps To Compile Your Python Application Using Nuitka

Elegantly manage editable installs at compile time, while continuing to enjoy the conveniences of Pipenv

Photo by Aziz Acharki on Unsplash

Pipenv continues to gain popularity as a Python dependency management tool.

It is convenient to use and eliminates the need to manually create a venv. Because it can install modules in editable mode, it also minimizes the need for distributable artifacts. These and many other features simplify package installation and let data scientists and software engineers spend more time on their core responsibilities.

As a regular user of pipenv, I am generally happy with its approach to resolving multi-level dependencies and accurately setting up a virtualenv.

Laying the Groundwork for Compilation

All of us have at least one savvy customer with one or more teams quietly working on reverse-engineering the products they buy.

Better to hide as much of the source code as possible, right?

So if your application has both open-source and internal dependencies, you need an effective strategy to hide just enough of the application’s source code.

…why go all out and compile what’s already available to everyone?

If you’re looking to use Nuitka to compile your code, this article should serve as a good reference.

Basic Installation

Once Nuitka is installed, the documentation recommends running it with the following command:

<the_right_python> -m nuitka

This way you are absolutely certain which Python interpreter you are using, so it is easier to match it with the interpreter Nuitka is installed for.

In my experience, it’s best to install Nuitka at build time with pip and NOT include it in your Pipfile. When it is listed in the Pipfile, Nuitka seems to interfere with other dependencies installed by pipenv.
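A minimal sketch of that advice, assuming you drive the build from a Python script: both the install and the version check go through the same interpreter, so pip and Nuitka cannot end up attached to different Pythons. The helper names are hypothetical, and the version is passed in by the caller rather than assumed here.

```python
import sys


def install_nuitka_command(version):
    """Build the pip command that installs a pinned Nuitka for this interpreter."""
    return [sys.executable, "-m", "pip", "install", f"nuitka=={version}"]


def nuitka_version_command():
    """Build the command that asks this interpreter's Nuitka for its version."""
    return [sys.executable, "-m", "nuitka", "--version"]
```

Each list can be handed to subprocess.run(..., check=True) at build time. Nothing here touches the Pipfile, which keeps Nuitka out of the pipenv-managed environment.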

Example Project Structure

Let’s suppose your application is laid out as follows:

your_app
|
|__lib/
|__utils/
|__src/
|  |__a.py
|  |__entry_point.py
|
|__internal_modules/private_module
|                  |__setup.py
|                  |__algos/*.py
|                  |__protobufs/*.py
|
|__Pipfile
|__Pipfile.lock
|__setup.py
|__pyproject.toml

Where private_module is your internal git repository that contains platform software interfaces or proprietary algorithms used by your group.

If private_module isn’t installed from a distributable artifact (like a wheel file) but as an editable install, your Pipfile will contain an entry for it. A second entry comes from the setup.py in your_app’s root directory: having a setup.py at the root level enables pipenv to locate lib, utils, and src, so you can control additions to sys.path from your setup script. The two entries may look like this:

private-module = {editable = true, path = "./internal_modules/private_module"}
your-app = {editable = true, path = "."}

In the previous example, I have shown the submodule paths as relative to the root project. This is because my preference is always to include all required code as a git submodule instead of specifying URLs in the Pipfile. It avoids having to deal with permission issues during the docker build process.
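The editable entries matter later, because they are exactly the packages that get compiled rather than installed from PyPI. As a rough sketch, they can be picked out programmatically. The function name is hypothetical, the dict mirrors what a TOML parser would return for the Pipfile's [packages] table, and the "requests" entry is a made-up open-source dependency added only for contrast.

```python
def editable_packages(packages):
    """Return the names of [packages] entries that are installed in editable mode."""
    return sorted(
        name
        for name, spec in packages.items()
        if isinstance(spec, dict) and spec.get("editable") is True
    )


# Dict equivalent of the Pipfile entries above, plus one made-up
# open-source dependency ("requests") for contrast.
packages = {
    "private-module": {"editable": True, "path": "./internal_modules/private_module"},
    "your-app": {"editable": True, "path": "."},
    "requests": "*",
}

print(editable_packages(packages))
```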

If you’d rather have pipenv download your internal modules, check out this article for a deep dive into how you may achieve that from inside a container.

And if you need a refresher on how to use a git submodule with pipenv, this article should serve as a good reference.

Compiling Before Shipping

My preference for compiling with Nuitka is to do it in 3 steps:

  1. Select an appropriate Docker image (and compile your src code during the docker build stage)
  2. Compile your source code and all private_modules
  3. Install open-source dependencies in your docker runtime.

Hide The High-Value Stuff

Let’s assume we have selected python:3.9-buster as our base image. This is a Debian image that ships with a Python interpreter already installed.

Using our example project structure from above, the Dockerfile can be laid out as follows:

FROM python:3.9-buster as compile-stage

WORKDIR /Application

COPY . .

RUN pip install nuitka==<version>
# Make algos and protobufs importable as top-level packages
ENV PYTHONPATH=$PYTHONPATH:/Application/internal_modules/private_module

# --output-filename is the long form of Nuitka's -o option
RUN python -m nuitka \
  --include-package=lib \
  --include-package=utils \
  --include-package=src \
  --include-package=algos \
  --include-package=protobufs \
  --output-filename=app.bin \
  src/entry_point.py
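If the list of packages changes often, the Nuitka call in this stage could be assembled by a small script instead of being hand-edited. The sketch below is an assumption on my part, not part of the Dockerfile: the function name is hypothetical, and it uses --output-filename, the long form of Nuitka's -o option, to name the binary.

```python
import sys


def nuitka_compile_command(entry_point, packages, output_name):
    """Build a Nuitka command that compiles entry_point plus the listed packages."""
    command = [sys.executable, "-m", "nuitka"]
    # One --include-package flag per package whose code should be compiled in.
    command += [f"--include-package={name}" for name in packages]
    command += [f"--output-filename={output_name}", entry_point]
    return command


command = nuitka_compile_command(
    "src/entry_point.py",
    ["lib", "utils", "src", "algos", "protobufs"],
    "app.bin",
)
print(" ".join(command))
```

Keeping the command as a list means it can be passed straight to subprocess.run without any shell quoting.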

Extract Open-Source Dependencies

This can be done as part of the second stage of the docker build process.

Note: all modules installed as editables by pipenv should be excluded from the generated requirements file in this build stage.

FROM python:3.9-buster as dependency-builder

WORKDIR /Application

COPY . .

RUN pip install pipenv==<version>
RUN pipenv sync # accurately recreate the virtualenv
RUN pipenv run pip freeze --exclude private-module --exclude your-app > requirements.txt
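If the pip in your image is too old to support --exclude, the same filtering can be done on the freeze output itself. This is only a sketch under that assumption: the function names are hypothetical and the sample freeze output, including its version numbers, is made up for illustration.

```python
import re


def normalize(name):
    """Normalize a project name so private_module and private-module compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def open_source_requirements(freeze_output, excluded):
    """Drop editable installs, comments, and excluded projects from pip freeze output."""
    excluded_names = {normalize(name) for name in excluded}
    kept = []
    for raw_line in freeze_output.splitlines():
        line = raw_line.strip()
        # Skip blanks, pip's comment lines, and "-e <path>" editable entries.
        if not line or line.startswith("#") or line.startswith("-e "):
            continue
        # The project name is everything before the first version specifier or "@".
        name = re.split(r"\s*(?:==|@|>=|<=|~=|!=|<|>)", line, maxsplit=1)[0]
        if normalize(name) in excluded_names:
            continue
        kept.append(line)
    return kept


# Made-up sample output; the versions are placeholders.
sample = """numpy==1.26.4
# Editable install with no version control (your-app==0.1.0)
-e /Application
private_module==0.1.0
pandas==2.2.2
"""

print(open_source_requirements(sample, ["private-module", "your-app"]))
```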

Create the Runtime

FROM python:3.9-buster as runtime

WORKDIR /Application

COPY --from=compile-stage /Application/app.bin .
COPY --from=dependency-builder /Application/requirements.txt .

RUN pip install -r requirements.txt # install open source requirements specified by Pipfile.lock in your_app/
CMD ["./app.bin"]

Building a Docker image with this 3-stage Dockerfile compiles only your intellectual property (IP), not the open-source libraries.

This makes the compilation stage run faster, enabling quicker iterations, and also avoids errors that can occur when compiling complex libraries like scikit-learn or pandas.
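Since the whole point is to keep proprietary source out of the shipped image, a quick check on the runtime filesystem can be reassuring. The sketch below is my own addition rather than part of the build: the function name is hypothetical, and it simply reports any .py file that sits under a directory named after one of your private packages.

```python
import os


def leaked_sources(root, private_packages):
    """List .py files under root that sit inside a directory named after a private package."""
    private = set(private_packages)
    leaks = []
    for dirpath, _dirnames, filenames in os.walk(root):
        relative_dir = os.path.relpath(dirpath, root)
        parts = [] if relative_dir == "." else relative_dir.split(os.sep)
        # Only directories whose path includes a private package name are of interest.
        if not private.intersection(parts):
            continue
        for filename in filenames:
            if filename.endswith(".py"):
                leaks.append(os.path.join(relative_dir, filename))
    return sorted(leaks)
```

Run against /Application in the runtime image with the package names from the compile stage (lib, utils, src, algos, protobufs), an empty list means only the compiled binary and requirements file were copied over.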

This is my preferred approach for compiling application python code. If you have suggestions on how to do this more elegantly, let me know in the comments section.
