The article discusses the optimization of integer addition in Python using bitwise operations and C++ extension modules to achieve performance comparable to native Python addition.
Abstract
The article delves into the concept of improving Python's performance for integer addition by employing bitwise operations. The author presents two methods: a slow lookup addition and a fast bitshift addition. The slow method involves a naive approach to addition using memoization and lookup arrays for remainders and carryovers. The fast method leverages bitwise operators, which simplifies the addition process by handling binary representations of numbers, thus avoiding the need for lookup arrays. The author then demonstrates how integrating C++ extension modules can further enhance performance, nearly matching the speed of Python's built-in addition operator. The article concludes with experimental results showcasing the efficiency gains from using bitwise operations and C++, particularly for large numbers and a high volume of additions.
Opinions
The author believes that a deep understanding of bytes and bits is crucial for algorithm optimization.
They suggest that using bitwise operations can significantly improve the performance of arithmetic operations in Python.
The author posits that while Python is versatile, its performance for certain operations can be greatly enhanced by integrating with lower-level languages like C++.
The article implies that the use of C++ extension modules is a practical approach to achieving high performance in Python applications.
The author's experiments indicate that the bitshift addition method, when implemented in C++, can perform additions almost as quickly as Python's native addition operator.
Using Bitwise Operations to Improve Python Performance
I’ll walk through an algorithm to do addition using Python. Using bitwise operators and C++ extension modules results in nearly on-par performance with base Python addition.
Background
I’ve recently been dealing with bytes and bits a lot. I thought I’d revisit some of the old algorithm problems I remember seeing back in the day. One classic problem is this:
Create an integer addition ‘+’ operator. Assume you can not use multiplication, division, or subtraction. For simplicity, assume inputs can only be non-negative integers.
Let’s first think of a simple naive solution to this. One way to do this would be to replicate long addition but optimize it using some simple memoization.
Slow Method: Lookup Addition
For long addition, you align the two numbers you want to add. Starting on the far right (first digit), you keep adding and carrying over terms if the two terms you’ve added are greater than 9.
We describe the algorithm in pseudo-code, then walk through a simple example.
Pseudo Code
We will explain more in-depth what this pseudo-code means using an example.
Lookup Addition Pseudo Code
Walking through the pseudo-code
Let's take an example for adding the two numbers 23 and 48. Think of these two numbers as two arrays of strings:
[‘2’, ‘3’] and [‘4’, ‘8’]
We align the numbers, then go right to left. At each position, we will only be adding two single digit numbers (such as 3+8 in position 1, and 2+4 in position 0).
We will need to determine two things:
Remainder term
Carryover term
We discuss what they mean in this example.
Position 1
Bring forward remainder term from prior step: (2+1, 4)
Remainder Step: remainder(3, 4) = 7
Carryover Step: carryover(3,4) = 0
Add remainder step to output array: [7, 1]
Final Step
Return concatenated array: 71
What we see is that in position 1, the remainder step gives us the digit in the output, whereas the carry-over step will increment the digits in the next position (i.e. position 0). We repeat this process, appending an output array using the remainder step until we have exhausted both arrays.
Python Code
Now for the python code. Unfortunately, it’s very complex. We won’t spend time refactoring this, because another solution is both more readable and faster.
Simple speed ups
We can memoize the remainder and carryover steps, by defining lookup arrays: Remainder and Carryover. We won’t delve any deeper into this, but it was the main speed improvement for this algorithm.
Remainder Array
Carryover Array
Fast Method: Bitshift Addition
Another avenue for a solution is using bitshift operators. Bit shifting is equivalent to multiplying or dividing a number by some base number. For example, in base-2 the number 4 is represented as [1,0,0]. Remember, bits are just coefficients, so 4=1*2²+0*2¹+0*2⁰.
Bit shifting 4 to the left is equivalent to multiplying 4 by the number 2, thus BitshiftLeft(4)=8=[1,0,0,0].
Pseudo Code
Below we have our pseudo-code for a bitwise-based algorithm. You notice right off the bat that bitshift addition does not rely on lookup arrays, and its pseudo-code is much much shorter. It turns out its python/c++ code is also very very short.
Walking through the pseudo-code
I'll walk through the logic of how this works. Let’s do a simpler addition than before, assuming we want to add 5+3.
Let’s covert 5, 3, and 8 (the answer) into base-2 binary representation:
5=1*2² +0*2¹+1*2⁰
3=0*2² +1*2¹+1*2⁰
8=1*2³+0*2²+0*2¹ +0*2⁰
What we will do is basically replicate our carry-over logic from above, but this time in binary. This allows us to do the carry-over in a “vectorized” fashion.
The approach attempts to find two-bit sets such that they have no overlapping bits whose values are greater than 0. To do so, we alternate using bitshifts and bitmasks. The algorithm's base case defines x, y as the input values we're trying to add together:
x=[1, 0, 1] (i.e. 5 in base-2)
y=[0,1,1] (i.e. 3 in base-2)
Iteration 1
Step 1: Carryover step using AND and Bitshift bitwise operator
This step simply looks for bits shared between 3 and 5, and then carries them forward in the next step. We combine AND and bitshift left to redefine x:
x=BitshiftLeft(And(x,y)) = [0,1,0]
Breaking these steps down:
And(x,y) = [0, 0, 1] (because 5 and 3 share 2⁰)
BitshiftLeft([0,0,1]) = [0, 1, 0] (this is equivalent to 2¹=2⁰+2⁰)
Step 2: Prevent doubling counting step using OR and Bitmask bitwise operator
In step 1, we eliminated bits that were shared between x and y. We want to remove those bits from y.
y=BitMask(Or(x,y), ~And(x,y))=[1,1,0]
Breaking these steps down:
~And(x,y) = [1, 1, 0] (these are bits we did not eliminate in step 1, we want to keep)
Or(x,y) = [1, 1, 1] (these are bits we way need to mask)
BitMask([1, 1, 1], [1, 1, 0]) = [1, 1, 0] (mask the bits in 2⁰ position)
Stop Condition
We keep doing these iterations until x and y do not share any bits, i.e. we can simply add together the bit representation. In this example, this is the final values of x and y:
Final x=[1, 0, 0, 0]
Final y=[0, 0, 0, 0]
We use the OR operator to get our solution:
8 = Or(x, y) = [1, 0, 0, 0]
Python Code
The python code is relatively short and straightforward.
Simple Speedups — C++ Extension Module
One “easy” speed up is to code this up in C++ and create a C++ extension module. The code for the extension module is below. The actual C++ function we will call in python is defined as c_bitadd_method, the rest of the code is to define c++ classes/methods so it can integrate with Python.
Results
I do a simple experiment, varying two parameters:
Total Number of Samples: use three values: 100k, 1mil, 10mil
Largest Possible Number: use three values: 100k, 10mil, 1bil
These two parameters vary the total number of additions being done, as well as, the size of the integers. For example, when Largest Possible Number = 1bil this means that potentially the add function will need to add together integers as large as 1 billion.
I test the 4 algorithms:
Python Base Add Operator
Lookup Addition (LookupAdd)
Bitshift Addition — Python (PyBitshiftAdd)
Bitshift Addition — C++ (CPyBitshiftAdd)
The results are below and fairly clear:
Lookup Addition is really slow: this is expected
Bitshift Addition compiled in C++ performs very close to the Python Base Add operation, which is pleasant to see given this was the first version:
Conclusion
This post was a simple walk-through of using bitwise operations and C++ to create an additio function that has on-par performance with Python’s addition operators.
Want toConnect?
I discuss on similar topics onmy personal website.