Karan Kaul | カラン

After Reviewing Tons of Resources, Here is — How To Make Your Python Code Run Faster 🏃🏻💨

Some effective & unexplored tips on improving your Python code’s performance, with time comparisons.

Introduction

Once I am done writing any piece of code, I always feel like there is room for improvement. Sometimes it can be a small UI change that makes the application more fun to use or sometimes it’s the question —

“Can I make this go any faster?”

I mostly try to do everything myself, which is not always the best thing to do. After rewriting code, re-arranging it, changing pipelines, removing redundant parts & still not seeing any drastic differences — I gave up & finally thought about researching what the internet & our “AI friends” 🤖 are suggesting.

But, I did not want some generic tips & solutions that were present in probably 100s of articles all over the internet. I wanted techniques that were not very popular but were total game-changers. This also had 2 benefits —

  • I would be working with something new that was not over-used.
  • I would get something to show off & explain to my colleagues. 😈

Therefore, after going through numerous articles, videos, even AI tools & other resources online, I compiled a list of 6 things that all of us can do today to make our Python code run faster. I am fairly confident that you will see a difference once you try these out.

👋🏻 Before we begin, I want to thank everyone who is following me on Medium. It means a lot & it really motivates me to continue writing.

Use PyPy Interpreter 🔥

PyPy is an alternative Python interpreter that includes a just-in-time (JIT) compiler, which can make long-running, pure-Python code (tight loops, numeric work) run much faster than on the standard CPython interpreter. Code that spends most of its time inside C extension libraries usually gains less & some C extensions run slower or are not supported on PyPy, so check the libraries you depend on before switching.

Usage

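To use this interpreter, download it from the PyPy website & add its folder to your PATH environment variable (on Windows). Then select PyPy as the interpreter in the IDE of your choice, or run your script from a terminal with the PyPy executable instead of python.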
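
A quick way to confirm which interpreter actually ran your script is to ask Python itself. This is a minimal sketch using only the standard library; the variable name is just for illustration.

```python
import platform
import sys

# Name of the running implementation, e.g. "CPython" or "PyPy"
implementation = platform.python_implementation()

print(f"Running on {implementation} {platform.python_version()}")
print(f"Executable: {sys.executable}")
```

If the first line does not say PyPy, the IDE or terminal is still pointing at the standard interpreter.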

Comparison — PyPy 3.10 vs Python 3.10

The code that will be used for testing —

import time

def calculate_sum(limit):
    total = 0
    for i in range(limit):
        total += i
    return total

if __name__ == "__main__":
    limit = 10**8

    # Measure the start time
    start_time = time.time()

    # Call the function
    result = calculate_sum(limit)

    # Measure the end time
    end_time = time.time()

    # Calculate the execution time
    execution_time = end_time - start_time

    # Print the result and execution time
    print(f"The sum of numbers from 0 to {limit - 1} is: {result}")
    print(f"Execution time: {execution_time:.6f} seconds")

Execution time comparison —

# pypy with debugging
The sum of numbers from 0 to 99999999 is: 4999999950000000
Execution time: 0.185122 seconds

# python with debugging
The sum of numbers from 0 to 99999999 is: 4999999950000000
Execution time: 14.743533 seconds

# pypy without debugging
The sum of numbers from 0 to 99999999 is: 4999999950000000
Execution time: 0.110704 seconds

# python without debugging
The sum of numbers from 0 to 99999999 is: 4999999950000000
Execution time: 5.526258 seconds

Crazy results, right? 🤯 Try it out yourself & see how much time you can save in your projects. Leave a comment if this tip was helpful!
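
Single time.time() measurements can be noisy, so if you want to repeat this comparison yourself, one option is the standard library's timeit module. The sketch below is an assumption about how you might set that up, not the script that produced the numbers above, & it uses a much smaller limit so it finishes quickly on either interpreter.

```python
import timeit

def calculate_sum(limit):
    total = 0
    for i in range(limit):
        total += i
    return total

limit = 10**5   # assumption: small value so the sketch runs fast
calls = 10      # calls per timing run

# Run the function `calls` times, repeat that 3 times, keep the best run
timings = timeit.repeat(lambda: calculate_sum(limit), number=calls, repeat=3)
best = min(timings) / calls

print(f"Best of 3 runs: {best:.6f} seconds per call")
```

Run the same file under both interpreters & compare the "best" values; taking the minimum reduces the effect of other programs running on your machine.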

Auto Cache Results 💿

Implement caching using tools like functools.lru_cache to store the results of expensive function calls, reducing the need for recomputation.

You can cache the results yourself but this tool takes care of that for you in a very optimized manner.

Usage

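The fibonacci function in the test code below is decorated with functools.lru_cache(maxsize=128), which means it keeps the results of up to 128 most recently used calls; once the cache is full, the least recently used entry is dropped. If the function is called again with the same arguments, it returns the cached result instead of recomputing it. The arguments must be hashable, because they are used as the cache key.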
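
Separately from the Fibonacci test further down, here is a tiny sketch of what maxsize does. It uses a hypothetical square function & a deliberately small maxsize=2 so that eviction is easy to see; cache_info() is the standard way to inspect hits & misses.

```python
import functools

calls = []  # records every time the function body actually runs

@functools.lru_cache(maxsize=2)
def square(n):
    calls.append(n)
    return n * n

square(1)  # miss: computed & cached
square(1)  # hit: served from the cache
square(2)  # miss: computed & cached
square(3)  # miss: cache is full, least recently used entry (1) is evicted
square(1)  # miss again: 1 was evicted, so it is recomputed

info = square.cache_info()
print(calls)  # [1, 2, 3, 1]
print(info)   # CacheInfo(hits=1, misses=4, maxsize=2, currsize=2)
```

If the hit count stays at zero in your own code, the function is probably being called with different arguments every time & caching will not help.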

Comparison — Cached vs Non Cached code

The code that will be used for testing —

import functools

# Decorate the function with lru_cache
@functools.lru_cache(maxsize=128)
def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

if __name__ == "__main__":
    num_terms = 10

    # Calculate and print the first 'num_terms' Fibonacci numbers
    for i in range(num_terms):
        fib_number = fibonacci(i)
        print(f"Fibonacci({i}) = {fib_number}")

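For the time comparison, I ran the code with & without the lru_cache decorator. The output below shows Fibonacci(29), so the timed run used a larger input than the num_terms = 10 in the listing above (the timing code is not shown):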

# without the lru_cache decorator
Fibonacci(29) = 514229
Execution time: 0.579581 seconds

# with the lru_cache decorator
Fibonacci(29) = 514229
Execution time: 0.005953 seconds
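
The listing above does not include the timing code, so here is one way the comparison could be reproduced. Treat it as a sketch under my own assumptions: the names fib_plain, fib_cached & timed are made up for this example, & n = 25 is chosen only to keep the uncached run short.

```python
import functools
import time

def fib_plain(n):
    # No caching: the same sub-problems are recomputed many times
    if n <= 1:
        return n
    return fib_plain(n - 1) + fib_plain(n - 2)

@functools.lru_cache(maxsize=128)
def fib_cached(n):
    # Each distinct n is computed once, then served from the cache
    if n <= 1:
        return n
    return fib_cached(n - 1) + fib_cached(n - 2)

def timed(func, n):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = func(n)
    return result, time.perf_counter() - start

n = 25
plain_result, plain_time = timed(fib_plain, n)
cached_result, cached_time = timed(fib_cached, n)

print(f"Without lru_cache: {plain_result} in {plain_time:.6f} seconds")
print(f"With lru_cache:    {cached_result} in {cached_time:.6f} seconds")
```

The exact times will differ from machine to machine; what matters is the gap between the two lines.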

Faster I/O Operations 🤩

When reading or writing large files, use buffered I/O for faster performance. Buffered I/O is generally faster because it reads data in larger chunks from the file, reducing the number of system calls and improving overall efficiency.

Usage

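We will read a large text file (12,016,177 words, i.e. about 12 million) using 2 functions: one reads in 4096-byte chunks & the other reads a single byte per call. Note that open() is buffered by default in binary mode, so the second function is not truly unbuffered (that would need buffering=0); what the comparison measures is the cost of many tiny reads versus fewer large ones.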

Comparison — Buffered vs. Non-Buffered Read

The code that will be used for testing & the time comparison —

import time

def read_large_file_buffered(filename):
    buffer_size = 4096  # Choose a buffer size (adjust if necessary)
    with open(filename, 'rb', buffering=buffer_size) as file:
        while True:
            data = file.read(buffer_size)
            if not data:
                break

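def read_large_file_unbuffered(filename):
    # Note: open() still uses Python's default buffer here; pass buffering=0
    # for truly unbuffered reads. The slowdown comes from reading 1 byte per call.
    with open(filename, 'rb') as file:
        while True:
            data = file.read(1)
            if not data:
                break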

if __name__ == "__main__":
    filename = "text.txt"
    
    # Measure time for buffered I/O
    start_time = time.time()
    read_large_file_buffered(filename)
    end_time = time.time()
    buffered_time = end_time - start_time

    # Measure time for unbuffered I/O
    start_time = time.time()
    read_large_file_unbuffered(filename)
    end_time = time.time()
    unbuffered_time = end_time - start_time

    print(f"Time taken with buffered I/O: {buffered_time:.5f} seconds")
    print(f"Time taken with unbuffered I/O: {unbuffered_time:.5f} seconds")

Time taken with buffered I/O: 0.04972 seconds
Time taken with unbuffered I/O: 3.24232 seconds

A significant time difference! 🏋🏻‍♂️
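
If you want to see the effect of the buffer itself, you can read byte by byte with & without it. The sketch below is an illustration under my own assumptions: it creates a small temporary file instead of the large text file used above, & read_per_byte is a made-up helper name. Passing buffering=0 (allowed only in binary mode) turns the buffer off, so every read(1) goes to the operating system.

```python
import os
import tempfile
import time

def read_per_byte(path, buffering):
    """Read the file one byte per call & return the number of bytes read."""
    count = 0
    with open(path, "rb", buffering=buffering) as f:
        while f.read(1):
            count += 1
    return count

# Small hypothetical test file so the sketch runs quickly
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 200_000)
    path = tmp.name

results = {}
try:
    # 0 = no buffer at all, -1 = Python's default buffer size
    for label, buffering in (("buffering=0", 0), ("default buffer", -1)):
        start = time.perf_counter()
        results[label] = read_per_byte(path, buffering)
        elapsed = time.perf_counter() - start
        print(f"{label}: {results[label]} bytes in {elapsed:.5f} seconds")
finally:
    os.remove(path)
```

Both runs read the same bytes; only the number of trips to the operating system changes.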

Faster JSON Serialization 💨

When dealing with JSON data, consider using ujson or orjson instead of the standard json module. These libraries are implemented in C (ujson) & Rust (orjson) & are often faster at serialization & deserialization, though the gain depends on the size & shape of your data.

Usage

Install ‘ujson’ using pip install ujson

The JSON file I used contained ~3500 keys, which I would say is moderately big. The original file, the biggest JSON I have had to work with at my job, had about half as many keys.

Comparison — ujson vs json

The code that will be used for testing & the time comparison —

import json
import ujson
import time

def load_json_with_json_module(filename):
    with open(filename, 'r') as file:
        data = json.load(file)

def load_json_with_ujson_module(filename):
    with open(filename, 'r') as file:
        data = ujson.load(file)

if __name__ == "__main__":
    filename = "data.json"  # Replace with your JSON file's path

    # Measure time for json module
    start_time = time.time()
    load_json_with_json_module(filename)
    end_time = time.time()
    json_time = end_time - start_time

    # Measure time for ujson module
    start_time = time.time()
    load_json_with_ujson_module(filename)
    end_time = time.time()
    ujson_time = end_time - start_time

    print(f"Time taken with json module: {json_time:.5f} seconds")
    print(f"Time taken with ujson module: {ujson_time:.5f} seconds")

Time taken with json module: 0.07693 seconds
Time taken with ujson module: 0.05958 seconds

The difference is not as big as we saw in the points above but you can imagine how it will scale with bigger JSON files. 😅
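
Since ujson is a third-party package, one common pattern is to use it when it is installed & fall back to the standard json module otherwise. This is a minimal sketch of that idea; the names fast_json & record are made up for the example, & it relies only on dumps & loads, which both modules provide.

```python
import json as std_json

try:
    import ujson as fast_json  # third-party; assumed installed with `pip install ujson`
except ImportError:
    fast_json = std_json       # fall back to the standard library

record = {"id": 7, "tags": ["a", "b"], "nested": {"ok": True, "score": 1.5}}

# Serialize & parse with whichever module was picked
text = fast_json.dumps(record)
restored = fast_json.loads(text)

print(f"Using {fast_json.__name__}: round trip ok = {restored == record}")
```

This keeps the rest of your code unchanged whether or not the faster library is available.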

Line Profiler 📈

Line Profiler is a Python package that allows you to profile and analyse the time it takes to execute different lines of your code. Using this tool, you can check which lines in your code are taking the longest to run & also how many times they were called. Another neat thing it tells you is the % of the total time each line took to execute.

Usage

Install the module using pip install line-profiler

Code that will be profiled using line-profiler —

import line_profiler

def myFunction():
    test = []
    for _ in range(10000):
        test.append("*")
    
    print("Done!")
    for index, item in enumerate(test[:10]):
        print(str(index) + item)

def main():
    lp = line_profiler.LineProfiler()
    lp.add_function(myFunction)
    
    lp_wrapper = lp(myFunction)
    lp_wrapper()
    
    lp.print_stats()

# profile_slow_function.py
if __name__ == "__main__":
    main()

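The output shows a line-by-line breakdown of the function: how many times each line ran (Hits), the total & per-hit time in the profiler's timer units (line-profiler prints the unit at the top of its full output) & each line's share of the total time (% Time):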

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     3                                           def myFunction():
     4         1       1000.0   1000.0      0.0      test = []
     5     10000    1309000.0    130.9     44.3      for _ in range(10000):
     6     10000    1486000.0    148.6     50.2          test.append("*")
     7                                               
     8         1      44000.0  44000.0      1.5      print("Done!")
     9        10       8000.0    800.0      0.3      for index, item in enumerate(test[:10]):
    10        10     110000.0  11000.0      3.7          print(str(index) + item)

Bisect Module ✂️

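We can use the ‘bisect’ module to perform binary searches on sorted lists & to insert items while keeping the list sorted. Finding the position takes O(log n) comparisons instead of a linear scan; the insertion itself is still O(n), because the list has to shift the elements after that position.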

Usage

The bisect module is included with Python so we can directly import & start using it.
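
Before the timing test, here is a small sketch of the two calls you will use most. The scores list & the contains helper are made up for this example; bisect_left & insort are the module's own functions.

```python
import bisect

scores = [10, 20, 30, 40]

# bisect_left returns the index where the value would go to keep the list sorted
pos = bisect.bisect_left(scores, 25)

# insort finds that position & inserts the value in one call
bisect.insort(scores, 25)
print(pos, scores)  # 2 [10, 20, 25, 30, 40]

def contains(sorted_list, value):
    """Membership test on a sorted list using binary search."""
    i = bisect.bisect_left(sorted_list, value)
    return i < len(sorted_list) and sorted_list[i] == value

print(contains(scores, 30), contains(scores, 35))  # True False
```

Both functions assume the list is already sorted; on an unsorted list they return a position without raising an error, but the result is meaningless.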

Comparison — Insertion with Bisect vs. Manual Insertion

Here is the code & the time comparison —

import bisect
import time

def manual_insert(sorted_list, value):
    for i, item in enumerate(sorted_list):
        if item >= value:
            sorted_list.insert(i, value)
            return
    sorted_list.append(value)

def compare_bisect_vs_manual():
    # Sorted list to work with
    sorted_list = list(range(1, 100000000))

    # Element to insert
    value = 47488593

    # Using bisect
    start_time_bisect = time.time()
    bisect.insort_left(sorted_list, value)
    end_time_bisect = time.time()

    # print("Sorted list using bisect:", sorted_list)
    print("Time taken with bisect:", round(end_time_bisect - start_time_bisect,5), "seconds")

    # Reset the sorted list
    sorted_list = list(range(1, 100000000))

    # Using manual_insert
    start_time_manual = time.time()
    manual_insert(sorted_list, value)
    end_time_manual = time.time()

    # print("Sorted list using manual_insert:", sorted_list)
    print("Time taken with manual_insert:", round(end_time_manual - start_time_manual,5), "seconds")

if __name__ == "__main__":
    compare_bisect_vs_manual()

Time taken with bisect: 0.10074 seconds
Time taken with manual_insert: 1.74581 seconds

Another significant time difference!

That is all 🙌

This list is not exhaustive & 6 might be a small number of techniques to compose an article around, but the other tips & techniques I came across were very generic & there were already 100s of articles on them.

Try these techniques yourself & let me know if they helped. If you are stuck somewhere or confused about anything, you can post your queries in the comments & I will try to help you as much as I can.

Thanks for reading! 🖤 Clap, Comment & Share maybe? (you can drop 50 claps…)
