Tomas Svojanovsky

Speed up Your Blocking I/O Code with Multithreading

A large part of our work is often maintaining existing code that relies on blocking I/O libraries, such as requests for HTTP calls, psycopg for Postgres databases, or any number of others.

The interesting thing about blocking I/O operations in Python is that they release the GIL while they wait. This makes it possible to run I/O operations concurrently in separate threads. With threading, an application can keep several I/O operations in flight at the same time, which improves performance and responsiveness.

You say we can do multithreading?

In CPython, the standard Python interpreter, only one thread can execute Python bytecode at any given time within a process, even if multiple threads are active. This constraint is enforced by the Global Interpreter Lock (GIL).

While this arrangement may appear restrictive for maximizing the benefits of multithreading, there are specific scenarios in which the GIL is temporarily released. One such scenario occurs during I/O (input/output) operations. Python relinquishes the GIL during I/O operations because, at a low level, the language relies on operating system calls to carry out these tasks. These system calls operate independently of the Python interpreter, meaning that Python bytecode does not need to be executed while waiting for I/O operations to finish. Consequently, during these periods, other threads can continue executing Python bytecode concurrently, facilitating simultaneous I/O operations despite the presence of the GIL.
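To make this concrete, here is a minimal sketch. It uses time.sleep as a stand-in for a blocking I/O call, since it also releases the GIL while waiting. The names fake_io_call and run_concurrently and the 0.2 second delay are made up for this illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_call(delay: float) -> float:
    # time.sleep releases the GIL while waiting, much like a blocking read
    time.sleep(delay)
    return delay


def run_concurrently(delays: list[float]) -> float:
    """Run every fake I/O call in its own thread and return the elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        # list() forces all results to be collected before we stop the clock
        list(pool.map(fake_io_call, delays))
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = run_concurrently([0.2, 0.2, 0.2, 0.2])
    print(f"4 calls of 0.2s each took about {elapsed:.2f}s in total")
```

Because the threads wait at the same time, the total should land close to the longest single delay rather than the sum of all four.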

Sockets…

Recently I was talking about sockets. Imagine you have a project where non-blocking sockets are not an option. What now?

In that situation there is still a viable solution. Since the recv and sendall methods of sockets are I/O-bound and therefore release the Global Interpreter Lock (GIL), it's possible to execute them concurrently in separate threads.

This approach involves creating one thread per connected client, where each thread is responsible for handling the read and write operations of its associated client. This model, known as the thread-per-connection model, is commonly used in web servers like Apache.

from threading import Thread
import socket

# Function to handle receiving and echoing data from a client
def receiver(client_socket: socket.socket):
    while True:
        # Receive data from the client
        data = client_socket.recv(2048)
        # An empty result means the client closed the connection
        if not data:
            break
        # Print the received data
        print(f"Data received: {data}")
        # Echo the received data back to the client
        client_socket.sendall(data)

# Create a TCP/IP socket
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
    # Allow the socket to be reused immediately after it is closed
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Bind the socket to the address and port
    server.bind(("127.0.0.1", 8000))
    # Listen for incoming connections
    server.listen()

    while True:
        # Accept a new connection
        connection, _ = server.accept()
        # Create a new thread for each client connection
        thread = Thread(target=receiver, args=(connection,))
        # Start the thread
        thread.start()

This solves the problem of multiple clients being unable to connect at the same time with blocking sockets, although the approach has some issues unique to threads. What happens if we try to kill this process with CTRL-C while we have clients connected?

Shutting down the application is not as straightforward as it may seem. If you press CTRL-C, you'll likely see a KeyboardInterrupt exception raised from the server.accept() call. Despite this exception, the application won't exit, because the background threads keep the program alive. As a result, any connected clients will still be able to send and receive messages.

One important thing to note is that user-created threads in Python do not receive KeyboardInterrupt exceptions; only the main thread is interrupted by such exceptions. Consequently, our threads will continue to run uninterrupted, reading data from clients and preventing the application from exiting gracefully.

Solution?

There are a couple of ways to handle thread termination in Python. One is to use what are known as daemon threads; the other is to implement your own mechanism for canceling or interrupting running threads.

Daemon threads serve as a specialized type of thread designed for long-running background tasks. Unlike regular threads, daemon threads do not prevent the application from shutting down. In fact, when only daemon threads are active, the application will automatically terminate. It’s worth noting that Python’s main thread is not a daemon thread by default.

Creating daemon threads is straightforward. Developers simply need to set the daemon attribute to True before starting the thread using the start() method. However, one significant drawback of this approach is the lack of control over thread termination. Since daemon threads terminate abruptly when the main program exits, there's no opportunity to execute any cleanup or shutdown logic.
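As a small sketch of the above, the snippet below creates one regular thread and one daemon thread. The background_task function is a hypothetical placeholder for any long-running job.

```python
import threading
import time


def background_task() -> None:
    # Hypothetical long-running job; it never returns on its own
    while True:
        time.sleep(0.1)


# Threads are non-daemon by default, so once started they keep the process alive
regular = threading.Thread(target=background_task)
print(regular.daemon)  # False

# Passing daemon=True (or setting thread.daemon = True before start())
# marks the thread as a daemon
worker = threading.Thread(target=background_task, daemon=True)
worker.start()
print(worker.daemon)  # True

# The main thread ends here; the daemon thread is stopped along with it,
# without getting a chance to run any cleanup code
```

In the echo server, this would amount to creating each client thread with Thread(target=receiver, args=(connection,), daemon=True).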

While daemon threads offer simplicity and ease of implementation, they may not be suitable for scenarios where precise cleanup or resource management is required. In such cases, developers may opt to design their own thread termination mechanism, allowing for more fine-grained control over the shutdown process.

The second approach

For the second approach, we'll create threads slightly differently than before, by subclassing the Thread class itself. This lets us define our own thread with a close method, inside of which we shut down the client socket. Any blocked calls to recv and sendall are then interrupted, allowing us to exit the while loop and finish the thread.

from threading import Thread
import socket

class ClientEchoThread(Thread):
    def __init__(self, client):
        super().__init__()
        self.client = client

    def run(self):
        try:
            # Continuously receive data from the client
            while True:
                data = self.client.recv(2048)
                # If no data is received, the connection is closed
                if not data:
                    raise BrokenPipeError("Connection closed!")
                # Print the received data
                print(f"Data received {data}!")
                # Send the received data back to the client
                self.client.sendall(data)

        except OSError as e:
            # Handle any OSError (e.g., connection reset by peer)
            print(f"Thread interrupted by {e} exception, shutting down!")

    def close(self):
        # Check if the thread is still running
        if self.is_alive():
            # Send a message to the client indicating the shutdown
            self.client.sendall(bytes("Shutting down!", encoding="utf-8"))
            # Shutdown the client connection for both reads and writes
            self.client.shutdown(socket.SHUT_RDWR)

# Create a TCP/IP socket
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
    # Allow the socket to be reused immediately after it is closed
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Bind the socket to the address and port
    server.bind(("127.0.0.1", 8000))
    # Listen for incoming connections
    server.listen()
    # List to store connection threads
    connection_threads = []

    try:
        while True:
            # Accept a new connection
            connection, addr = server.accept()
            # Create a new thread for each client connection
            thread = ClientEchoThread(connection)
            # Add the thread to the list
            connection_threads.append(thread)
            # Start the thread
            thread.start()
    except KeyboardInterrupt:
        # Handle KeyboardInterrupt (Ctrl+C)
        print("Shutting down!")
        # Close all connection threads
        for thread in connection_threads:
            thread.close()
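
The technique above relies on one behavior: shutting down a socket from another thread wakes up a recv call that is blocked on it. Here is a minimal, server-free sketch of just that behavior, assuming a Linux-like platform. It uses socket.socketpair() to get two connected sockets; the reader function and the results list are made up for the demo.

```python
import socket
import threading
import time

results = []


def reader(sock: socket.socket) -> None:
    # Blocks here until data arrives or the socket is shut down
    data = sock.recv(2048)
    results.append(data)


# socketpair() gives two already-connected sockets, so no server is needed
ours, theirs = socket.socketpair()

# daemon=True only so that this demo can never hang on exit
thread = threading.Thread(target=reader, args=(ours,), daemon=True)
thread.start()

time.sleep(0.2)  # give the thread time to block in recv()

# Shutting the socket down from the main thread wakes up the blocked recv()
ours.shutdown(socket.SHUT_RDWR)
thread.join(timeout=2)

print(f"Thread still running: {thread.is_alive()}")
print(f"recv() returned: {results}")

ours.close()
theirs.close()
```

On Linux, recv should return an empty bytes object here, which is the same condition the run method treats as a closed connection.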

Overall, canceling running threads in Python, and in general, is a tricky problem and depends on the specific shutdown case you’re trying to handle. You’ll need to take special care that your threads do not block your application from exiting and to figure out where to put in appropriate interrupt points to exit your threads.
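One way to picture an interrupt point is a flag that the thread checks between blocking calls. The sketch below is an alternative to the socket-shutdown technique, not part of the server above. It assumes a threading.Event called stop_requested and a 0.1 second socket timeout so that recv never blocks for long; both names and values are hypothetical.

```python
import socket
import threading

stop_requested = threading.Event()  # hypothetical shutdown flag


def echo_until_stopped(client: socket.socket) -> None:
    # A short timeout turns recv() into a periodic interrupt point
    client.settimeout(0.1)
    with client:
        while not stop_requested.is_set():
            try:
                data = client.recv(2048)
            except socket.timeout:
                continue  # nothing to read yet; re-check the flag
            if not data:
                break  # the peer closed the connection
            client.sendall(data)
    # Leaving the with block closes the socket; cleanup logic would go here


# Two connected sockets stand in for a real client connection
server_side, client_side = socket.socketpair()
thread = threading.Thread(target=echo_until_stopped, args=(server_side,))
thread.start()

client_side.sendall(b"ping")
reply = client_side.recv(2048)
print(f"Echoed back: {reply}")

stop_requested.set()    # ask the thread to finish
thread.join(timeout=2)  # it should notice within about one timeout period
print(f"Thread still running: {thread.is_alive()}")
client_side.close()
```

The trade-off is latency: the thread only notices the flag when a timeout expires, so shutdown takes up to one timeout period, but the thread gets to run its own cleanup.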
