Summary

The web content provides an introduction to Redis Queue (RQ), a Python library for managing background tasks, and outlines its setup, components, and usage for processing long-running jobs efficiently.

Abstract

The article "A quick introduction to Redis Queue (RQ) to process long-running jobs" discusses the use of RQ, a Python library built on Redis, for handling background tasks in applications. It explains the need for a queue system to manage tasks like image processing and bulk email sending without blocking the main application process. The author highlights RQ's simplicity and intuitive API, making it a suitable alternative to more complex solutions like Celery. The main components of RQ, including Queue, Worker, and Job, are detailed, along with instructions on setting up Redis, installing RQ, and creating a job function. The article also demonstrates how to initialize a queue, enqueue a job, start a worker, and retrieve job status and results. The conclusion emphasizes RQ's utility for long-running tasks and hints at a future article on integrating RQ with a dockerized web app.

Opinions

The author suggests that RQ is a more straightforward and user-friendly option compared to other background task libraries like Celery.
RQ is praised for its ability to separate concerns by managing background tasks independently of the main application.
The use of a Redis server is recommended for its fast and reliable storage capabilities, which are essential for a messaging queue system like RQ.
The author expresses a preference for using Docker to run Redis, indicating ease of setup and reliability.
The article implies that using process managers like Supervisor can enhance the management of multiple RQ worker processes.
The author values the ability to create multiple queues with different priorities, suggesting that it is a useful feature for organizing tasks based on their importance or type.

A quick introduction to Redis Queue (RQ) to process long-running jobs

When building an application, there are times when you need to submit a long-running job to the backend and wait for the results before they are returned to the front end. Examples include image processing, video uploads, and sending bulk emails. From the client's perspective, this is frustrating: you stare at a spinner while the request runs, and in the end the request may simply time out.

There are multiple ways of approaching this problem, including my previous post on using GCP Cloud Tasks to manage and schedule jobs. Celery is another popular library that can be used to manage background tasks in Python applications. If you are looking for a simpler and more intuitive API, Redis Queue (RQ) is a good choice.

Redis Queue (RQ)

RQ is built on top of Redis, a popular in-memory data store that provides fast, reliable storage for your data. RQ provides a messaging queue that allows you to store and retrieve messages in a timely manner, making it ideal for processing background jobs. Using a separate queue to store and manage background tasks also separates concerns: the main application on one side, and the functions that process the background jobs on the other.

In the remainder of this article, I will show a quick and simple example of how to enqueue a job using RQ, process the job with a worker, and retrieve the status of the job.

Main components of RQ

Queue — A system that is used to accept jobs or Python functions to be invoked asynchronously by the workers
Worker — A worker is a Python process that typically runs in the background and exists solely as a workhorse to perform lengthy or blocking tasks that you don’t want to perform inside web processes.
Jobs — A job is a Python object, representing a function that is invoked asynchronously in a worker (background) process.
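
To make the three roles concrete, here is a toy, in-process analogy built only on the standard library. It is not how RQ is implemented: in RQ the queue lives in Redis and the worker is a separate process, while here a thread stands in for the worker. The names ToyJob, toy_worker and add are made up for this sketch.

```python
import queue
import threading


def add(a, b):
    return a + b


class ToyJob:
    """A job: a function plus its arguments, and a slot for the result."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args
        self.result = None
        self.done = threading.Event()


def toy_worker(job_queue):
    """A worker: take jobs off the queue one at a time and run them."""
    while True:
        job = job_queue.get()
        if job is None:  # sentinel value: stop the worker
            break
        job.result = job.func(*job.args)
        job.done.set()


job_queue = queue.Queue()  # the queue
worker = threading.Thread(target=toy_worker, args=(job_queue,), daemon=True)
worker.start()

job = ToyJob(add, 2, 3)
job_queue.put(job)        # "enqueue": returns immediately
job.done.wait(timeout=5)  # the caller is free to do other work before this
print(job.result)  # 5

job_queue.put(None)       # ask the worker to stop
worker.join(timeout=5)
```

The point of the analogy is only the division of labour: the caller hands work to the queue and moves on, and the worker picks it up independently.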

Prerequisites

To get started with RQ, you need to have a Redis server (≥3.0.0) running. You can either install Redis locally or use a cloud-based Redis instance. If you prefer to use Docker like me, you can run the following command:

docker run -d -p 6379:6379 redis

This pulls the latest Redis Docker image and runs Redis in detached mode (-d), publishing the default Redis port (6379) on the host.

Once you have Redis running, you can install the RQ library using pip:

pip install rq
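
Before going further, it can help to confirm that something is actually listening on the Redis port. The sketch below uses only the standard library and checks TCP reachability, nothing more; it does not prove that the listener is Redis. The function name redis_port_is_open is an assumption of this example. With the redis package installed, calling Redis().ping() is a more direct check.

```python
import socket


def redis_port_is_open(host="localhost", port=6379, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable
        return False


print("Redis port reachable:", redis_port_is_open())
```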

Create a job/task function

This can be any Python function that you would like to execute in the background. In this example, we use time.sleep to simulate a long-running job and use the random module to create arbitrary wait times and results.

# jobs.py
import time
import random

def long_running_jobs():
    jobs_running_time = random.randint(5, 10)
    time.sleep(jobs_running_time)
    return f'Jobs finished. Total run time: {jobs_running_time}'
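
Real jobs usually take arguments. As a hedged variation on the function above, the sketch below makes the wait range configurable; the name long_running_job and its two parameters are assumptions for illustration. RQ passes positional arguments given to enqueue on to the function, so something like q.enqueue(long_running_job, 1, 3) should work with it.

```python
import random
import time


def long_running_job(min_seconds=5, max_seconds=10):
    """Sleep for a random whole number of seconds in [min_seconds, max_seconds]."""
    run_time = random.randint(min_seconds, max_seconds)
    time.sleep(run_time)
    return f'Jobs finished. Total run time: {run_time}'


# Calling the function directly runs it synchronously,
# which is a handy sanity check before involving a queue.
print(long_running_job(0, 1))
```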

Initializing a queue

# main.py
from rq import Queue
from redis import Redis

redis_conn = Redis()
q = Queue(connection=redis_conn)

This uses the default Redis configuration (localhost, port 6379) to connect to the Redis server and creates a default queue. We can also create multiple named queues, which is particularly useful when you want to separate jobs by type or priority, for example:

high_priority_queue = Queue('high', connection=redis_conn)
low_priority_queue = Queue('low', connection=redis_conn)
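
Once there are several queues, the application needs some rule for choosing one. A minimal sketch, assuming the queues are kept in a dict keyed by a priority label; the helper name enqueue_by_priority is hypothetical, and the function only relies on each queue having an enqueue method, as RQ's Queue does.

```python
def enqueue_by_priority(queues, priority, func, *args, **kwargs):
    """Enqueue func on the queue registered under `priority`.

    `queues` maps a priority label to an object with an `enqueue` method,
    such as the rq Queue instances created above.
    """
    try:
        target = queues[priority]
    except KeyError:
        raise ValueError(f"unknown priority: {priority!r}") from None
    return target.enqueue(func, *args, **kwargs)


# Example wiring, using the queues defined above:
# queues = {'high': high_priority_queue, 'low': low_priority_queue}
# job = enqueue_by_priority(queues, 'high', some_job_function)
```

Rejecting unknown labels up front avoids silently creating a queue that no worker is listening to.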

Enqueue job in the queue

from jobs import long_running_jobs

job = q.enqueue(long_running_jobs)  # enqueue a long running job into the default queue
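
enqueue also accepts a few keyword options that RQ consumes itself instead of passing to the job. The sketch below wraps two of them, job_timeout and result_ttl; the wrapper name enqueue_with_limits and its default values are assumptions chosen for illustration, not RQ defaults.

```python
def enqueue_with_limits(queue, func, *args, timeout_seconds=60, keep_result_seconds=300):
    """Enqueue func(*args) with a run-time limit and a result lifetime."""
    return queue.enqueue(
        func,
        *args,
        job_timeout=timeout_seconds,     # job is interrupted and marked failed after this long
        result_ttl=keep_result_seconds,  # how long the finished job and its result are kept
    )


# With the queue and job function from above:
# job = enqueue_with_limits(q, long_running_jobs, timeout_seconds=30)
```

The result lifetime matters when results are read later: once it expires, the job and its return value are no longer available in Redis.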

Start a worker

You can start a worker from the CLI at the root of the project (so that the worker can import jobs.py), or from a Python worker script. From the CLI, for example:

rq worker

You can also pass one or more queue names to the command so that the worker reads jobs from several queues. For example, to listen to the high and default queues:

rq worker high default
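
The Python worker script mentioned above could look roughly like the sketch below, which uses RQ's Worker class. The function name run_worker is an assumption, and the imports sit inside the function only so that the snippet can be loaded without rq installed; in a real script they would normally go at the top of the file.

```python
def run_worker(queue_names=("high", "default")):
    """Start one RQ worker that listens on the given queues, in order."""
    from redis import Redis
    from rq import Queue, Worker

    conn = Redis()  # default connection: localhost, port 6379
    queues = [Queue(name, connection=conn) for name in queue_names]
    worker = Worker(queues, connection=conn)
    worker.work()  # blocks and processes jobs until the worker is stopped


# run_worker()  # uncomment to start the worker when running this as a script
```

The order of the queue names matters: the worker reads from the queues in the order given, so jobs on high are picked up before jobs on default.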

Each worker processes a single job at a time. If you want to run jobs concurrently, you can simply start more workers from the CLI. This is where a process manager like Supervisor is very useful: it manages multiple worker processes and restarts them automatically if they crash.
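
For local experiments, several workers can also be launched from Python with subprocess. This is a rough sketch, not a replacement for a process manager: nothing here restarts a worker that dies. The helper names worker_command and start_workers are hypothetical, and the code assumes the rq command is on the PATH.

```python
import subprocess


def worker_command(queue_names):
    """Build the same command line as `rq worker high default`."""
    return ["rq", "worker", *queue_names]


def start_workers(count, queue_names=("high", "default")):
    """Start `count` worker processes and return their Popen handles."""
    command = worker_command(queue_names)
    return [subprocess.Popen(command) for _ in range(count)]


# workers = start_workers(3)   # three jobs can now run at the same time
# for w in workers:
#     w.terminate()            # stop the workers when done
```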

Get status and results of the job

Once a worker is started, it reads from the queue and starts processing jobs. You can check the job's status via .get_status(), or check whether the job has finished via the .is_finished property of the Python job object.

import time

# Poll once per second until the worker marks the job as finished
while not job.is_finished:
    print('Job not finished yet, wait for 1s')
    time.sleep(1)

print(job.result)

# Output:
# Job not finished yet, wait for 1s
# Job not finished yet, wait for 1s
# Job not finished yet, wait for 1s
# Job not finished yet, wait for 1s
# Job not finished yet, wait for 1s
# Job not finished yet, wait for 1s
# Jobs finished. Total run time: 6
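
One caveat with the loop above: if the job fails, is_finished never becomes true and the loop runs forever. A slightly more defensive sketch is shown below. The helper name wait_for_job and its timeout values are assumptions; it relies only on the is_finished, is_failed, get_status() and result members of the job object.

```python
import time


def wait_for_job(job, timeout=30.0, poll_interval=1.0):
    """Poll `job` until it finishes; raise if it fails or takes too long."""
    deadline = time.monotonic() + timeout
    while True:
        if job.is_finished:
            return job.result
        if job.is_failed:
            raise RuntimeError(f"job failed with status {job.get_status()!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still not finished after {timeout} seconds")
        time.sleep(poll_interval)


# With the job enqueued earlier:
# print(wait_for_job(job, timeout=20))
```

In a web application this polling would normally happen on the client side, via repeated status requests, so that no server request blocks while waiting.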

Conclusion

This article is meant to give a very high-level overview of Python RQ, which is a handy library for long-running processing tasks in your Python applications. It provides a simple and efficient way to queue and manage jobs and allows you to run tasks in the background while your application continues to work.

I will post a follow-up article on how we could take this concept into a dockerized web app where we can submit and get the status of the job via an API call. Thanks for reading!