Derek Mortensen

Summary

This article, part of a series on graphical exploration of concurrency in Python, compares the performance of sequential, multiprocessing, threading, and async methods in retrieving 1334 Jira issues from the Jira API in 27 separate requests.

Abstract

The article discusses the use of the altair plotting library to generate charts showing the relative start and end times of each individual request in retrieving 1334 Jira issues from the Jira API. The code for generating these charts is provided, along with charts showing the total duration of each request and the sum of the duration of all 27 requests. The article emphasizes the importance of being able to wait in parallel, which results in significant time savings compared to sequential requests. The code used in the article is available on GitHub, and references to other resources on the topic are provided.

Bullet points

  • The article is part of a series on graphical exploration of concurrency in Python.
  • The example used is retrieving 1334 Jira issues in batches of 50 from the Jira API.
  • The altair plotting library was used to generate charts showing the relative start and end times of each individual request.
  • The total duration of each request is roughly constant across all examples; the speedup comes from waiting on requests in parallel.
  • The code used in the article is available on GitHub.
  • References to other resources on the topic are provided.

Graphical exploration of concurrency in Python

Part 5: API request timing comparison — Sequential, Multiprocessing, Threading, and Async

A series exploring concurrency in Python to speed up data retrieval from remote APIs.

This article is part of a series: Graphical exploration of concurrency in Python. Part 1: Defining and timing an API function in Python.

In this series, we’ve used the example of retrieving 1334 Jira issues in batches of 50 from the Jira API to illustrate 4 different ways of coordinating the 27 separate network requests required.
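The count of 27 follows from the batch size: 1334 issues at 50 per request is 26 full pages plus a final page of 34. Here is a minimal sketch of that arithmetic, assuming offset-based paging in the style of Jira's `startAt` and `maxResults` search parameters. The helper name `page_params` is made up for illustration and is not taken from the series code.

```python
import math

TOTAL_ISSUES = 1334
BATCH_SIZE = 50


def page_params(total, batch_size):
    """Return one dict of paging parameters per request needed to cover `total` items."""
    n_requests = math.ceil(total / batch_size)
    return [
        {"startAt": i * batch_size, "maxResults": batch_size}
        for i in range(n_requests)
    ]


pages = page_params(TOTAL_ISSUES, BATCH_SIZE)
print(len(pages))        # 27
print(pages[-1])         # {'startAt': 1300, 'maxResults': 50}

# The last page asks for up to 50 issues but only 34 remain.
last_page_size = TOTAL_ISSUES - pages[-1]["startAt"]
print(last_page_size)    # 34
```

Each of the four approaches in the series issues this same set of 27 requests; only the way they are scheduled differs.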

The altair plotting library was used throughout to generate charts showing the relative start and end times of each individual request. The same code also generates a chart of the relative time required to retrieve all 1334 issues in 27 requests after running each of the examples; a link to the full code is available at the bottom of this article.
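The author's charting code lives in the gist linked at the bottom and is not reproduced here. As a rough sketch of the idea only, the snippet below shifts absolute timestamps so the first request starts at zero, then draws one horizontal bar per request with altair. The record layout, the function names `to_relative` and `timing_chart`, and the sample timestamps are all assumptions for illustration.

```python
def to_relative(records):
    """Shift absolute (start, end) timestamps so the earliest start is 0."""
    t0 = min(r["start"] for r in records)
    return [
        {"request": r["request"], "start": r["start"] - t0, "end": r["end"] - t0}
        for r in records
    ]


def timing_chart(rows, title):
    """Draw one horizontal bar per request spanning its start and end time."""
    import altair as alt  # imported lazily; only needed when drawing

    return (
        alt.Chart(alt.Data(values=rows), title=title)
        .mark_bar()
        .encode(
            x=alt.X("start:Q", title="seconds since first request"),
            x2="end",
            y=alt.Y("request:O", title="request number"),
        )
    )


# Placeholder timestamps for illustration only, not measurements from the article.
records = [
    {"request": 1, "start": 100.0, "end": 100.5},
    {"request": 2, "start": 100.25, "end": 100.75},
    {"request": 3, "start": 100.5, "end": 101.0},
]
rows = to_relative(records)

# Wall time is the span from the first start to the last end.
wall_time = max(r["end"] for r in rows)

# chart = timing_chart(rows, "Threading")  # requires altair to be installed
```

Overlapping bars indicate requests that were in flight at the same time; in a sequential run the bars would form a staircase with no overlap.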

This is the “wall time”: how long it took to retrieve 1334 Jira issues in 27 separate requests. Note the large gain from sequential requests to any form of concurrency, followed by more moderate gains as we move up the layers of abstraction.

The important thing to remember is that the total duration of each request is relatively constant across all these examples. The difference is that we were able to wait in parallel.

This is the sum of the durations of all 27 requests, which is not the same as the time that elapsed on a wall clock.
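The distinction between wall time and summed request duration can be reproduced without a real API. The sketch below is a simulation under stated assumptions: `fake_request` stands in for a network call by sleeping for a fixed, made-up latency, and a thread pool stands in for any of the concurrent approaches in the series.

```python
import time
from concurrent.futures import ThreadPoolExecutor

N_REQUESTS = 27
LATENCY = 0.02  # simulated seconds spent waiting on the network (assumed value)


def fake_request(_):
    """Stand-in for one API call: sleep, then report how long the call took."""
    start = time.perf_counter()
    time.sleep(LATENCY)
    return time.perf_counter() - start


def run(map_fn):
    """Return (wall time, sum of per-request durations) for one strategy."""
    t0 = time.perf_counter()
    durations = list(map_fn(fake_request, range(N_REQUESTS)))
    return time.perf_counter() - t0, sum(durations)


# Sequential: the built-in map runs one request after another.
seq_wall, seq_sum = run(map)

# Threaded: every request gets its own worker, so the waits overlap.
with ThreadPoolExecutor(max_workers=N_REQUESTS) as pool:
    thr_wall, thr_sum = run(pool.map)

print(f"sequential: wall={seq_wall:.2f}s  sum of durations={seq_sum:.2f}s")
print(f"threaded:   wall={thr_wall:.2f}s  sum of durations={thr_sum:.2f}s")
```

In both runs the summed duration stays close to 27 times the simulated latency, while the threaded wall time drops to roughly the length of a single request because the waiting overlaps.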

Code Available Here

https://gist.github.com/dmort-ca/eee316bac7cf675accc370971623e262

References

- https://www.nagekar.com/2020/05/python-io-benchmarks.html
- https://www.nagekar.com/2016/02/callbacks-eventloop-in-javascript-explained.html
- https://blog.jupyter.org/ipython-7-0-async-repl-a35ce050f7f7
- https://stackoverflow.com/questions/49466794/why-does-concurrent-futures-executor-map-throw-error-when-using-with-futures-as
- https://idolstarastronomer.com/two-futures.html
- https://www.reddit.com/r/learnpython/comments/72a8ek/why_bother_using_asyncio_when/
- https://www.artificialworlds.net/blog/2019/02/26/python-async-basics-video-100-million-http-requests/
- https://altair-viz.github.io/gallery/diverging_stacked_bar_chart.html
