Arun Suresh Kumar

Mastering API Caching in Python: A Step-by-Step Guide

Effortlessly Improve Your App’s Performance: a 13-step tutorial on caching external APIs in Python using the ‘requests’ and ‘requests_cache’ libraries.

Today I’ll walk you through the benefits of API caching and how to implement it using Python’s requests and requests-cache libraries. This article is meant for readers of all levels, with clear instructions and in-depth coverage to give you a solid foundation in API caching and its best practices.

Introduction

API (Application Programming Interface) requests are essential for communicating with external services and fetching data. However, sending too many requests can slow down your application and potentially exceed API limits. Much of this is avoidable when the same request keeps returning the same static response. This is where caching comes in: a technique to store and reuse API responses, reducing the need for repetitive requests and improving performance. In this beginner-friendly guide, I’ll introduce you to caching external API requests in Python using the ‘requests’ and ‘requests_cache’ libraries.

Understanding Caching

Caching is storing data temporarily so it can be quickly accessed and reused without making additional requests to the API. This helps reduce the number of requests, saves time and improves performance.

Step 1: Let’s set it up

Before getting started, you need to install a few necessary tools and libraries. You should already have Python 3 and pip, the Python package manager, installed. It’s advisable to work inside a virtual environment (for example, with virtualenv). Then, install the requests and requests-cache libraries using the following command:

pip install requests requests-cache

Step 2: Let’s Fetch the data

First, let’s see how to fetch data without caching. You can use the requests library to send API requests in Python:

import requests

url = "https://dummyjson.com/products"
s = requests.Session()
r = s.get(url)
print(r.json())

Step 3: Cache it!

Now, let’s add caching to the same example. You’ll need to import the requests_cache library and set up a cache with a specific duration. Here's how to do it:

import requests
import requests_cache

# Set up a cache that lasts for 5 minutes
requests_cache.install_cache("my_cache", expire_after=300)

# Clear cache just to be safe
requests_cache.clear()

response = requests.get('https://httpbin.org/json')
from_cache = getattr(response, 'from_cache', False)
print(f'Is request cached: {from_cache}')  # False

response = requests.get('https://httpbin.org/json')
from_cache = getattr(response, 'from_cache', False)
print(f'Is request cached: {from_cache}')  # True

When you run this code, the API response will be cached for 5 minutes. Since we make another request within this timeframe, the cached data will be used instead of sending a new request.

requests_cache wraps all the features of requests, so it can be used without importing the requests module explicitly. Let’s use that approach:

from requests_cache import CachedSession

session = CachedSession('my_cache')

# Clear cache just to be safe
session.cache.clear()
print('Cached URLS:')
print('\n'.join(session.cache.urls()))

response = session.get('https://httpbin.org/get')

print('Cached URLS:')
print('\n'.join(session.cache.urls()))

Step 4: Customize the Cache

You can customize your cache settings by modifying the expire_after parameter. For example, you can set it to a longer duration, like 24 hours:

from requests_cache import CachedSession

session = CachedSession('my_cache', expire_after=86400) 
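
If counting seconds by hand feels error-prone, you can compute the duration with datetime.timedelta. The short sketch below only checks the arithmetic. As far as I know, expire_after also accepts a timedelta directly, but treat that as an assumption and confirm it against the documentation for your installed version.

```python
from datetime import timedelta

# 24 hours expressed as seconds, the value passed to expire_after above
one_day_seconds = int(timedelta(hours=24).total_seconds())

# 5 minutes expressed as seconds, the value used in Step 3
five_minutes_seconds = int(timedelta(minutes=5).total_seconds())

print(one_day_seconds)       # 86400
print(five_minutes_seconds)  # 300
```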

Step 5: Clear or bypass the cache

Sometimes, you may need to clear the cache or bypass it for a particular request.

To clear the entire cache, call session.cache.clear(), as in the example above:

from requests_cache import CachedSession

session = CachedSession('my_cache')

# Clear cache just to be safe
session.cache.clear()

To bypass the cache for a specific request, you can use the cache_disabled() context manager:

from requests_cache import CachedSession

session = CachedSession('my_cache', expire_after=86400)

# Clear cache just to be safe
session.cache.clear()
print('Cached URLS:')
print('\n'.join(session.cache.urls()))

with session.cache_disabled():
    response = session.get('https://httpbin.org/get')

print('Cached URLS:')
print('\n'.join(session.cache.urls()))
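
To see why a with block suits this job, here is a toy sketch of a cache that can be switched off temporarily. ToyCache and its disabled() method are hypothetical names for illustration and are not part of requests-cache. The point is only that the cache is skipped inside the block and restored afterwards, even if an error occurs.

```python
from contextlib import contextmanager


class ToyCache:
    """Hypothetical in-memory cache, for illustration only."""

    def __init__(self):
        self.store = {}
        self.enabled = True

    @contextmanager
    def disabled(self):
        """Temporarily skip reading from and writing to the cache."""
        previous = self.enabled
        self.enabled = False
        try:
            yield
        finally:
            self.enabled = previous  # restored even if the block raises

    def get(self, url, fetch):
        """Return the cached value for url, or call fetch(url) on a miss."""
        if self.enabled and url in self.store:
            return self.store[url]
        data = fetch(url)
        if self.enabled:
            self.store[url] = data
        return data


cache = ToyCache()

with cache.disabled():
    cache.get("https://example.com/get", lambda url: "fresh")
print(list(cache.store))  # []

cache.get("https://example.com/get", lambda url: "fresh")
print(list(cache.store))  # ['https://example.com/get']
```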

Step 6: Handle the errors

When working with caching, it’s essential to handle errors, for example when the cache is unavailable or the underlying request fails. You can use a try-except block for this. Note that the exception class used below might not be available in newer versions of requests_cache. In that case, fall back to the native exception classes, such as requests.exceptions.RequestException.

import requests
import requests_cache

requests_cache.install_cache("my_cache", expire_after=300)

url = "https://dummyjson.com/products"

try:
    response = requests.get(url)
    print(response.json())
except requests_cache.exceptions.RequestsCacheError:
    print("Error: Cache unavailable or expired.")

Step 7: Let’s change Cache Storage Backends

By default, requests-cache uses SQLite for storing cache data. However, you can switch to other storage backends like Redis or in-memory storage. Here's an example of using Redis:

Install the redis library:

pip install redis

You will also need a running Redis server; follow the Redis installation instructions for your system. Once it is installed, let’s set Redis as the cache storage in our code:

from requests_cache import CachedSession, RedisCache

backend = RedisCache(host='192.168.1.63', port=6379)  # Redis IP and port
session = CachedSession('my_cache', backend=backend)

# Clear cache just to be safe
session.cache.clear()

print('Cached URLS:')
print('\n'.join(session.cache.urls())) # will not print anything as we cleared cache

response = session.get('https://httpbin.org/get')

print('Cached URLS:')
print('\n'.join(session.cache.urls())) # will print cached URLs

Step 8: Let’s debug

When working with caching, you may want to know whether responses were cached and, if so, which URLs are in the cache. You can use cache.urls() for this:

from requests_cache import CachedSession

session = CachedSession('my_cache')

# Clear cache just to be safe
session.cache.clear()

print('Cached URLS:')
print('\n'.join(session.cache.urls()))  # will not print anything as we cleared cache

response1 = session.get('https://httpbin.org/get')
response2 = session.get('https://httpbin.org/anything')
response3 = session.get('https://httpbin.org/base64/SFRUUEJJTiBpcyBhd2Vzb21l')

print('Cached URLS:')
print('\n'.join(session.cache.urls()))  # will print cached URLs

Step 9: Let’s cache API Requests with Varying Parameters

In some cases, you might want to cache API requests with different parameters. requests-cache handles this automatically. For example, if you have an API that accepts a query parameter, such as https://api.example.com/data?id=1, the cache stores a separate response for each distinct combination of parameter values:

from requests_cache import CachedSession

session = CachedSession('my_cache')

# Clear cache just to be safe
session.cache.clear()

print('Cached URLS:')
print('\n'.join(session.cache.urls()))  # will not print anything as we cleared cache

response1 = session.get('https://httpbin.org/anything', params={"id": 15, "code": "ABC"})
response2 = session.get('https://httpbin.org/anything', params={"id": 191, "code": "$RF"})
response3 = session.get('https://httpbin.org/anything', params={"id": 0, "code": "SDH"})

print('Cached URLS:')
print('\n'.join(session.cache.urls()))  # will print cached URLs

print('\nCached data:')
for response in session.cache.filter():
    print(response)

Output:

Cached URLS:
https://httpbin.org/anything?code=$RF&id=191
https://httpbin.org/anything?code=ABC&id=15
https://httpbin.org/anything?code=SDH&id=0

Cached data:
<CachedResponse [200]: created: 2023-03-24 16:16:56 CDT, expires: N/A (fresh), size: 448 bytes, request: GET https://httpbin.org/anything?code=ABC&id=15>
<CachedResponse [200]: created: 2023-03-24 16:16:57 CDT, expires: N/A (fresh), size: 452 bytes, request: GET https://httpbin.org/anything?code=$RF&id=191>
<CachedResponse [200]: created: 2023-03-24 16:16:57 CDT, expires: N/A (fresh), size: 446 bytes, request: GET https://httpbin.org/anything?code=SDH&id=0>
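
Notice that the cached URLs above list code before id, although the requests passed id first. One plausible reading is that parameters are put into a stable order before the cache key is built, so the same parameters in a different order map to the same entry. The sketch below shows that idea with the standard library only. normalized_url is a hypothetical helper, not the library's actual implementation.

```python
from urllib.parse import urlencode


def normalized_url(base_url, params):
    """Build a URL whose query parameters are sorted by name (hypothetical helper)."""
    query = urlencode(sorted(params.items()))
    return f"{base_url}?{query}"


a = normalized_url("https://httpbin.org/anything", {"id": 15, "code": "ABC"})
b = normalized_url("https://httpbin.org/anything", {"code": "ABC", "id": 15})

print(a)       # https://httpbin.org/anything?code=ABC&id=15
print(a == b)  # True
```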

Step 10: Cache Invalidation

Sometimes, you might want to invalidate (delete) specific items from the cache. You can achieve this using the cache.delete() method:

from requests_cache import CachedSession

session = CachedSession('my_cache')

# Clear cache just to be safe
session.cache.clear()

response1 = session.get('https://httpbin.org/get')
response2 = session.get('https://httpbin.org/anything')
response3 = session.get('https://httpbin.org/base64/SFRUUEJJTiBpcyBhd2Vzb21l')

print('Cached URLS:')
print('\n'.join(session.cache.urls()))  # will print cached URLs

session.cache.delete(urls=['https://httpbin.org/anything'])

print('\nCached URLS after delete:')
print('\n'.join(session.cache.urls()))

Output:

Cached URLS:
https://httpbin.org/anything
https://httpbin.org/base64/SFRUUEJJTiBpcyBhd2Vzb21l
https://httpbin.org/get

Cached URLS after delete:
https://httpbin.org/base64/SFRUUEJJTiBpcyBhd2Vzb21l
https://httpbin.org/get

Advanced Features

You can further customize the caching behavior using cache filters and custom cache keys. This allows you to cache only specific requests or create more fine-grained caching rules.

Step 11: Cache Filters

You can use cache filters to control which requests should be cached. For example, you can cache only responses with specific success status codes (here, 200 and 201):

from requests_cache import CachedSession

session = CachedSession('my_cache', allowable_codes=(200, 201))

# Clear cache just to be safe
session.cache.clear()

response1 = session.get('https://httpbin.org/status/200') # will be cached
response2 = session.get('https://httpbin.org/status/500') # will not be cached
response3 = session.get('https://httpbin.org/status/201') # will be cached
response4 = session.get('https://httpbin.org/status/404') # will not be cached

print('Cached URLS:')
print('\n'.join(session.cache.urls()))  # will print cached URLs

Output:

Cached URLS:
https://httpbin.org/status/200
https://httpbin.org/status/201
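
The rule behind this setting is simple enough to sketch in a few lines. The helper should_cache below is hypothetical and only mimics the decision for the output above, where the 200 and 201 responses were cached. It is not the library's code.

```python
def should_cache(status_code, allowable_codes=(200, 201)):
    """Return True if a response with this status code may be stored (hypothetical helper)."""
    return status_code in allowable_codes


for code in (200, 500, 201, 404):
    print(code, should_cache(code))
# 200 True
# 500 False
# 201 True
# 404 False
```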

Step 12: Custom Cache Keys

By default, requests-cache generates cache keys based on the request URL and parameters. You can create custom cache keys by passing your own function as key_fn. For example, you can cache requests based on specific query parameters:

from requests_cache import CachedSession
from requests import PreparedRequest
from urllib import parse


def custom_key_fn(request: PreparedRequest, **kwargs) -> str:
    # generate a custom cache_key using PreparedRequest. Here we will use param value
    param_value = parse.parse_qs(parse.urlparse(request.url).query)['id'][0]
    return param_value


session = CachedSession('my_cache', key_fn=custom_key_fn)

session.cache.clear()

response1 = session.get('https://httpbin.org/anything', params={"id": 55493})
response2 = session.get('https://httpbin.org/anything', params={"id": 237429})

print('Cached URLS:')
print('\n'.join(session.cache.urls()))  # will print cached URLs

print('\nCache keys:')
for response in session.cache.filter():
    print(response.cache_key)

Output:

Cached URLS:
https://httpbin.org/anything?id=237429
https://httpbin.org/anything?id=55493

Cache keys:
55493
237429
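
One caveat: the function above raises a KeyError when a request has no id parameter. A more defensive variant might fall back to the full URL. The sketch below tests that logic on its own. SimpleNamespace is used as a stand-in for a PreparedRequest so it runs without any network access, and safe_key_fn is a hypothetical name.

```python
from types import SimpleNamespace
from urllib import parse


def safe_key_fn(request, **kwargs) -> str:
    """Use the 'id' query parameter as the cache key, else the full URL."""
    query = parse.parse_qs(parse.urlparse(request.url).query)
    ids = query.get("id")
    return ids[0] if ids else request.url


# Stand-ins for PreparedRequest objects: only the .url attribute is needed here
with_id = SimpleNamespace(url="https://httpbin.org/anything?id=55493")
without_id = SimpleNamespace(url="https://httpbin.org/anything")

print(safe_key_fn(with_id))     # 55493
print(safe_key_fn(without_id))  # https://httpbin.org/anything
```

Keep in mind that keying on id alone means two different endpoints called with the same id would share a cache entry, so this kind of key only makes sense when that is what you want.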

Step 13: Monitoring Cache Performance

To monitor your cache, you can inspect each stored response using cache.responses.values():

from requests_cache import CachedSession

session = CachedSession('my_cache')

session.cache.clear()

response1 = session.get('https://httpbin.org/anything', params={"id": 55493})
response2 = session.get('https://httpbin.org/anything', params={"id": 237429})

print('Cached URLS:')
print('\n'.join(session.cache.urls()))  # will print cached URLs

print('\nCache details:')
for response in session.cache.responses.values():
    print('Request: {}'.format(response.url))
    print('Is response from cache: {}'.format(response.from_cache))
    print('When was cache created: {}'.format(response.created_at))
    print('When will cache expire: {}'.format(response.expires))
    print('Is cache expired: {}'.format(response.is_expired))
    print('Cache size: {} bytes'.format(response.size))
    print('------------------')

Output:

Cached URLS:
https://httpbin.org/anything?id=237429
https://httpbin.org/anything?id=55493

Cache details:
Request: https://httpbin.org/anything?id=237429
Is response from cache: True
When was cache created: 2023-03-24 22:56:45.027010
When will cache expire: None
Is cache expired: False
Cache size: 427 bytes
------------------
Request: https://httpbin.org/anything?id=55493
Is response from cache: True
When was cache created: 2023-03-24 22:56:44.965918
When will cache expire: None
Is cache expired: False
Cache size: 425 bytes
------------------
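
The listing above shows what is stored, but not how often the cache actually saved a request. One simple way to track that, sketched here under the assumption that you record the from_cache flag of each response yourself, is to compute a hit rate. hit_rate is a hypothetical helper, and the flags below are made-up sample values rather than measurements.

```python
def hit_rate(from_cache_flags):
    """Fraction of responses served from the cache (0.0 if there were none)."""
    flags = list(from_cache_flags)
    if not flags:
        return 0.0
    return sum(flags) / len(flags)


# Sample flags, e.g. collected as getattr(response, 'from_cache', False)
sample_flags = [False, True, True, True]
print(hit_rate(sample_flags))  # 0.75
```

A hit rate that stays close to zero would suggest the cache lifetime is too short or the requests differ too much to be reused.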

Conclusion

This guide has given you a thorough grounding in caching external API requests in Python using the requests and requests-cache libraries. You now know how to set up, customize, debug and manage caching effectively. You can also implement advanced cache customization and monitor cache performance. Let me know your feedback, or any issues you run into, in the comments.
