Mastering API Caching in Python: A Step-by-Step Guide
Effortlessly improve your app’s performance: a 13-step tutorial on caching external APIs in Python using the ‘requests’ and ‘requests_cache’ libraries.
My plan for today is to walk you through the benefits of API caching and how to implement it using Python’s requests and requests-cache libraries. This article is written for all levels, with clear instructions and enough depth to give you a solid foundation in API caching and its best practices.
Introduction
API (Application Programming Interface) requests are essential for communicating with external services and fetching data. However, sending too many requests can slow down your application and may exceed API rate limits. This is avoidable when the same request is sent repeatedly and returns the same static response. This is where caching comes in: a technique for storing and reusing API responses, which reduces repetitive requests and improves performance. This beginner-friendly guide introduces caching external API requests in Python using the ‘requests’ library together with ‘requests_cache’.
Understanding Caching
Caching is storing data temporarily so it can be quickly accessed and reused without making additional requests to the API. This helps reduce the number of requests, saves time and improves performance.
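To make the idea concrete, here is a minimal sketch of caching with a plain dictionary. It is only an illustration of the concept, not how requests-cache works internally, and fetch_from_api is a hypothetical stand-in for a real (slow) API call.

```python
# Toy illustration of caching: store each response the first time, reuse it afterwards.
api_calls = {"count": 0}  # tracks how many times the "API" is really called
_cache = {}               # maps URL -> stored response

def fetch_from_api(url):
    """Hypothetical stand-in for a slow external API call."""
    api_calls["count"] += 1
    return {"url": url, "data": "some payload"}

def cached_fetch(url):
    """Return the stored response if we have one; otherwise fetch and store it."""
    if url not in _cache:
        _cache[url] = fetch_from_api(url)
    return _cache[url]

first = cached_fetch("https://dummyjson.com/products")
second = cached_fetch("https://dummyjson.com/products")
print(api_calls["count"])  # 1: the second call was served from the cache
```

The second call never reaches fetch_from_api, which is exactly the saving that a real cache gives you on repeated requests.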
Step 1: Let’s set it up
Before getting started, you need to install a few tools and libraries. You should already have Python 3 and pip, the Python package manager. Using a virtual environment (for example, virtualenv) is advisable. Then, install the requests and requests-cache libraries using the following command:
pip install requests requests-cache
Step 2: Let’s Fetch the data
First, let’s see how to fetch data without caching, using the requests library to send an API request:
import requests
url = "https://dummyjson.com/products"
s = requests.Session()
r = s.get(url)
print(r.json())
Step 3: Cache it!
Now, let’s add caching to the same example. You’ll need to import the requests_cache library and set up a cache with a specific duration. Here's how to do it:
import requests
import requests_cache
# Set up a cache that lasts for 5 minutes
requests_cache.install_cache("my_cache", expire_after=300)
# Clear cache just to be safe
requests_cache.clear()
response = requests.get('https://httpbin.org/json')
from_cache = getattr(response, 'from_cache', False)
print(f'Is request cached: {from_cache}') # False
response = requests.get('https://httpbin.org/json')
from_cache = getattr(response, 'from_cache', False)
print(f'Is request cached: {from_cache}') # True
When you run this code, the API response will be cached for 5 minutes. Since we make another request within this timeframe, the cached data will be used instead of sending a new request.
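If you are wondering how a cache decides whether an entry is still usable, the 5-minute rule can be pictured as a simple timestamp comparison. The sketch below is my own simplified illustration; is_fresh is a hypothetical helper, not a function from requests-cache.

```python
from datetime import datetime, timedelta

def is_fresh(created_at, expire_after, now):
    """A cached entry is reusable while its age is below the expiry window."""
    return now - created_at < expire_after

created = datetime(2023, 3, 24, 12, 0, 0)  # when the response was stored
five_minutes = timedelta(seconds=300)      # same window as expire_after=300

# 2 minutes later the entry is still fresh; 6 minutes later it has expired.
print(is_fresh(created, five_minutes, created + timedelta(minutes=2)))  # True
print(is_fresh(created, five_minutes, created + timedelta(minutes=6)))  # False
```

Once an entry is no longer fresh, the next request goes out to the API again and the stored copy is replaced.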
requests_cache also provides CachedSession, which wraps the features of requests.Session and can be used without importing the requests module explicitly. We will use that in most of the following examples:
from requests_cache import CachedSession
session = CachedSession('my_cache')
# Clear cache just to be safe
session.cache.clear()
print('Cached URLS:')
print('\n'.join(session.cache.urls()))
response = session.get('https://httpbin.org/get')
print('Cached URLS:')
print('\n'.join(session.cache.urls()))
Step 4: Customize the Cache
You can customize your cache settings by modifying the expire_after parameter. For example, you can set it to a longer duration, like 24 hours:
from requests_cache import CachedSession
session = CachedSession('my_cache', expire_after=86400)
Step 5: Clear or bypass the cache
Sometimes, you may need to clear the cache or bypass it for a particular request.
To clear the entire cache, call session.cache.clear(), as I did in the examples above:
from requests_cache import CachedSession
session = CachedSession('my_cache')
# Clear cache just to be safe
session.cache.clear()
To bypass the cache for a specific request, you can use the cache_disabled() context manager:
from requests_cache import CachedSession
session = CachedSession('my_cache', expire_after=86400)
# Clear cache just to be safe
session.cache.clear()
print('Cached URLS:')
print('\n'.join(session.cache.urls()))
with session.cache_disabled():
    response = session.get('https://httpbin.org/get')
print('Cached URLS:')
print('\n'.join(session.cache.urls())) # still empty: the request bypassed the cache
Step 6: Handle the errors
When working with caching, you should still handle errors, because a cache miss or an expired entry triggers a real network request that can fail. You can use a try-except block for this. The requests_cache.exceptions.RequestsCacheError class used in an earlier version of this example may not be available in your version of requests_cache, so the code below catches the native requests exceptions instead.
import requests
import requests_cache
requests_cache.install_cache("my_cache", expire_after=300)
url = "https://dummyjson.com/products"
try:
    response = requests.get(url, timeout=10)
    # Raise an exception for 4xx/5xx responses
    response.raise_for_status()
    print(response.json())
except requests.exceptions.RequestException as exc:
    print(f"Error: request failed: {exc}")
Step 7: Let’s change Cache Storage Backends
By default, requests-cache uses SQLite for storing cache data. However, you can switch to other storage backends like Redis or in-memory storage. Here's an example of using Redis:
Install the redis library:
pip install redis
Steps to install Redis on your system can be found in the official Redis documentation. Once installed, let’s set Redis as the cache storage in our code:
from requests_cache import CachedSession, RedisCache
backend = RedisCache(host='192.168.1.63', port=6379) #redis IP and port
session = CachedSession('my_cache', backend=backend)
# Clear cache just to be safe
session.cache.clear()
print('Cached URLS:')
print('\n'.join(session.cache.urls())) # will not print anything as we cleared cache
response = session.get('https://httpbin.org/get')
print('Cached URLS:')
print('\n'.join(session.cache.urls())) # will print cached URLs
Step 8: Let’s debug
When working with caching, you may want to know whether responses were cached and, if so, which URLs are in the cache. You can use cache.urls() for this:
from requests_cache import CachedSession
session = CachedSession('my_cache')
# Clear cache just to be safe
session.cache.clear()
print('Cached URLS:')
print('\n'.join(session.cache.urls())) # will not print anything as we cleared cache
response1 = session.get('https://httpbin.org/get')
response2 = session.get('https://httpbin.org/anything')
response3 = session.get('https://httpbin.org/base64/SFRUUEJJTiBpcyBhd2Vzb21l')
print('Cached URLS:')
print('\n'.join(session.cache.urls())) # will print cached URLs
Step 9: Let’s cache API Requests with Varying Parameters
In some cases, you might want to cache API requests with different parameters. requests-cache handles this automatically. For example, if you have an API that accepts a query parameter, such as https://api.example.com/data?id=1, the cache will store a separate response for each distinct combination of parameters:
from requests_cache import CachedSession
session = CachedSession('my_cache')
# Clear cache just to be safe
session.cache.clear()
print('Cached URLS:')
print('\n'.join(session.cache.urls())) # will not print anything as we cleared cache
response1 = session.get('https://httpbin.org/anything', params={"id": 15, "code": "ABC"})
response2 = session.get('https://httpbin.org/anything', params={"id": 191, "code": "$RF"})
response3 = session.get('https://httpbin.org/anything', params={"id": 0, "code": "SDH"})
print('Cached URLS:')
print('\n'.join(session.cache.urls())) # will print cached URLs
print('\nCached data:')
for response in session.cache.filter():
    print(response)
Cached URLS:
https://httpbin.org/anything?code=$RF&id=191
https://httpbin.org/anything?code=ABC&id=15
https://httpbin.org/anything?code=SDH&id=0
Cached data:
<CachedResponse [200]: created: 2023-03-24 16:16:56 CDT, expires: N/A (fresh), size: 448 bytes, request: GET https://httpbin.org/anything?code=ABC&id=15>
<CachedResponse [200]: created: 2023-03-24 16:16:57 CDT, expires: N/A (fresh), size: 452 bytes, request: GET https://httpbin.org/anything?code=$RF&id=191>
<CachedResponse [200]: created: 2023-03-24 16:16:57 CDT, expires: N/A (fresh), size: 446 bytes, request: GET https://httpbin.org/anything?code=SDH&id=0>
Step 10: Cache Invalidation
Sometimes, you might want to invalidate (delete) specific items from the cache. You can achieve this using the cache.delete() method:
from requests_cache import CachedSession
session = CachedSession('my_cache')
# Clear cache just to be safe
session.cache.clear()
response1 = session.get('https://httpbin.org/get')
response2 = session.get('https://httpbin.org/anything')
response3 = session.get('https://httpbin.org/base64/SFRUUEJJTiBpcyBhd2Vzb21l')
print('Cached URLS:')
print('\n'.join(session.cache.urls())) # will print cached URLs
session.cache.delete(urls=['https://httpbin.org/anything'])
print('\nCached URLS after delete:')
print('\n'.join(session.cache.urls()))
Cached URLS:
https://httpbin.org/anything
https://httpbin.org/base64/SFRUUEJJTiBpcyBhd2Vzb21l
https://httpbin.org/get
Cached URLS after delete:
https://httpbin.org/base64/SFRUUEJJTiBpcyBhd2Vzb21l
https://httpbin.org/get
Advanced Features
You can further customize the caching behavior using cache filters and custom cache keys. This allows you to cache only specific requests or create more fine-grained caching rules.
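Both features depend on the idea of a cache key, the identifier under which a response is stored. The sketch below is my own simplified illustration of how such a key could be derived from the request method and URL. toy_cache_key is a hypothetical helper, and this is not the actual algorithm requests-cache uses.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def toy_cache_key(method, url):
    """Build a key from the method and a URL whose query parameters are sorted."""
    parts = urlsplit(url)
    # Sorting makes ?id=15&code=ABC and ?code=ABC&id=15 equivalent
    sorted_query = urlencode(sorted(parse_qsl(parts.query)))
    normalized = urlunsplit((parts.scheme, parts.netloc, parts.path, sorted_query, ""))
    return hashlib.sha256(f"{method.upper()} {normalized}".encode()).hexdigest()

key_a = toy_cache_key("GET", "https://httpbin.org/anything?id=15&code=ABC")
key_b = toy_cache_key("get", "https://httpbin.org/anything?code=ABC&id=15")
key_c = toy_cache_key("GET", "https://httpbin.org/anything?id=191&code=ABC")

print(key_a == key_b)  # True: same request, parameters in a different order
print(key_a == key_c)  # False: a different id means a different cache entry
```

Requests that produce the same key share one cache entry, so changing how the key is built changes which requests count as "the same".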
Step 11: Cache Filters
You can use cache filters to control which requests should be cached. For example, you can cache only responses with specific HTTP status codes, here 200 and 201:
from requests_cache import CachedSession
session = CachedSession('my_cache', allowable_codes=(200, 201))
# Clear cache just to be safe
session.cache.clear()
response1 = session.get('https://httpbin.org/status/200') # will be cached
response2 = session.get('https://httpbin.org/status/500') # will not be cached
response3 = session.get('https://httpbin.org/status/201') # will be cached
response4 = session.get('https://httpbin.org/status/404') # will not be cached
print('Cached URLS:')
print('\n'.join(session.cache.urls())) # will print cached URLs
Cached URLS:
https://httpbin.org/status/200
https://httpbin.org/status/201
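Conceptually, a status-code filter is just a membership check that runs before a response is stored. Here is a minimal sketch of that decision, assuming the allowed codes are 200 and 201 to match the output above. should_cache is a hypothetical helper of mine, not part of requests-cache.

```python
def should_cache(status_code, allowable_codes=(200, 201)):
    """Store a response only if its status code is in the allowed set."""
    return status_code in allowable_codes

# The same four status codes requested in the example above
statuses = [200, 500, 201, 404]
cached = [code for code in statuses if should_cache(code)]
print(cached)  # [200, 201]
```

The 500 and 404 responses are still returned to the caller; they are simply not written to the cache.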
Step 12: Custom Cache Keys
By default, requests-cache generates cache keys based on the request URL and parameters. You can create custom cache keys using a custom function. For example, you can cache requests based on specific query parameters:
from requests_cache import CachedSession
from requests import PreparedRequest
from urllib import parse
def custom_key_fn(request: PreparedRequest, **kwargs) -> str:
    # Generate a custom cache key from the PreparedRequest; here we use the value of the 'id' query parameter
    param_value = parse.parse_qs(parse.urlparse(request.url).query)['id'][0]
    return param_value
session = CachedSession('my_cache', key_fn=custom_key_fn)
session.cache.clear()
response1 = session.get('https://httpbin.org/anything', params={"id": 55493})
response2 = session.get('https://httpbin.org/anything', params={"id": 237429})
print('Cached URLS:')
print('\n'.join(session.cache.urls())) # will print cached URLs
print('\nCache keys:')
for response in session.cache.filter():
    print(response.cache_key)
Cached URLS:
https://httpbin.org/anything?id=237429
https://httpbin.org/anything?id=55493
Cache keys:
55493
237429
Step 13: Monitoring Cache Performance
To monitor your cache, you can inspect the metadata of each cached response using cache.responses.values():
from requests_cache import CachedSession
session = CachedSession('my_cache')
session.cache.clear()
response1 = session.get('https://httpbin.org/anything', params={"id": 55493})
response2 = session.get('https://httpbin.org/anything', params={"id": 237429})
print('Cached URLS:')
print('\n'.join(session.cache.urls())) # will print cached URLs
print('\nCache details:')
for response in session.cache.responses.values():
    print('Request: {}'.format(response.url))
    print('Is response from cache: {}'.format(response.from_cache))
    print('When was cache created: {}'.format(response.created_at))
    print('When will cache expire: {}'.format(response.expires))
    print('Is cache expired: {}'.format(response.is_expired))
    print('Cache size: {} bytes'.format(response.size))
    print('------------------')
Cached URLS:
https://httpbin.org/anything?id=237429
https://httpbin.org/anything?id=55493
Cache details:
Request: https://httpbin.org/anything?id=237429
Is response from cache: True
When was cache created: 2023-03-24 22:56:45.027010
When will cache expire: None
Is cache expired: False
Cache size: 427 bytes
------------------
Request: https://httpbin.org/anything?id=55493
Is response from cache: True
When was cache created: 2023-03-24 22:56:44.965918
When will cache expire: None
Is cache expired: False
Cache size: 425 bytes
------------------
Conclusion
This guide has given you a thorough understanding of caching external API requests in Python using the requests and requests-cache libraries. You now know how to set up, customize, debug and manage caching effectively, and how to apply advanced cache customization and inspect cached responses. Let me know your feedback, or any issues you faced, in the comments.





