3 Caching Problems Every Developer Should Know
Cache penetration, cache breakdown and cache avalanche
While caching is a well-known performance booster for software systems, it is also prone to subtle failure modes if not handled with care.
In this article, I will walk you through 3 common caching problems that can be disastrous at times, so you don't have to learn about them the hard way.
Let’s get started!
Cache Penetration
A cache penetration happens when a searched key resides in neither the cache nor the database.
Let’s take a look at how it works.
- It happens when a key resides in neither the cache nor the database.
- When users query for the key, the application hits the database due to a cache miss.
- Since the database does not contain the key and returns an empty result, the key is never cached.
- Hence, every query for that key results in a cache miss and hits the database, as the sketch below shows.
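Here is a minimal sketch of this vulnerable read path, assuming hypothetical `cache` and `db` clients (the `get`/`set` and `query_user` methods are illustrative names, not any particular library's API):

```python
# A minimal cache-aside read path. If the key exists in neither store,
# nothing is ever written back to the cache, so every subsequent lookup
# for that key falls through to the database again.
def get_user(cache, db, key):
    value = cache.get(key)              # 1. check the cache first
    if value is not None:
        return value                    # cache hit

    value = db.query_user(key)          # 2. cache miss: hit the database
    if value is not None:
        cache.set(key, value, ttl=300)  # only real results get cached
    return value                        # None is returned but never cached
```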
While this might seem trivial at first, an attacker can attempt to bring down your database by flooding it with queries for such non-existent keys.
To mitigate such issues, we can
- Cache the empty result with a short expiration time (negative caching).
- Employ a Bloom filter. Before querying the database, the application looks up the key in the Bloom filter and returns immediately if the filter reports that the key definitely does not exist. Both ideas are sketched below.
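Here is one way the two mitigations could be combined, again with the hypothetical `cache` and `db` clients from above. The tiny Bloom filter is illustrative only; a production system would use a proper library and populate the filter with all valid keys up front:

```python
import hashlib

SENTINEL = "__EMPTY__"  # distinguishes "known to be missing" from "not cached"

class BloomFilter:
    """A tiny illustrative Bloom filter; real systems would use a library."""

    def __init__(self, size_bits=1 << 20, num_hashes=3):
        self.size, self.k, self.bits = size_bits, num_hashes, 0

    def _positions(self, key):
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for p in self._positions(key):
            self.bits |= 1 << p

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits >> p & 1 for p in self._positions(key))

def get_user(cache, db, bloom, key):
    if not bloom.might_contain(key):
        return None                       # definitely not in the DB: stop here

    value = cache.get(key)
    if value == SENTINEL:
        return None                       # a recent query already found nothing
    if value is not None:
        return value

    value = db.query_user(key)
    if value is None:
        cache.set(key, SENTINEL, ttl=60)  # cache the empty result briefly
    else:
        cache.set(key, value, ttl=300)
    return value
```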
Cache Breakdown (Thundering Herd Problem)
A cache breakdown happens when a cache key expires, and multiple requests access the database concurrently looking for the same key.
Let’s take a look at how it works.
- A hot cache key expires.
- Multiple concurrent requests come in searching for the same key.
- The servers launch multiple concurrent requests to the database to look for the same key.
A cache breakdown increases the load on the database dramatically, especially when lots of hot keys expire at the same time.
Here are some mitigation plans
- Acquire a lock on the searched key, so that only one thread refreshes the cache while the others wait (sketched after this list).
- Utilise a refresh-ahead strategy to asynchronously refresh hot data so that the hot keys never expire.
- Use a read-through strategy to move the data-fetching logic into the cache and ensure that the cache fires only one request to the database for each query.
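Here is a minimal single-process sketch of the per-key lock, reusing the hypothetical `cache` and `db` clients:

```python
import threading

_locks = {}                      # one lock per hot key
_locks_guard = threading.Lock()  # protects the _locks dict itself

def _lock_for(key):
    with _locks_guard:
        return _locks.setdefault(key, threading.Lock())

def get_user(cache, db, key):
    value = cache.get(key)
    if value is not None:
        return value

    with _lock_for(key):
        # Re-check: another thread may have refilled the cache while we waited.
        value = cache.get(key)
        if value is not None:
            return value
        value = db.query_user(key)  # only one DB query per expired hot key
        if value is not None:
            cache.set(key, value, ttl=300)
        return value
```

Across multiple servers, the in-memory lock would be replaced with a distributed one, for example a Redis `SET key value NX EX seconds` lock.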
Cache Avalanche
A cache avalanche happens when a large portion of the cache becomes unavailable at once, causing a sudden spike of requests to the database.
This happens when
- Lots of cached data expire at the same time.
- A cache service goes down, and all requests access the database directly.
A sudden spike of traffic to the database can have cascading effects and might eventually bring down your service.
Here are some mitigation plans
- Stagger the expiration times of the cached keys, e.g. by adding random jitter to each TTL, so that they won't all expire at the same time (sketched after this list).
- Use a refresh-ahead strategy to asynchronously refresh hot data so that it never expires.
- Use cache clusters to avoid a single point of failure. When a master crashes, one of the replicas is promoted to be the new master.
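Staggering the expirations is a one-line change wherever the application writes to the cache; the TTL values below are illustrative:

```python
import random

BASE_TTL = 300  # seconds; illustrative baseline
JITTER = 60     # spread expirations over an extra minute

def set_with_jitter(cache, key, value):
    # Randomising each TTL stops keys written in the same burst (e.g. after
    # a deploy or a cache warm-up) from all expiring at the same instant.
    cache.set(key, value, ttl=BASE_TTL + random.randint(0, JITTER))
```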
Conclusion
While these problems might seem trivial at first, they can at times cause cascading failures in our downstream clients and dependencies.
Knowing them beforehand allows us to design a more robust system and also eases our troubleshooting journey.
I hope you find this helpful, and I will see you at the next one!