Konstantin Borimechkov


This article discusses the concept of rate limiting, its use cases, and how to implement it in a Java API using Bucket4J.


The article begins by explaining the author's encounter with a rate limit error when using OpenAI's API, which led to an exploration of rate limiting. The author explains that rate limiting is used for cost control, conserving server resources, smoothing out traffic spikes, managing API gateway usage, preventing excessive API usage, and stopping users from exceeding their allowed limits. The article also highlights the importance of rate limiting in preventing bot attacks, such as DoS/DDoS attacks, brute force attacks, and web scraping. The author then discusses Cloudflare's perspective on rate limiting and its use in the real world, citing examples from OpenAI and Polygon.io. Finally, the article provides a code example of how to implement a rate limiter in a Java API using Bucket4J.


  • Rate limiting is important for cost control and preventing bot attacks.
  • Rate limiting can be used to conserve server resources and manage API gateway usage.
  • Cloudflare uses IP addresses to track unique machines and how much time elapses between each of the requests made by them.
  • Rate limiting can temporarily block IP addresses if there are too many requests within a given timeframe.
  • OpenAI and Polygon.io use rate limiting to price their API usage.
  • Bucket4J can be used to implement a rate limiter in a Java API.
  • The author believes that the most important use case of rate limiters is to prevent bot attacks.

Java API Optimization: Unleashing the Power of Rate Limiting with Bucket4J

I recently wrote a blog post about connecting to ChatGPT with Python via OpenAI’s API. The first time I ran the code, an error occurred 👇

openai.error.RateLimitError: You exceeded your current quota, please check your plan and billing details.

Of course, the error is pretty self-explanatory: I don’t have credits/💵 in my OpenAI account. But besides that, it gave me the idea of diving deeper into the concept of rate limiting and its implications!

🙇‍♂ What’s the use case & purpose of rate limiting?

The first use case, though not the most important one IMO, is cost control, just like OpenAI did in the scenario I faced above. With rate limits in place, you can 👇

  • Conserve server resources by restricting the number of requests processed within a specific time frame
  • Smooth out traffic spikes
  • Manage the usage of the API gateway and potentially choose a pricing tier that aligns with your needs
  • Prevent excessive API usage by the app, which would otherwise drive up costs
  • Stop users or apps from going over their allowed limit, so you don’t get extra charges.

Rate-limiting basically motivates developers to pay for increased API usage beyond specified limits.

The most important use case of rate limiters, in my opinion, is protecting your app from so-called bot attacks. These attacks can have a severe impact on your app, making it vulnerable to:

  • DoS/DDoS attacks
  • Brute force attacks
  • Web scraping

Of course, bots aren’t the only source of these threats; regular hackers can exploit the same weaknesses as well! 🕵️‍♂️

There are a lot more use cases for rate limiters, but I won’t go into much detail in this blog post. I’m trying to keep the topic as simple as possible for now. If you are interested in diving deeper into the topic, check this blog post ⏬

Cloudflare on Rate-Limiting? 🤔

Rate limiting is a strategy for limiting network traffic. It puts a cap on how often someone can repeat an action within a certain timeframe — for instance, trying to log in to an account. - CloudFlare

Some main takeaways I got from Cloudflare’s blog are:

  • Rate-limiting uses the IP address to track unique machines and how much time elapses between each of the requests made by them.
  • What rate limiting does is tell unique users of the app to slow down when they make a ton of requests within a given time period.
  • Rate limiting temporarily blocks IP addresses if there are too many requests within the given timeframe.
  • Users may be temporarily locked out after multiple unsuccessful login attempts to prevent brute force attacks.
  • It protects against malicious bot attacks that could disrupt API services.
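To make the takeaways above a bit more concrete, here is a minimal sketch of a per-IP limiter using a fixed time window. This is only an illustration of the idea, not how Cloudflare actually implements it; the class name `IpRateLimiter` and its parameters are made up for this example, and the current time is passed in explicitly so the behavior is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;

/** Fixed-window request counter keyed by client IP (hypothetical helper for illustration). */
class IpRateLimiter {

    /** Per-IP state: when the current window started and how many requests it has seen. */
    private static final class Window {
        long startMillis;
        int count;
    }

    private final int maxRequests;
    private final long windowMillis;
    private final Map<String, Window> windows = new HashMap<>();

    IpRateLimiter(int maxRequests, long windowMillis) {
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
    }

    /** Returns true if a request from this IP is allowed at time nowMillis. */
    synchronized boolean allow(String ip, long nowMillis) {
        Window w = windows.computeIfAbsent(ip, key -> {
            Window fresh = new Window();
            fresh.startMillis = nowMillis;
            return fresh;
        });
        // Start a new window once the current one has expired
        if (nowMillis - w.startMillis >= windowMillis) {
            w.startMillis = nowMillis;
            w.count = 0;
        }
        if (w.count < maxRequests) {
            w.count++;
            return true;
        }
        // Over the limit: the caller would typically respond with HTTP 429
        return false;
    }
}
```

Each IP gets its own counter, so one noisy client being blocked does not affect the others. A fixed window is the simplest variant; it can let short bursts through around window boundaries, which is one reason libraries tend to use token buckets instead.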

Rate limiting in the real world 🌍

As we’ve already talked about it, let’s look at how OpenAI uses rate limiting to price their API usage. 💰

As you can see, in order to use ChatGPT’s 3.5-Turbo version, you get a set of limits, and OpenAI’s pricing is based on them. Different models and different amounts of tokens cost different amounts of money per token.

On the other hand, polygon.io offers a cleaner way of pricing their API usage 👇

As you can see, they have a free tier plan, where you are allowed to call their endpoints only 5 times per minute. If you want more frequent calls, you would have to pay some cash. 🤑

I’ve actually worked on a project where I wrote a Python script that uses their free tier to extract data about all 10,000+ available stocks on the market! 😅 The script ran in periods of 5 seconds and we ran it for more than 12 hours to get all the data we needed 😆. That’s for another blog though, funny times...
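For a feel of what a 5-calls-per-minute limit means on the client side, here is a small sketch of the arithmetic. It assumes evenly spaced calls and one call per ticker, which is a simplification for illustration and not necessarily how the script above worked; `RequestPacing` and its method names are hypothetical.

```java
import java.time.Duration;

/** Back-of-the-envelope pacing math for a rate-limited API (hypothetical helper). */
class RequestPacing {

    /** Gap between evenly spaced calls that keeps the average rate at `limit` calls per `period`. */
    static Duration minInterval(int limit, Duration period) {
        return period.dividedBy(limit);
    }

    /** Minimum time to issue `totalCalls` evenly spaced calls, with the first call sent immediately. */
    static Duration minTotalTime(long totalCalls, int limit, Duration period) {
        if (totalCalls <= 0) {
            return Duration.ZERO;
        }
        return minInterval(limit, period).multipliedBy(totalCalls - 1);
    }
}
```

With the free tier numbers, `minInterval(5, Duration.ofMinutes(1))` is 12 seconds, and `minTotalTime(10_000, 5, Duration.ofMinutes(1))` is 119,988 seconds, or roughly 33 hours and 20 minutes. That is why pulling a large dataset through a free tier quickly turns into an overnight job.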

Let’s implement a rate limiter in a Java API! 🔥

In the end, every programmer loves to look at some code, so let’s create a simple rate limiter utilizing Bucket4J, Spring Boot and Java.

📝 Note that this is semi-pseudocode and I won’t go into details about separating configs from controllers, etc.

  1. Of course, before writing the code we need to add the bucket4j-core:8.5.0 dependency to the project
  2. We will mimic polygon.io’s free tier rate limit 👨‍💻
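import io.github.bucket4j.Bandwidth;
import io.github.bucket4j.Bucket;
import io.github.bucket4j.Refill;

import java.time.Duration;

// StockService and RateLimitExceededException are the application's own classes (not shown here)
public class StocksController {

    private final Bucket rateLimiter;
    private final StockService stockService;

    public StocksController() {
        // Configuring the rate limiter to allow 5 requests per minute
        int limit = 5;
        Duration refillPeriod = Duration.ofSeconds(60);

        // A bucket holding up to 5 tokens, refilled with 5 tokens once per refill period
        Bandwidth limitBandwidth = Bandwidth.classic(limit, Refill.intervally(limit, refillPeriod));
        // Bucket.builder() is the builder entry point in Bucket4J 8.x
        this.rateLimiter = Bucket.builder().addLimit(limitBandwidth).build();
        this.stockService = new StockService();
    }

    public String getStocks() {
        // Each request consumes one token; with no tokens left, the request is rejected
        if (rateLimiter.tryConsume(1)) {
            return stockService.getStocks();
        } else {
            // Rate limit exceeded: signal it so the API can respond with a 429 status
            throw new RateLimitExceededException();
        }
    }
}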
In the example above, I used code comments (//) to outline the key points in configuring the rate limiter. The basic goal of the endpoint is to retrieve stock data with a limit of 5 requests per minute!
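If you are curious what a bucket like this does conceptually, here is a minimal, library-free sketch of a token bucket with interval refill. It is only an illustration of the idea and is not Bucket4J’s actual implementation; the class `IntervalTokenBucket` is hypothetical, and the current time is passed in explicitly so the behavior is easy to trace.

```java
/** Minimal token bucket with interval refill (illustrative only, not Bucket4J's implementation). */
class IntervalTokenBucket {

    private final long capacity;
    private final long refillTokens;
    private final long periodMillis;
    private long tokens;
    private long lastRefillMillis;

    IntervalTokenBucket(long capacity, long refillTokens, long periodMillis, long nowMillis) {
        this.capacity = capacity;
        this.refillTokens = refillTokens;
        this.periodMillis = periodMillis;
        this.tokens = capacity;          // the bucket starts full
        this.lastRefillMillis = nowMillis;
    }

    /** Tries to take n tokens at time nowMillis; returns false if not enough are available. */
    synchronized boolean tryConsume(long n, long nowMillis) {
        refill(nowMillis);
        if (tokens >= n) {
            tokens -= n;
            return true;
        }
        return false;
    }

    /** Adds refillTokens for every full period that has elapsed, never exceeding capacity. */
    private void refill(long nowMillis) {
        long periods = (nowMillis - lastRefillMillis) / periodMillis;
        if (periods > 0) {
            tokens = Math.min(capacity, tokens + periods * refillTokens);
            lastRefillMillis += periods * periodMillis;
        }
    }
}
```

With a capacity of 5, a refill of 5 tokens and a 60-second period, the first five calls succeed, the sixth is rejected, and new tokens only show up once a full minute has passed. That matches the 5 requests/minute goal of the endpoint above.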

