Rate Limit Concurrency Pattern with Unit Tests — (4) Token Bucket

In the previous articles, I mentioned four rate-limit algorithms, how to implement rate-limit patterns and provided real-life examples for fixed window algorithm and slide window algorithm. This post will provide an example of using the token bucket to manage a content delivery network and an API requests control example for implementing the Token Bucket Algorithm.

If you don’t know the rate limit pattern, please view

Rate Limit Concurrency Pattern with Unit Tests — (1) Overview Four Algorithms

Rate Limit Concurrency Pattern with Unit Tests — (2) Fixed Window

Rate Limit Concurrency Pattern with Unit Tests — (3) Slide Window

Rate Limit Concurrency Pattern with Unit Tests — (4) Token Bucket

Rate Limit Concurrency Pattern with Unit Tests — (5) Leaky Bucket

As the following, I will use the implement steps that I mentioned in Rate Limit Concurrency Pattern with Unit Tests — (2) Fixed Window. If you don’t know them, please view the article above.

First Instance

Imagine a CDN needs to limit the bandwidth used by individual users to prevent any single user from consuming excessive resources, which could lead to network congestion and degraded service for others.

This file named impl.go contains the implementation of the token bucket rate-limiting logic.

package cdntraffic

import (
    "fmt"
    "net/http"
    "sync"
    "time"
)

const (
    capacity   = 5
    refillRate = 1
)

type TokenBucket struct {
    Capacity   int
    Tokens     int
    LastRefill time.Time
    RefillRate int // Tokens added per minute
    Mutex      sync.Mutex
}

func NewTokenBucket(capacity int, refillRate int) *TokenBucket {
    return &TokenBucket{
       Capacity:   capacity,
       Tokens:     capacity, // Start with a full bucket
       LastRefill: time.Now(),
       RefillRate: refillRate,
    }
}

func (tb *TokenBucket) Refill() {
    tb.Mutex.Lock()
    defer tb.Mutex.Unlock()

    now := time.Now()
    duration := now.Sub(tb.LastRefill) // Sub returns the duration currentTime-parameterTime.
    refillTokens := int(duration.Minutes()) * tb.RefillRate

    tb.Tokens += refillTokens
    if tb.Tokens > tb.Capacity {
       tb.Tokens = tb.Capacity
    }
    tb.LastRefill = now
}

func (tb *TokenBucket) AllowRequest() bool {
    tb.Refill()

    if tb.Tokens > 0 {
       tb.Tokens--
       return true
    }
    return false
}

func BandwidthLimitMiddleware(tb *TokenBucket, next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
       if !tb.AllowRequest() {
          http.Error(w, "Bandwidth limit exceeded", http.StatusTooManyRequests)
          return
       }

       next.ServeHTTP(w, r)
    })
}

func ContentHandler(w http.ResponseWriter, r *http.Request) {
    fmt.Fprintf(w, "Content delivered successfully")
    // Do some real logic
}

In the above code, TokenBucket struct manages the tokens and refill logic. BandwidthLimitMiddleware enforces the bandwidth limit before processing content requests. And define the capacity and refillRate as consts, make them to update easily. now.Sub(tb.LastRefill) invoked the Sub, it is used to return the duration currentTime-parameterTime.

The following code is the content of the file named impl_test.go.

package cdntraffic

import (
    "net/http"
    "net/http/httptest"
    "testing"
    "time"
)

func TestBandwidthLimitMiddleware(t *testing.T) {
    tests := []struct {
       name             string
       numberOfRequests int
       expectedStatus   int
    }{
       {"ImmediateRequest", 1, http.StatusOK},
       {"ExceedLimit", 6, http.StatusTooManyRequests}, // 6 requests to exceed the limit of 5
       {"AfterRefill", 1, http.StatusOK},
    }

    for _, tt := range tests {
       t.Run(tt.name, func(t *testing.T) {
          bucket := NewTokenBucket(capacity, refillRate) // Reset bucket for each test

          handler := BandwidthLimitMiddleware(bucket, http.HandlerFunc(ContentHandler))

          var lastStatus int
          for i := 0; i < tt.numberOfRequests; i++ {
             req := httptest.NewRequest("GET", "/content", nil)
             rr := httptest.NewRecorder()
             handler.ServeHTTP(rr, req)
             lastStatus = rr.Code

             if tt.name == "AfterRefill" {
                time.Sleep(1 * time.Minute) // Wait for tokens to refill
             }
          }

          if lastStatus != tt.expectedStatus {
             t.Errorf("%s: handler returned wrong status code: got %v want %v", tt.name, lastStatus, tt.expectedStatus)
          }
       })
    }
}

In the above code, the test cases in impl_test.go simulate different scenarios to ensure the bandwidth limit works as expected under varying conditions. I will explain the ExceedLimit. The “ExceedLimit” test case simulates making enough requests to deplete the tokens in the bucket and then make an additional request that should be denied. The “ExceedLimit” test now makes 6 requests, which should exceed the bucket’s capacity of 5 tokens. A new TokenBucket instance is created for each test case to ensure isolation and reset the state. If it does not have this line, all tests are not isolated they won’t run correctly.

Let’s execute the command “go test . -v” to run tests. The test screenshot is below. As tests should wait 1 minute for the token bucket refill, test cases finish always over 1 minute.

Second Instance

Assume an API service needs to limit the number of requests a user can make per second to prevent overuse or abuse, ensuring fair access and stable performance for all users.

This file named impl.go contains the implementation of the token bucket rate-limiting logic.

package apictrl

import (
    "fmt"
    "net/http"
    "sync"
    "time"
)

const (
    capacity   = 5
    refillRate = 1
)

type TokenBucket struct {
    Capacity   int
    Tokens     int
    RefillRate int // Tokens added per second
    Mutex      sync.Mutex
    LastRefill time.Time
}

func NewTokenBucket(capacity, refillRate int) *TokenBucket {
    return &TokenBucket{
       Capacity:   capacity,
       Tokens:     capacity, // Start with a full bucket
       RefillRate: refillRate,
       LastRefill: time.Now(),
    }
}

func (tb *TokenBucket) Refill() {
    tb.Mutex.Lock()
    defer tb.Mutex.Unlock()

    now := time.Now()
    elapsed := now.Sub(tb.LastRefill).Seconds()
    refillTokens := int(elapsed) * tb.RefillRate

    tb.Tokens += refillTokens
    if tb.Tokens > tb.Capacity {
       tb.Tokens = tb.Capacity
    }
    tb.LastRefill = now
}

func (tb *TokenBucket) AllowRequest() bool {
    tb.Refill()

    if tb.Tokens > 0 {
       tb.Tokens--
       return true
    }
    return false
}

func RateLimitMiddleware(tb *TokenBucket, next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
       if !tb.AllowRequest() {
          http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
          return
       }

       next.ServeHTTP(w, r)
    })
}

func APIHandler(w http.ResponseWriter, r *http.Request) {
    fmt.Fprintf(w, "API request successful")

    // Do some real logic
}

In the above code, TokenBucket struct manages the tokens and refill logic. RateLimitMiddleware enforces the rate limit before processing API requests. The Refill method calculates the amount of time that has elapsed since the last refill. now.Sub(tb.LastRefill).Seconds() computes the number of seconds that have passed since the last time tokens were added to the bucket. refillTokens := int(elapsed) * tb.RefillRate determines the number of new tokens to add. The refill rate specifies how many tokens are added per second.

The following code is the content of the file named impl_test.go.

package apictrl

import (
    "net/http"
    "net/http/httptest"
    "testing"
    "time"
)

func TestRateLimitMiddleware(t *testing.T) {
    tests := []struct {
       name             string
       numberOfRequests int
       delay            time.Duration
       expectedStatus   int
    }{
       {"WithinLimit", 3, 0, http.StatusOK},
       {"ExceedLimit", 6, 0, http.StatusTooManyRequests},
       {"AfterRefill", 1, 10 * time.Second, http.StatusOK},
    }

    for _, tt := range tests {
       t.Run(tt.name, func(t *testing.T) {
          limiter := NewTokenBucket(capacity, refillRate) // Reset bucket for each test

          handler := RateLimitMiddleware(limiter, http.HandlerFunc(APIHandler))

          for i := 0; i < tt.numberOfRequests; i++ {
             req := httptest.NewRequest("GET", "/api", nil)
             rr := httptest.NewRecorder()

             handler.ServeHTTP(rr, req)

             if i == tt.numberOfRequests-1 {
                if status := rr.Code; status != tt.expectedStatus {
                   t.Errorf("%s: handler returned wrong status code: got %v want %v",
                      tt.name, status, tt.expectedStatus)
                }
             }

             time.Sleep(tt.delay) // Delay for the next request
          }
       })
    }
}

In the above code, the test cases in impl_test.go simulate different scenarios to ensure the API requests limit works as expected under varying conditions. I will explain the AfterLimit. The “AfterRefill” test case in impl_test.go is designed to test the behaviour of the rate-limiting system after a delay, which allows the token bucket to refill. This test ensures that the rate limiter correctly replenishes its tokens over time, allowing new requests after the specified refill period. The test first simulates a number of requests up to the rate limit. After these initial requests, the test introduces a delay (time.Sleep) that corresponds to the refill period of the token bucket. This delay is crucial as it simulates the passage of time during which the bucket is supposed to refill its tokens. During this delay, the token bucket’s Refill method should add new tokens to the bucket. The rate at which tokens are added depends on the RefillRate defined in the bucket. After the delay, the test makes another request.

Let’s execute the command “go test . -v” to run tests. The test screenshot is below. As I set the delay as 10*second in the AfterLimit, the whole test cases finish after 10 seconds later.

Conclusion

The Token Bucket Algorithm is a network traffic and rate-limiting mechanism that controls data flow by allocating tokens at a steady rate into a virtual bucket. Each token represents permission to transmit a certain amount of data. When a user makes a request or sends data, tokens are consumed. If the bucket runs out of tokens, the user must wait until new tokens are generated before making additional requests or sending more data. This algorithm allows for bursts of activity up to the bucket’s capacity, followed by a regulated flow, balancing fair resource allocation and preventing network congestion or resource overuse.

I will provide the last algorithm scenario examples in the following article.

Perhaps you’re also interested in the following articles.

Go Back to Concurrency Design Patterns, please click here.

To View Design Patterns in Golang, please click here.

To view Creational Design Patterns in Golang, please click here.

To View Structural Design Patterns in Golang, please click here.

To View Behavioural Design Patterns in Golang, please click here.

To View Microservices Design Patterns in Golang, please click here.