
Rate Limiting Strategies With Redis: Fixed Window, Sliding Window, and Token Bucket

Redis enables Fixed Window, Sliding Window, and Token Bucket rate limiting, each balancing implementation simplicity, enforcement precision, and efficiency.

By Vineet Bhatkoti · Mar. 23, 26 · Analysis

The Rate Limiting Problem

API rate limiting protects services from overload, prevents abuse, and ensures fair resource distribution across clients. Without rate limiting, a single high-volume client can degrade performance for all users, whether through malicious attacks or unintentional bugs causing request loops.

Traditional approaches, like in-memory counters, fail in distributed systems. When multiple API servers handle requests, each maintains separate counts, making it impossible to enforce consistent limits. Redis solves this by providing a centralized, fast key-value store that all servers can query atomically.
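To make the failure concrete, here is a minimal sketch with two in-process limiters standing in for two API servers. The class names (`LocalCounterLimiter`, `SplitCountDemo`) and the round-robin routing are assumptions made for illustration; nothing here talks to Redis.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-process limiter: each API server would hold its own instance.
final class LocalCounterLimiter {
    private final Map<String, Integer> counts = new HashMap<>();
    private final int limit;

    LocalCounterLimiter(int limit) {
        this.limit = limit;
    }

    // Counts requests per user in this process only; other servers never see this map.
    boolean isAllowed(String userId) {
        int current = counts.merge(userId, 1, Integer::sum);
        return current <= limit;
    }
}

final class SplitCountDemo {
    // Sends `requests` calls for one user, alternating between two servers,
    // and returns how many were allowed in total.
    static int allowedAcrossTwoServers(int limit, int requests) {
        LocalCounterLimiter serverA = new LocalCounterLimiter(limit);
        LocalCounterLimiter serverB = new LocalCounterLimiter(limit);
        int allowed = 0;
        for (int i = 0; i < requests; i++) {
            LocalCounterLimiter target = (i % 2 == 0) ? serverA : serverB;
            if (target.isAllowed("user_123")) {
                allowed++;
            }
        }
        return allowed;
    }
}
```

With a limit of 100 and 300 requests split evenly, each server admits its own 100, so the user gets 200 requests through. A shared counter in Redis removes that multiplication by server count.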

Three Redis-based patterns address this distributed rate limiting challenge, each with clear use cases: Fixed Window for internal APIs, Sliding Window for strict enforcement, and Token Bucket for production systems.

1. Fixed Window

The Fixed Window algorithm divides time into discrete intervals (windows) and counts requests within each window. When a window expires, the counter resets.

Fixed window rate limiting


Redis Structure: String With INCR

Fixed Window uses Redis Strings with the INCR command for atomic counter increments. The key includes a timestamp component that changes every window interval, ensuring counters reset automatically as windows roll over. The key pattern typically looks like: rate_limit:{user_id}:{window_id}.

Implementation approach:

Java
 
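import redis.clients.jedis.Jedis;

public class FixedWindowRateLimiter {

    private final Jedis jedis;

    public FixedWindowRateLimiter(Jedis jedis) {
        this.jedis = jedis;
    }

    public boolean isAllowed(String userId, int limit, int windowSeconds) {
        // All timestamps inside the same window share one window ID, and so one key
        long windowId = System.currentTimeMillis() / 1000 / windowSeconds;
        String key = "rate_limit:" + userId + ":" + windowId;

        // INCR creates the key at 1 if it does not exist and returns the new count
        long current = jedis.incr(key);

        // First request in the window: set a TTL so the key cleans itself up.
        // INCR and EXPIRE are separate commands; if the client dies between
        // them, the key is left without a TTL.
        if (current == 1) {
            jedis.expire(key, windowSeconds);
        }

        return current <= limit;
    }
}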

// Usage: Allow 100 requests per 60-second window
FixedWindowRateLimiter limiter = new FixedWindowRateLimiter(new Jedis("localhost"));
boolean allowed = limiter.isAllowed("user_123", 100, 60);


  • Use INCR to atomically increment the counter for the current window
  • Set expiration (TTL) on first request in a new window to auto-cleanup old keys
  • Window ID calculated by dividing the current timestamp by the window duration
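The window ID arithmetic from the last bullet can be isolated as a pure function, which makes it easy to check where windows begin and end. This is a small sketch; the `WindowKeys` class and its method names are illustrative, and the timestamp is passed in rather than read from the clock.

```java
final class WindowKeys {
    // Integer division maps every timestamp inside the same window to the same ID.
    static long windowId(long epochMillis, int windowSeconds) {
        return epochMillis / 1000 / windowSeconds;
    }

    // Builds the key in the rate_limit:{user_id}:{window_id} pattern.
    static String key(String userId, long epochMillis, int windowSeconds) {
        return "rate_limit:" + userId + ":" + windowId(epochMillis, windowSeconds);
    }
}
```

For a 60-second window, timestamps 60,000 ms and 119,999 ms both map to window 1, while 120,000 ms maps to window 2, so the key changes exactly on the minute boundary.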

Each window gets its own Redis key with a simple integer counter. INCR is one of Redis's fastest operations, making this approach extremely performant. The expiration ensures old window keys are automatically removed without manual cleanup.
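One caveat with setting the TTL in a separate step: INCR and EXPIRE are two commands, so a client failure between them can leave a counter key with no expiration. A common way to close that gap is to run both inside a Lua script, which Redis executes as a single unit. The sketch below only builds the script text and its arguments; the `FixedWindowScript` class and its helpers are hypothetical names, and the Jedis call that would run it is described afterwards as an assumption to verify against your client version.

```java
import java.util.List;

final class FixedWindowScript {
    // Lua source that Redis runs as one atomic unit: increment the counter,
    // and set the TTL only when this call created the key.
    static final String INCR_WITH_TTL = String.join("\n",
        "local current = redis.call('INCR', KEYS[1])",
        "if current == 1 then",
        "  redis.call('EXPIRE', KEYS[1], ARGV[1])",
        "end",
        "return current");

    // KEYS[1]: the per-window counter key, in the same pattern as above.
    static List<String> keys(String userId, long windowId) {
        return List.of("rate_limit:" + userId + ":" + windowId);
    }

    // ARGV[1]: the TTL in seconds.
    static List<String> args(int windowSeconds) {
        return List.of(String.valueOf(windowSeconds));
    }
}
```

With Jedis, this would be invoked along the lines of `jedis.eval(FixedWindowScript.INCR_WITH_TTL, keys, args)`, with the returned count compared against the limit as before.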

The burst problem: Fixed windows can admit up to twice the intended rate around a window boundary. If a user sends 100 requests at 12:00:59 and another 100 at 12:01:01, all 200 are accepted within 2 seconds, even though the limit is 100 per minute.

This behavior stems from the window reset. At 12:01:00, the counter drops to zero regardless of recent activity. For APIs where strict rate enforcement matters (payment processing, resource-intensive operations), this burst tolerance is problematic.
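The boundary scenario can be reproduced without Redis by swapping the key store for a map and passing the clock in explicitly. This is an illustrative sketch only; `InMemoryFixedWindow` and `BoundaryBurstDemo` are hypothetical names, and times are seconds counted from 12:00:00.

```java
import java.util.HashMap;
import java.util.Map;

final class InMemoryFixedWindow {
    private final Map<String, Integer> counters = new HashMap<>();
    private final int limit;
    private final int windowSeconds;

    InMemoryFixedWindow(int limit, int windowSeconds) {
        this.limit = limit;
        this.windowSeconds = windowSeconds;
    }

    // Same logic as the Redis version: one counter per user per window ID.
    boolean isAllowed(String userId, long epochSeconds) {
        long windowId = epochSeconds / windowSeconds;
        String key = userId + ":" + windowId;
        return counters.merge(key, 1, Integer::sum) <= limit;
    }
}

final class BoundaryBurstDemo {
    // Sends 100 requests at second 59 and 100 more at second 61,
    // then returns how many were allowed in total.
    static int allowedAroundBoundary() {
        InMemoryFixedWindow limiter = new InMemoryFixedWindow(100, 60);
        int allowed = 0;
        for (int i = 0; i < 100; i++) {
            if (limiter.isAllowed("user_123", 59)) allowed++; // last second of window 0
        }
        for (int i = 0; i < 100; i++) {
            if (limiter.isAllowed("user_123", 61)) allowed++; // early in window 1
        }
        return allowed;
    }
}
```

Both batches land in different windows, so every one of the 200 requests is accepted even though they arrive 2 seconds apart.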

When to use Fixed Window:

  • Simple rate-limiting requirements where burst tolerance is acceptable
  • Systems with low traffic where burst scenarios are unlikely
  • Internal APIs where clients are trusted
  • Development and testing environments

Advantages:

  • Minimal Redis operations (single INCR per request)
  • Low memory footprint (one integer per user per window)
  • Simple to understand and debug
  • Excellent performance characteristics

Limitations:

  • Burst problem at window boundaries
  • Imprecise enforcement of rate limits
  • Not suitable for strict rate requirements

2. Sliding Window

The Sliding Window algorithm maintains precise request timestamps and counts only requests within a moving time window. This eliminates the burst problem by considering exact request timing.

Sliding window rate limiting


Redis Structure: Sorted Set (ZSET)

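Sliding Window leverages Redis Sorted Sets, where each request is stored as a member whose score is its timestamp. The simplest form uses the timestamp as the member too, but sorted set members are unique, so two requests sharing the same millisecond need distinct member values or they collapse into one entry. This data structure enables efficient range queries and automatic ordering by timestamp.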

Implementation approach:

Java
 
import redis.clients.jedis.Jedis;
import java.util.UUID;

public class SlidingWindowRateLimiter {

    private final Jedis jedis;

    public SlidingWindowRateLimiter(Jedis jedis) {
        this.jedis = jedis;
    }

    public boolean isAllowed(String userId, int limit, int windowSeconds) {
        String key = "rate_limit:sliding:" + userId;
        long now = System.currentTimeMillis();
        long windowStart = now - (windowSeconds * 1000L);

        // Remove entries outside the window
        jedis.zremrangeByScore(key, 0, windowStart);

        // Count requests in current window
        long current = jedis.zcard(key);

        if (current < limit) {
            // Score is the timestamp. The member gets a random suffix because
            // sorted set members are unique, and two requests in the same
            // millisecond would otherwise collapse into one entry.
            String member = now + ":" + UUID.randomUUID();
            jedis.zadd(key, now, member);
            jedis.expire(key, windowSeconds);
            return true;
        }

        // Note: the commands above run separately, so concurrent callers
        // can both pass the count check before either adds its entry.
        return false;
    }
}

// Usage: Allow 100 requests per 60-second sliding window
SlidingWindowRateLimiter limiter = new SlidingWindowRateLimiter(new Jedis("localhost"));
boolean allowed = limiter.isAllowed("user_123", 100, 60);


  • Use ZADD to store each request with a timestamp as a score
  • Use ZREMRANGEBYSCORE to remove entries outside the current window
  • Use ZCARD to count remaining requests in the window
  • Key pattern: rate_limit:sliding:{user_id}

Sorted Sets maintain an ordered collection of unique members with associated scores. For rate limiting, each request gets added with its timestamp. The ZREMRANGEBYSCORE command efficiently removes all entries with scores (timestamps) older than the window boundary. ZCARD then counts how many requests remain in the active window.
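The trim, count, and add sequence can be sketched in memory to show how it behaves at the boundary that trips up Fixed Window. This is an illustration under stated assumptions: `InMemorySlidingWindow` is a hypothetical class for a single user, a deque stands in for the sorted set, and timestamps are assumed to arrive in non-decreasing order.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class InMemorySlidingWindow {
    private final Deque<Long> timestamps = new ArrayDeque<>();
    private final int limit;
    private final long windowMillis;

    InMemorySlidingWindow(int limit, int windowSeconds) {
        this.limit = limit;
        this.windowMillis = windowSeconds * 1000L;
    }

    boolean isAllowed(long nowMillis) {
        long windowStart = nowMillis - windowMillis;
        // Counterpart of ZREMRANGEBYSCORE key 0 windowStart: drop expired entries.
        while (!timestamps.isEmpty() && timestamps.peekFirst() <= windowStart) {
            timestamps.pollFirst();
        }
        // Counterpart of ZCARD followed by ZADD: record only allowed requests.
        if (timestamps.size() < limit) {
            timestamps.addLast(nowMillis);
            return true;
        }
        return false;
    }

    // Number of timestamps currently held, which is what drives memory use.
    int trackedEntries() {
        return timestamps.size();
    }
}
```

With a limit of 100 per 60 seconds, 100 requests at 59,000 ms fill the window, a request at 61,000 ms is denied because those entries are still inside the last 60 seconds, and a request at 119,001 ms is allowed once they have aged out.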

The memory problem: Each recorded request creates a sorted set entry that persists for the full window duration. For a user making 1,000 allowed requests per minute, the key holds 1,000 entries. At scale with thousands of users, memory consumption becomes significant.

When to use Sliding Window:

  • Strict rate enforcement requirements (payments, critical operations)
  • Low to moderate request volumes per user
  • Systems with sufficient Redis memory
  • APIs where burst prevention is critical

Advantages:

  • Precise rate limiting with no burst allowance
  • Accurate at any point in time
  • Provides granular visibility into request patterns
  • Sorted Sets support efficient time-based queries

Limitations:

  • High memory usage (stores every request timestamp)
  • Multiple Redis operations per request (ZREMRANGEBYSCORE, ZCARD, ZADD)
  • Not cost-effective for high-volume APIs
  • Requires monitoring and capacity planning
  • Memory grows linearly with request rate

3. Token Bucket

The Token Bucket algorithm maintains a bucket that fills with tokens at a constant rate. Each request consumes a token. When the bucket is empty, requests are denied. This pattern allows controlled bursts while maintaining average rate limits.

Token bucket rate limiting


Redis Structure: Hash (HSET/HGET)

Token Bucket uses Redis Hashes to store two critical values: the current token count and the last refill timestamp. Hashes provide efficient storage for multiple related fields under a single key.

Implementation approach:

Java
 
import redis.clients.jedis.Jedis;
import java.util.Map;

public class TokenBucketRateLimiter {

    private final Jedis jedis;

    public TokenBucketRateLimiter(Jedis jedis) {
        this.jedis = jedis;
    }

    public boolean isAllowed(String userId, double rate, double capacity) {
        String key = "rate_limit:bucket:" + userId;
        double now = System.currentTimeMillis() / 1000.0;

        Map<String, String> bucket = jedis.hgetAll(key);

        double tokens;

        if (bucket.isEmpty()) {
            // New bucket starts full; the token for this request is consumed below
            tokens = capacity;
        } else {
            tokens = Double.parseDouble(bucket.get("tokens"));
            double lastRefill = Double.parseDouble(bucket.get("last_refill"));

            // Calculate tokens to add based on elapsed time
            double elapsed = now - lastRefill;
            double tokensToAdd = elapsed * rate;
            tokens = Math.min(capacity, tokens + tokensToAdd);
        }

        if (tokens >= 1) {
            tokens -= 1;
            // One HSET call writes both fields together
            jedis.hset(key, Map.of(
                "tokens", String.valueOf(tokens),
                "last_refill", String.valueOf(now)));
            // Keep the key for twice the full-refill time, and at least 1 second
            long ttlSeconds = Math.max(1L, (long) Math.ceil(capacity / rate * 2));
            jedis.expire(key, ttlSeconds);
            return true;
        }

        return false;
    }
}

// Usage: Refill at 10 tokens/second, bucket capacity of 100
TokenBucketRateLimiter limiter = new TokenBucketRateLimiter(new Jedis("localhost"));
boolean allowed = limiter.isAllowed("user_123", 10, 100);


  • Use HGETALL to retrieve current state (tokens, last_refill timestamp)
  • Calculate tokens to add based on elapsed time × refill rate
  • Use HSET to write both tokens and the last_refill timestamp in a single command
  • Key pattern: rate_limit:bucket:{user_id}

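A Redis Hash stores the bucket state as two fields. On each request, the algorithm retrieves the current state, calculates elapsed time since last refill, adds tokens proportionally (elapsed × rate), and caps at bucket capacity. If tokens ≥ 1, consume one token and allow the request. HSET updates both fields in a single operation. The read (HGETALL) and the write (HSET) are still separate commands, so two servers handling the same user concurrently can both read the same token count; running the sequence inside a Lua script or a MULTI/WATCH transaction closes that gap.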
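The refill step described above reduces to one line of arithmetic. As a small sketch, with `TokenRefill` being a hypothetical helper name:

```java
final class TokenRefill {
    // Tokens available after `elapsedSeconds` at `rate` tokens per second,
    // capped at `capacity` so an idle bucket never overfills.
    static double refill(double tokens, double elapsedSeconds, double rate, double capacity) {
        return Math.min(capacity, tokens + elapsedSeconds * rate);
    }
}
```

At 10 tokens per second with a capacity of 100, an empty bucket holds 25 tokens after 2.5 seconds, and a bucket at 90 tokens is capped at 100 after 5 seconds rather than reaching 140.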

Burst control: Unlike Fixed Window, Token Bucket bounds bursts by the bucket capacity. A user can burst up to the capacity limit, then must wait for a token refill. This provides flexibility for legitimate burst traffic while preventing sustained overuse.
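To see the burst bound in numbers, the sketch below keeps the bucket state in plain fields and takes the clock as a parameter. `InMemoryTokenBucket` and `BurstDemo` are hypothetical names, and the bucket is assumed to start full.

```java
final class InMemoryTokenBucket {
    private final double rate;
    private final double capacity;
    private double tokens;
    private double lastRefill;

    InMemoryTokenBucket(double rate, double capacity, double startSeconds) {
        this.rate = rate;
        this.capacity = capacity;
        this.tokens = capacity;      // assumed to start full
        this.lastRefill = startSeconds;
    }

    boolean isAllowed(double nowSeconds) {
        // Refill for the elapsed time, capped at capacity.
        double elapsed = nowSeconds - lastRefill;
        tokens = Math.min(capacity, tokens + elapsed * rate);
        lastRefill = nowSeconds;
        if (tokens >= 1) {
            tokens -= 1;
            return true;
        }
        return false;
    }
}

final class BurstDemo {
    // Fires `attempts` requests at the same instant and counts the allowed ones.
    static int allowedAt(InMemoryTokenBucket bucket, double nowSeconds, int attempts) {
        int allowed = 0;
        for (int i = 0; i < attempts; i++) {
            if (bucket.isAllowed(nowSeconds)) allowed++;
        }
        return allowed;
    }
}
```

With a rate of 10 tokens per second and a capacity of 100, 150 simultaneous requests at second 0 yield 100 allowed, and another 150 one second later yield only the 10 tokens refilled in between.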

Why Hashes work well:

  • Store multiple related values under one key (tokens + timestamp)
  • Atomic updates via HSET for multiple fields
  • Memory efficient compared to storing separate keys
  • HGETALL retrieves all fields in a single round trip
  • Field-level operations without affecting other fields

When to use Token Bucket:

  • Production APIs requiring both rate limiting and burst tolerance
  • Systems with variable request patterns
  • APIs where occasional bursts are legitimate (batch operations, retry logic)
  • High-traffic services requiring efficient rate limiting

Advantages:

  • Balanced approach: allows controlled bursts, prevents sustained overuse
  • Efficient memory usage (two hash fields per user, regardless of request rate)
  • Minimal Redis operations per request (HGETALL + HSET)
  • Hash structure is intuitive and debuggable

Limitations:

  • More complex to implement than Fixed Window
  • Requires tuning both rate and capacity parameters
  • Slightly less precise than Sliding Window for strict rate enforcement

Conclusion

Redis-based rate limiting provides scalable, distributed enforcement across APIs. Fixed Window offers simplicity at the cost of precision. Sliding Window delivers accuracy at the cost of memory. Token Bucket balances both, making it the preferred choice for production systems.

Use Fixed Window for internal APIs or development environments. Use Token Bucket when deploying to production or handling significant traffic. Reserve Sliding Window for scenarios where strict rate enforcement justifies the memory cost.

The pattern you choose matters less than consistent implementation and monitoring. Any of these approaches, properly configured and observed, will protect APIs from overload and abuse.
