Quota Strategies
This document outlines the available quota strategies with detailed explanations, examples, and guidance on usage.
Fixed Window
The Fixed Window strategy is a rate-limiting approach in which requests are restricted within a predefined time window. Once the limit is reached within that window, subsequent requests are blocked until the window resets.
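To make the window-reset behavior concrete, here is a minimal Python sketch of a fixed-window counter. It illustrates the general technique only and is not the product's implementation; the class name `FixedWindowLimiter`, its parameters, and the injectable `clock` are assumptions made for this example.

```python
import time


class FixedWindowLimiter:
    """Illustrative fixed-window counter (hypothetical, not the product's code)."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock            # injectable so the sketch can be tested
        self.current_window = None    # index of the window being counted
        self.count = 0

    def allow(self):
        # Windows are consecutive, non-overlapping slices of the clock.
        window = int(self.clock() // self.window_seconds)
        if window != self.current_window:
            # A new window has started: reset the counter.
            self.current_window = window
            self.count = 0
        if self.count < self.max_requests:
            self.count += 1
            return True
        return False  # limit reached; blocked until the window resets
```

With `max_requests=2` and a 60-second window, the third call inside the same window is rejected, and calls are accepted again once the clock crosses into the next window.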
Concurrent
The Concurrent strategy sets a limit on the number of simultaneous requests rather than tracking requests within a time window. This approach helps manage concurrent traffic across various flows and endpoints, reducing server strain during high-traffic periods.
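The difference from a time-window counter can be sketched in a few lines of Python: a concurrency limit tracks requests currently in flight and frees a slot when one finishes, with no clock involved. The class name `ConcurrentLimiter` and its methods are hypothetical and chosen only for illustration.

```python
import threading


class ConcurrentLimiter:
    """Illustrative in-flight request cap (hypothetical, not the product's code)."""

    def __init__(self, max_concurrent):
        self.max_concurrent = max_concurrent
        self.active = 0                  # requests currently in flight
        self._lock = threading.Lock()    # protects `active` across threads

    def try_acquire(self):
        # Take a slot if one is free; otherwise reject immediately.
        with self._lock:
            if self.active < self.max_concurrent:
                self.active += 1
                return True
            return False

    def release(self):
        # Call when a request completes so its slot becomes available again.
        with self._lock:
            if self.active > 0:
                self.active -= 1
```

A caller would pair each successful `try_acquire()` with a `release()` once the request completes, typically in a `try`/`finally` block.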
Custom Quota
The Custom Quota strategy provides granular control over API traffic, enabling limits based on tokens and requests within a fixed time window.
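As a rough illustration of limiting on both tokens and requests in one fixed window, the Python sketch below keeps two counters that reset together. How the product actually measures tokens is not described here, so this sketch assumes the caller supplies each request's token cost; the name `TokenAndRequestWindow` and its parameters are hypothetical.

```python
import time


class TokenAndRequestWindow:
    """Illustrative window with request and token budgets (hypothetical)."""

    def __init__(self, max_requests, max_tokens, window_seconds,
                 clock=time.monotonic):
        self.max_requests = max_requests
        self.max_tokens = max_tokens
        self.window_seconds = window_seconds
        self.clock = clock
        self.current_window = None
        self.requests = 0
        self.tokens = 0

    def allow(self, tokens):
        # Reset both counters when a new window begins.
        window = int(self.clock() // self.window_seconds)
        if window != self.current_window:
            self.current_window = window
            self.requests = 0
            self.tokens = 0
        # Reject if either budget would be exceeded by this request.
        if self.requests + 1 > self.max_requests:
            return False
        if self.tokens + tokens > self.max_tokens:
            return False
        self.requests += 1
        self.tokens += tokens
        return True
```

In this sketch a request is rejected as soon as either budget would be exceeded, so a few large requests can exhaust the window long before the request count does.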
Choosing the Right Strategy
When selecting a quota strategy, keep in mind that as an API consumer you're typically bound by the quota configuration set by the API provider. For example, if the provider uses a Fixed Window rate-limiting strategy, your application should be designed to stay within those limits. For internal limits within your own systems, especially those nested under a parent quota, you have more flexibility to structure them around your specific traffic management needs.
Use Case | Recommended Strategy | Description |
---|---|---|
Steady Traffic Control | Fixed Window | Suitable for consistent rate-limiting needs, such as capping requests within a fixed time frame. |
Burst Handling | Fixed Window | Helps manage and contain traffic spikes within a defined window, preventing overload. |
High Concurrency | Concurrent | Limits the number of active requests, ideal for environments with high real-time demand. |
Token-Based API Usage | Custom Quota (Fixed Window Custom Counter) | Controls requests and tokens to ensure efficient use of token-based APIs. |
User-Based Quotas | Fixed Window | Groups requests by user-level headers (e.g., x-lunar-consumer-tag) for differentiated user or subscription-tier quotas. |
Service Load Balancing | Concurrent | Controls simultaneous connections to balance load across your resources dynamically. |
Example: Combining Strategies with Internal Limits
In more complex scenarios, such as managing quotas across different user levels or services, you can configure internal limits so that the Fixed Window and Concurrent strategies are both applied within the same configuration file:
quotas:
  - id: CombinedQuota                          # Unique identifier for the main quota
    filter:
      url: api.website.com/*                   # URL pattern to apply the quota
    strategy:
      fixed_window:                            # Main quota strategy using Fixed Window
        max: 5000                              # Maximum requests allowed in the main quota window
        interval: 1
        interval_unit: day
        group_by_header: x-lunar-consumer-tag  # Optional grouping
# Nested concurrent limit for premium users within the main quota
internal_limits:
  - id: PremiumConcurrentLimit
    parent_id: CombinedQuota
    filter:
      headers:
        - key: x-lunar-consumer-tag
          value: premium
    strategy:
      concurrent:
        max_request_count: 100                 # Max concurrent requests for premium users
In this example, api.website.com/* has a quota, CombinedQuota, of 5000 requests per day, grouped by the header x-lunar-consumer-tag and enforced with the Fixed Window strategy. In addition, an internal limit, PremiumConcurrentLimit, caps the number of concurrent requests carrying the header x-lunar-consumer-tag: premium. As a result, at most 100 requests for premium users can be in flight to the provider at the same time.
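The interaction between the two limits in this example can be simulated with a short Python sketch: a per-tag daily counter combined with an in-flight cap that applies only to the premium tag. This is an illustration under stated assumptions, not the product's behavior. In particular, the class `CombinedQuota`, its method names, and the choice that a request rejected by the concurrency cap does not count toward the daily quota are all assumptions. The sketch is also single-threaded for brevity.

```python
import time
from collections import defaultdict

SECONDS_PER_DAY = 86400


class CombinedQuota:
    """Illustrative parent daily quota plus a premium concurrency cap."""

    def __init__(self, daily_max=5000, premium_concurrent_max=100,
                 clock=time.time):
        self.daily_max = daily_max
        self.premium_concurrent_max = premium_concurrent_max
        self.clock = clock
        self.day = None
        self.counts = defaultdict(int)   # requests per consumer tag today
        self.premium_active = 0          # premium requests in flight

    def start_request(self, consumer_tag):
        # Fixed window: reset all per-tag counters when a new day begins.
        day = int(self.clock() // SECONDS_PER_DAY)
        if day != self.day:
            self.day = day
            self.counts.clear()
        if self.counts[consumer_tag] >= self.daily_max:
            return False
        # Internal limit: only the "premium" tag has a concurrency cap.
        if consumer_tag == "premium":
            if self.premium_active >= self.premium_concurrent_max:
                return False  # assumption: not counted against the day
            self.premium_active += 1
        self.counts[consumer_tag] += 1
        return True

    def finish_request(self, consumer_tag):
        # Free the in-flight slot when a premium request completes.
        if consumer_tag == "premium" and self.premium_active > 0:
            self.premium_active -= 1
```

A caller would invoke `start_request(tag)` before forwarding a request and `finish_request(tag)` when the response arrives, so that premium slots are released as requests complete.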