
Quota Strategies

This document outlines the available quota strategies with detailed explanations, examples, and guidance on usage.

Fixed Window Strategy

The Fixed Window strategy is a rate-limiting approach in which requests are restricted within a predefined time window. Once the limit is reached within the window, subsequent requests are blocked until the window resets.

The strategy defines a maximum number of requests (max) allowed within a specific interval. The interval is configured with a duration (interval) and a unit (interval_unit), for example 24 hours, after which the counter resets. Requests can optionally be grouped by a header, such as x-lunar-consumer-tag, so that limits apply per user or subscription level.


/etc/lunar-proxy/quotas/{fileName}.yaml
quotas:
  - id: FixedWindowQuota # Unique identifier for the quota
    filter: # Define filter conditions for this quota
      url: api.website.com/* # URL pattern to apply the quota
    strategy:
      fixed_window:
        max: 1000 # Maximum requests allowed within the window
        interval: 24 # Window duration
        interval_unit: hour # Unit of time for the interval (second/minute/hour/day/month)
        group_by_header: x-lunar-consumer-tag # Optional: Group by header
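
To make the counting behavior concrete, here is a minimal conceptual sketch of a fixed-window counter in Python. It is not Lunar's implementation. The class name, the `allow` method, and the choice to start each window at the first request for a group (rather than aligning windows to clock boundaries) are all assumptions made for illustration.

```python
import time


class FixedWindowLimiter:
    """Conceptual fixed-window counter (hypothetical, not Lunar's code)."""

    def __init__(self, max_requests, interval_seconds, clock=time.monotonic):
        self.max_requests = max_requests          # plays the role of `max`
        self.interval_seconds = interval_seconds  # `interval` * `interval_unit`
        self._clock = clock
        self._windows = {}  # group key -> [window_start, request_count]

    def allow(self, group_key="default"):
        """Return True if a request for this group fits in the current window."""
        now = self._clock()
        window = self._windows.get(group_key)
        if window is None or now - window[0] >= self.interval_seconds:
            # First request for this group, or the previous window expired:
            # start a fresh window with a zeroed counter (assumed behavior).
            window = [now, 0]
            self._windows[group_key] = window
        if window[1] >= self.max_requests:
            return False  # limit reached; blocked until the window resets
        window[1] += 1
        return True


# Usage: 1000 requests per 24 hours, counted separately per header value.
limiter = FixedWindowLimiter(max_requests=1000, interval_seconds=24 * 3600)
allowed = limiter.allow(group_key="premium")
```

Passing the value of the grouping header as `group_key` gives each consumer tag its own counter, which is the effect the optional group_by_header setting describes.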

Concurrent Strategy

The Concurrent strategy sets a limit on the number of simultaneous requests rather than tracking requests within a time window. This approach helps manage concurrent traffic across various flows and endpoints, reducing server strain during high-traffic periods.

The strategy defines the maximum number of concurrent requests (max_request_count). When the limit is reached, additional requests are blocked until active requests finish and free up capacity.

/etc/lunar-proxy/quotas/{fileName}.yaml
quotas:
  - id: ConcurrentQuota # Unique identifier for the quota
    filter: # Define filter conditions for this quota
      url: api.website.com/* # URL pattern to apply the quota
    strategy:
      concurrent:
        max_request_count: 50 # Maximum concurrent requests allowed
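
The difference from a time-window counter can be sketched as follows: a concurrent limit tracks requests that are in flight, and capacity returns as soon as one of them finishes. This Python sketch is conceptual only and is not Lunar's implementation. The class and method names are hypothetical, and rejecting immediately (instead of queueing) is an assumption.

```python
import threading


class ConcurrentLimiter:
    """Conceptual in-flight request limiter (hypothetical, not Lunar's code)."""

    def __init__(self, max_request_count):
        self.max_request_count = max_request_count
        self._active = 0               # requests currently in flight
        self._lock = threading.Lock()  # guards _active across threads

    def try_acquire(self):
        """Reserve a slot; return False if the limit is already reached."""
        with self._lock:
            if self._active >= self.max_request_count:
                return False
            self._active += 1
            return True

    def release(self):
        """Free a slot when a request completes."""
        with self._lock:
            if self._active > 0:
                self._active -= 1


# Usage: release in `finally` so a failed request still frees its slot.
limiter = ConcurrentLimiter(max_request_count=50)
if limiter.try_acquire():
    try:
        pass  # forward the request upstream here
    finally:
        limiter.release()
```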

Choosing the Right Strategy

When selecting a quota strategy, keep in mind that as an API consumer you are typically bound by the quota configuration set by the API provider. For example, if the provider enforces a Fixed Window rate limit, your application should be designed to stay within that limit. For internal limits within your own systems, especially those nested under a parent quota, you have more flexibility to structure them around your specific traffic management needs.

| Use Case | Recommended Strategy | Description |
| --- | --- | --- |
| Steady Traffic Control | Fixed Window | Suitable for consistent rate-limiting needs, such as capping requests within a fixed time frame. |
| Burst Handling | Fixed Window | Helps manage and contain traffic spikes within a defined window, preventing overload. |
| High Concurrency | Concurrent | Limits the number of active requests, ideal for environments with high real-time demand. |
| User-Based Quotas | Fixed Window | Groups requests by user-level headers (e.g., x-lunar-consumer-tag) for differentiated user or subscription-tier quotas. |
| Service Load Balancing | Concurrent | Controls simultaneous connections to balance load across your resources dynamically. |

Advanced Example: Combining Strategies with Internal Limits

In more complex scenarios, such as managing quotas across different user levels or services, you can configure internal limits so that Fixed Window and Concurrent strategies apply together within the same configuration file.

/etc/lunar-proxy/quotas/{fileName}.yaml
quotas:
  - id: CombinedQuota # Unique identifier for the main quota
    filter:
      url: api.website.com/* # URL pattern to apply the quota

    # Main quota strategy using Fixed Window
    strategy:
      fixed_window:
        max: 5000 # Maximum requests allowed in the main quota window
        interval: 1
        interval_unit: day
        group_by_header: x-lunar-consumer-tag # Optional grouping

# Nested concurrent limit for premium users within the main quota
internal_limits:
  - id: PremiumConcurrentLimit
    parent_id: CombinedQuota
    filter:
      headers:
        - key: x-lunar-consumer-tag
          value: premium
    strategy:
      concurrent:
        max_request_count: 100 # Max concurrent requests for premium users
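
One way to read this configuration is that a request must satisfy the parent Fixed Window quota and, when it carries the premium tag, the nested Concurrent limit as well. The Python sketch below illustrates that reading only. It is not Lunar's implementation, and several details are assumptions: the class and method names, the order of the checks, grouping untagged requests under an empty key, and not counting a request against the window when the concurrent limit rejects it.

```python
import threading
import time

TAG_HEADER = "x-lunar-consumer-tag"


class CombinedQuota:
    """Conceptual parent window + nested concurrent limit (hypothetical)."""

    def __init__(self, max_per_window=5000, window_seconds=24 * 3600,
                 max_premium_concurrent=100, clock=time.monotonic):
        self.max_per_window = max_per_window
        self.window_seconds = window_seconds
        self.max_premium_concurrent = max_premium_concurrent
        self._clock = clock
        self._lock = threading.Lock()
        self._windows = {}        # tag -> [window_start, request_count]
        self._premium_active = 0  # premium requests currently in flight

    def admit(self, headers):
        """Return True if the request passes every limit that applies to it."""
        tag = headers.get(TAG_HEADER, "")  # untagged requests share one group
        now = self._clock()
        with self._lock:
            window = self._windows.get(tag)
            if window is None or now - window[0] >= self.window_seconds:
                window = [now, 0]
                self._windows[tag] = window
            # Parent quota: fixed window, counted per consumer tag.
            if window[1] >= self.max_per_window:
                return False
            # Nested limit: only requests tagged "premium" are checked.
            if tag == "premium":
                if self._premium_active >= self.max_premium_concurrent:
                    return False
                self._premium_active += 1
            window[1] += 1
            return True

    def finish(self, headers):
        """Call when an admitted request completes to free its premium slot."""
        if headers.get(TAG_HEADER) == "premium":
            with self._lock:
                if self._premium_active > 0:
                    self._premium_active -= 1


# Usage: admit, handle, then always call finish for admitted requests.
quota = CombinedQuota()
request_headers = {TAG_HEADER: "premium"}
if quota.admit(request_headers):
    try:
        pass  # forward the request upstream here
    finally:
        quota.finish(request_headers)
```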