API Quotas Configuration Template
This template shows how to configure quotas in Lunar.dev. It covers the fixed window strategy, the concurrent strategy, and the newly added fixed_window_custom_counter strategy. For more specific cases, you can also define internal limits, which are child quotas linked to a parent quota.
Note: Lunar's YAML configuration files don't need to follow a naming convention such as quota.yaml. As long as the file is placed in the correct folder, Lunar.dev detects and applies it automatically.
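As a starting point, a minimal quota file might contain only the fields that the Key Components table marks as mandatory. This is a sketch under that assumption: the `MinimalQuota` id, the `api.example.com` URL, and the limit values are placeholders chosen for illustration, not values taken from Lunar.dev.

```yaml
quotas:
  - id: MinimalQuota # Hypothetical identifier
    filter:
      url: api.example.com/* # Hypothetical URL pattern
    strategy:
      fixed_window:
        static:
          max: 100 # Illustrative limit
          interval: 1
          interval_unit: minute
```

The full template that follows adds the optional pieces on top of this shape: header filters, grouping, dynamic headers, and internal limits.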
/etc/lunar-proxy/quotas/{fileName}.yaml
quotas:
  - id: MyQuota # Unique identifier for the quota
    filter: # Define filter conditions for this quota
      url: api.website.com/* # URL pattern to apply the quota
      headers: # Optional: Header-based filtering
        - key: x-lunar-consumer-tag # Header to filter on
          value: premium # Example value 1
        - key: x-lunar-consumer-tag # Header to filter on
          value: basic # Example value 2
    strategy: # Quota strategy definition
      fixed_window: # Fixed Window Strategy
        static:
          max: 1000 # Maximum requests allowed within the window
          interval: 24 # Duration of the time window
          interval_unit: hour # Unit of time for the interval (second/minute/hour/day/month)
        group_by_header: x-lunar-consumer-tag # Optional: Group by header value
        dynamic: # Dynamic Header-Based Configuration
          remaining_header: X-RateLimit-Limit
          reset_time_header: X-RateLimit-Reset
          retry_after_header: Retry-After
  - id:  # Unique identifier for the quota
    filter:
      url: api.openai.com/* # URL pattern to apply the quota
    strategy:
      fixed_window_custom_counter:
        max: 1000 # Maximum requests allowed within the window
        interval: 1
        interval_unit: second # Unit of time for the window (here, 1 second)
        counter_value_path: | # JSONPath defining where to read the count from
          $.request.headers["x-lunar-used-tokens"]
internal_limits: # Optional: Define nested child quotas within the main quota
  - id: MyChildQuota # Unique identifier for the child quota
    parent_id: MyQuota # Links the child quota to its parent
    filter:
      url: api.website.com/specific # URL pattern specific to the child quota
    strategy:
      fixed_window: # Strategy for managing child quota with a fixed window
        static:
          max: 500 # Maximum requests allowed within the child quota
          interval: 1 # Window duration for the child quota
          interval_unit: day # Unit of time for the interval (second/minute/hour/day/month)
  - id: PremiumQuota
    parent_id: MyQuota
    filter:
      headers:
        - key: x-lunar-consumer-tag
          value: premium
    strategy:
      allocation_percentage: 80 # Percentage allocation of the total requests
  - id: BasicQuota
    parent_id: MyQuota
    filter:
      headers:
        - key: x-lunar-consumer-tag
          value: basic
    strategy:
      fixed_window:
        static:
          max: 20000
          interval: 1
          interval_unit: day
        spillover: # Enables carryover of unused quota to the next window
          max: 100 # Maximum unused quota that can be carried over
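The template above does not include a quota that uses the concurrent strategy. A minimal sketch of one, assuming the field nesting implied by the dotted paths in the Key Components table (`strategy.concurrent.max_request_count` and `strategy.concurrent.remaining_header`), could look like the following. The `MyConcurrentQuota` id is hypothetical, and the values are the example values from that table.

```yaml
quotas:
  - id: MyConcurrentQuota # Hypothetical identifier
    filter:
      url: api.website.com/* # URL pattern to apply the quota
    strategy:
      concurrent: # Concurrent strategy (nesting assumed from the field table)
        max_request_count: 50 # Max concurrent requests allowed
        remaining_header: X-Concurrent-Remaining # Optional: header exposing remaining concurrent requests
```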
Key Components of the Quota Configuration
| Field | Description | Mandatory/Optional | Example | 
|---|---|---|---|
| quotas.id | A unique identifier for the quota. | Mandatory | MyQuota | 
| filter.url | The URL pattern that the quota applies to. | Mandatory | api.website.com/* | 
| filter.headers.key | Optional header used for filtering requests. | Optional | x-lunar-consumer-tag | 
| filter.headers.value | Value of the header to match for filtering. | Optional | premium | 
| strategy.fixed_window.static.max | Maximum number of requests allowed within the window. | Mandatory (if used) | 1000 | 
| strategy.fixed_window.static.interval | Time window duration for the quota. | Mandatory (if used) | 24 | 
| strategy.fixed_window.static.interval_unit | Unit of time for the window (second, minute, hour, day, month). | Mandatory (if used) | hour | 
| strategy.fixed_window.group_by_header | Group quota by the value of a specific header. | Optional | x-lunar-consumer-tag | 
| strategy.fixed_window.dynamic.remaining_header | Header exposing remaining quota for dynamic configurations. | Optional | X-RateLimit-Limit | 
| strategy.fixed_window.dynamic.reset_time_header | Header indicating reset time for dynamic quotas. | Optional | X-RateLimit-Reset | 
| strategy.fixed_window.dynamic.retry_after_header | Header for retry-after details in dynamic quotas. | Optional | Retry-After | 
| strategy.concurrent.max_request_count | Max concurrent requests allowed. | Mandatory (if used) | 50 | 
| strategy.concurrent.remaining_header | Header to expose remaining concurrent requests. | Optional | X-Concurrent-Remaining | 
| strategy.fixed_window_custom_counter.max | Max requests allowed in the window. | Mandatory (if used) | 100 | 
| strategy.fixed_window_custom_counter.counter_value_path | JSONPath expression pointing to where the count is read from. | Mandatory (if used) | $.request.headers["x-lunar-used-tokens"] | 
| strategy.fixed_window_custom_counter.interval | Time window duration for the custom counter quota. | Mandatory (if used) | 1 | 
| strategy.fixed_window_custom_counter.interval_unit | Unit of time for the custom counter window. | Mandatory (if used) | minute | 
| internal_limits.id | Unique identifier for child quotas. | Optional | MyChildQuota | 
| internal_limits.parent_id | Links the child quota to its parent quota. | Optional | MyQuota | 
| internal_limits.filter.url | URL pattern that the child quota applies to. | Mandatory (if used) |