Flow Example
Coming soon!
Flow Configuration
/etc/lunar-proxy/flows/flow.yaml
name: CountRequestTokens
filter:
  url: api.openai.com/v1/chat/completions
  body:
    - key: model
      regex_match: "gpt-4-*"
processors:
  CountLLMTokens:
    processor: CountLLMTokens
    parameters:
      - key: store_count_header
        value: "x-lunar-estimated-tokens"
      - key: model
        value: "gpt-4-*"
flow:
  request:
    - from:
        stream:
          name: globalStream
          at: start
      to:
        processor:
          name: CountLLMTokens
    - from:
        processor:
          name: CountLLMTokens
      to:
        stream:
          name: globalStream
          at: end
  response:
    - from:
        stream:
          name: globalStream
          at: start
      to:
        stream:
          name: globalStream
          at: end
Flow Parameters (flow.yaml)
Parameter | Example Value | Description |
---|---|---|
name | ClientSideLimitingFlow | The name of the flow, used to identify the flow in configurations. |
filter.url | api.website.com/* | Specifies the URL pattern for the API endpoints to which the flow applies. |
processors.RateLimiter.processor | RateLimiter | Defines the processor responsible for applying the rate limiting logic. |
processors.RateLimiter.parameters.key | quota_id | Specifies the key for the quota being used. |
processors.RateLimiter.parameters.value | MyQuota | The ID of the quota from the resource configuration. |
processors.GenerateResponseLimitExceeded.processor | GenerateResponse | Defines the processor that generates the response when a request exceeds the quota. |
processors.GenerateResponseLimitExceeded.parameters.key | status, body, Content-Type | Keys that define the response status code, message body, and content type when the quota is exceeded. |
/etc/lunar-proxy/quota/quota.yaml
quotas:
  - id: MyQuota
    filter:
      url: api.website.com/*
    strategy:
      fixed_window:
        static:
          max: 100
          interval: 1
          interval_unit: minute
Quota Parameters
Parameter | Example Value | Description |
---|---|---|
quota_id | MyQuota | The identifier for the specific quota being applied to the flow. |
filter.url | api.website.com/* | Filters API requests by URL for applying the quota. |
strategy.fixed_window.max | 100 | The maximum number of allowed requests within a defined time window. |
strategy.fixed_window.interval | 1 | The size of the time window in which requests are counted. |
strategy.fixed_window.interval_unit | minute | Defines the unit of time for the quota window (e.g., second, minute, hour). |
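To make the fixed-window parameters concrete, here is a minimal Python sketch of a fixed-window counter using the example values above (`max: 100`, `interval: 1`, `interval_unit: minute`). It is an illustration of the general technique, not Lunar's implementation. The class name, the injectable `clock` argument, and the choice to align windows to multiples of the interval are all assumptions made for this example.

```python
import time


class FixedWindowCounter:
    """Illustrative fixed-window limiter: at most max_requests per interval."""

    def __init__(self, max_requests=100, interval_seconds=60, clock=time.monotonic):
        self.max_requests = max_requests
        self.interval_seconds = interval_seconds
        self.clock = clock            # injectable so the behaviour can be tested
        self.current_window = None    # index of the window currently being counted
        self.count = 0

    def allow(self):
        """Return True if the request fits in the current window, else False."""
        # Assumption: windows are aligned to multiples of the interval.
        window = int(self.clock() // self.interval_seconds)
        if window != self.current_window:
            # A new window has started, so the counter resets.
            self.current_window = window
            self.count = 0
        if self.count >= self.max_requests:
            return False
        self.count += 1
        return True


if __name__ == "__main__":
    limiter = FixedWindowCounter(max_requests=100, interval_seconds=60)
    allowed = sum(limiter.allow() for _ in range(150))
    print(f"{allowed} of 150 requests allowed")
```

Because the count resets at each window boundary, a client can send up to `max` requests at the end of one window and another `max` at the start of the next. That burst is a known property of fixed-window strategies in general.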