
CountLLMTokens Processor

Overview

The CountLLMTokens processor estimates the number of AI model tokens in an incoming request. This is particularly useful for tracking usage and cost of AI interactions in flows leveraging large language models (LLMs) such as OpenAI's GPT, Anthropic's Claude, or Google's Gemini.

This processor analyzes the request body, counts the tokens based on the configured model type and encoding, and stores the result in a custom request header for downstream processors or logging.
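Conceptually, the processor's behavior can be sketched as follows. This is an illustrative sketch only, not the actual Lunar implementation: the function names are hypothetical, and a rough ~4-characters-per-token heuristic stands in for a real model-specific tokenizer.

```python
# Illustrative sketch of the processor's behavior (hypothetical names, not the
# actual Lunar implementation). A real deployment tokenizes with the model's
# own encoding; here a rough ~4-characters-per-token heuristic stands in.

def estimate_tokens(body: str) -> int:
    """Crude estimate: roughly 4 characters per English token."""
    return max(1, len(body) // 4)

def count_llm_tokens(request: dict,
                     store_count_header: str = "x-lunar-estimated-tokens") -> dict:
    """Enrich the request with an estimated token count, then forward it."""
    estimate = estimate_tokens(request.get("body", ""))
    request.setdefault("headers", {})[store_count_header] = str(estimate)
    return request

req = count_llm_tokens({"body": "Summarize the article in three bullet points."})
print(req["headers"]["x-lunar-estimated-tokens"])
```

Downstream processors then read the header like any other request header, so no special wiring is needed beyond configuring `store_count_header`.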



Input and Output

  • Input Stream: Request – Accepts the incoming request.
  • Output Stream: Request – Forwards the request downstream, optionally enriched with the token estimate.

Parameters

store_count_header

Type: string
Required: False
Default: x-lunar-estimated-tokens
Sets the custom header in which to store the calculated token count.

Example:

- key: store_count_header
  value: "x-custom-token-count"

model_type

Type: string
Required: False
Specifies the type of AI model. Supported values include ChatGPT, Claude, and Gemini. This tells the processor which tokenization logic to apply.

Example:

- key: model_type
  value: "ChatGPT"

model

Type: string
Required: False
Defines the specific model name or a wildcard pattern, letting you distinguish among model versions (e.g., gpt-4, gpt-4o-*).

Example:

- key: model
  value: "gpt-4o-*"
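The wildcard form can be read as shell-style glob matching. The sketch below illustrates that reading with Python's `fnmatch`; note this is an assumption for illustration, as the processor's exact matching rules are not specified here.

```python
# Illustrates how a wildcard model pattern such as "gpt-4o-*" can match
# concrete model names, assuming shell-style glob semantics (an assumption;
# the processor's exact matching rules are not specified here).
from fnmatch import fnmatch

pattern = "gpt-4o-*"
for name in ["gpt-4o-mini", "gpt-4o-2024-08-06", "gpt-4-turbo"]:
    print(name, fnmatch(name, pattern))
```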

encoding

Type: string
Required: False
Defines the type of encoding used for tokenization. Useful when the tokenizer requires an explicit encoding type.

Example:

- key: encoding
  value: "cl100k_base"

Best Practices

The encoding parameter can significantly impact token-count accuracy. Use the tokenizer that matches your model (e.g., cl100k_base for GPT-4 and GPT-3.5, o200k_base for GPT-4o). When in doubt, refer to the model provider’s tokenizer documentation.
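As a rough guide, common OpenAI encodings can be derived from the model name. The lookup below is a sketch covering only well-known models; anything outside it should be checked against the provider's tokenizer documentation.

```python
# Minimal model-name → encoding lookup (a sketch; covers only common OpenAI
# models — consult your provider's tokenizer docs for anything else).
ENCODINGS = {
    "gpt-3.5-turbo": "cl100k_base",
    "gpt-4": "cl100k_base",
    "gpt-4-turbo": "cl100k_base",
    "gpt-4o": "o200k_base",
    "gpt-4o-mini": "o200k_base",
}

def encoding_for(model: str, default: str = "cl100k_base") -> str:
    # Longest-prefix match so "gpt-4o-mini" resolves before "gpt-4".
    for name in sorted(ENCODINGS, key=len, reverse=True):
        if model.startswith(name):
            return ENCODINGS[name]
    return default

print(encoding_for("gpt-4o-2024-08-06"))
```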


CountLLMTokens Template

CountTokens:
  processor: CountLLMTokens
  parameters:
    - key: store_count_header
      value: "x-lunar-estimated-tokens"
    - key: model_type
      value: "ChatGPT"
    - key: model
      value: "gpt-4o"
    - key: encoding
      value: "cl100k_base"

Use Case

You can use the CountLLMTokens processor to:

  • Track token usage across API requests to AI models
  • Store token counts in logs for observability or cost reporting

This processor is used in flows that integrate with LLMs and need visibility into token-level usage for performance, pricing, or governance.
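For example, a log pipeline downstream of the processor could aggregate per-model token usage from the header. The sketch below is hypothetical: the header name follows the template above, but the log-record shape is an assumption for illustration.

```python
# Hypothetical downstream consumer: aggregate per-model token usage from the
# header the processor sets. The header name matches the template above; the
# log-record shape is an assumption for illustration.
from collections import defaultdict

records = [
    {"model": "gpt-4o", "headers": {"x-lunar-estimated-tokens": "120"}},
    {"model": "gpt-4o", "headers": {"x-lunar-estimated-tokens": "80"}},
    {"model": "claude-3", "headers": {"x-lunar-estimated-tokens": "40"}},
]

usage = defaultdict(int)
for rec in records:
    usage[rec["model"]] += int(rec["headers"].get("x-lunar-estimated-tokens", 0))

print(dict(usage))  # {'gpt-4o': 200, 'claude-3': 40}
```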


Flows Used In

LLM Routing Flow