
CountLLMTokens Processor

Overview

The CountLLMTokens processor estimates the number of AI model tokens in an incoming request. This is particularly useful for tracking usage and cost of AI interactions in flows leveraging large language models (LLMs) such as OpenAI's GPT, Anthropic's Claude, or Google's Gemini.

This processor analyzes the request body, counts the tokens based on the configured model type and encoding, and stores the result in a custom request header for downstream processors or logging.
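Conceptually, the processor's behavior can be sketched as follows. This is an illustrative sketch only, not the actual Lunar implementation: the function names are hypothetical, and a rough ~4-characters-per-token heuristic stands in for a real model-specific tokenizer.

```python
# Illustrative sketch of the processor's behavior (hypothetical names, not the
# actual Lunar implementation). A real deployment tokenizes with the model's
# own encoding; here a rough ~4-characters-per-token heuristic stands in.

def estimate_tokens(body: str) -> int:
    """Crude estimate: roughly 4 characters per English token."""
    return max(1, len(body) // 4)

def count_llm_tokens(request: dict,
                     store_count_header: str = "x-lunar-estimated-tokens") -> dict:
    """Enrich the request with an estimated token count, then forward it."""
    estimate = estimate_tokens(request.get("body", ""))
    request.setdefault("headers", {})[store_count_header] = str(estimate)
    return request

req = count_llm_tokens({"body": "Summarize the article in three bullet points."})
print(req["headers"]["x-lunar-estimated-tokens"])
```

Downstream processors then read the header like any other request header, so no special wiring is needed beyond configuring `store_count_header`.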



Input and Output

  • Input Stream: Request – Accepts the incoming request.
  • Output Stream: Request – Forwards the request downstream, optionally enriched with the token estimate.

Parameters

store_count_header

Type: string
Required: False
Default: x-lunar-estimated-tokens
Sets the custom header in which to store the calculated token count.

Example:

- key: store_count_header
  value: "x-custom-token-count"

model_type

Type: string
Required: False
Specifies the type of AI model. Supported values include ChatGPT, Claude, and Gemini. This tells the processor which tokenization logic to apply.

Example:

- key: model_type
  value: "ChatGPT"

model

Type: string
Required: False
Defines the specific model name or a wildcard pattern, letting you distinguish among model versions (e.g., gpt-4, gpt-4o-*).

Example:

- key: model
  value: "gpt-4o-*"
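The wildcard form can be read as shell-style glob matching. The sketch below illustrates that reading with Python's `fnmatch`; note this is an assumption for illustration, as the processor's exact matching rules are not specified here.

```python
# Illustrates how a wildcard model pattern such as "gpt-4o-*" can match
# concrete model names, assuming shell-style glob semantics (an assumption;
# the processor's exact matching rules are not specified here).
from fnmatch import fnmatch

pattern = "gpt-4o-*"
for name in ["gpt-4o-mini", "gpt-4o-2024-08-06", "gpt-4-turbo"]:
    print(name, fnmatch(name, pattern))
```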

encoding

Type: string
Required: False
Defines the type of encoding used for tokenization. Useful when the tokenizer requires an explicit encoding type.

Example:

- key: encoding
  value: "cl100k_base"

Best Practices

The encoding parameter can significantly impact token-count accuracy. Use the tokenizer that matches your model (e.g., cl100k_base for GPT-4 and GPT-3.5, o200k_base for GPT-4o). When in doubt, refer to the model provider’s tokenizer documentation.
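As a rough guide, common OpenAI encodings can be derived from the model name. The lookup below is a sketch covering only well-known models; anything outside it should be checked against the provider's tokenizer documentation.

```python
# Minimal model-name → encoding lookup (a sketch; covers only common OpenAI
# models — consult your provider's tokenizer docs for anything else).
ENCODINGS = {
    "gpt-3.5-turbo": "cl100k_base",
    "gpt-4": "cl100k_base",
    "gpt-4-turbo": "cl100k_base",
    "gpt-4o": "o200k_base",
    "gpt-4o-mini": "o200k_base",
}

def encoding_for(model: str, default: str = "cl100k_base") -> str:
    # Longest-prefix match so "gpt-4o-mini" resolves before "gpt-4".
    for name in sorted(ENCODINGS, key=len, reverse=True):
        if model.startswith(name):
            return ENCODINGS[name]
    return default

print(encoding_for("gpt-4o-2024-08-06"))
```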


CountLLMTokens Template

CountTokens:
  processor: CountLLMTokens
  parameters:
    - key: store_count_header
      value: "x-lunar-estimated-tokens"
    - key: model_type
      value: "ChatGPT"
    - key: model
      value: "gpt-4o"
    - key: encoding
      value: "cl100k_base"

Use Case

You can use the CountLLMTokens processor to:

  • Track token usage across API requests to AI models
  • Store token counts in logs for observability or cost reporting

This processor is used in flows that integrate with LLMs and need visibility into token-level usage for performance, pricing, or governance.
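For example, a log pipeline downstream of the processor could aggregate per-model token usage from the header. The sketch below is hypothetical: the header name follows the template above, but the log-record shape is an assumption for illustration.

```python
# Hypothetical downstream consumer: aggregate per-model token usage from the
# header the processor sets. The header name matches the template above; the
# log-record shape is an assumption for illustration.
from collections import defaultdict

records = [
    {"model": "gpt-4o", "headers": {"x-lunar-estimated-tokens": "120"}},
    {"model": "gpt-4o", "headers": {"x-lunar-estimated-tokens": "80"}},
    {"model": "claude-3", "headers": {"x-lunar-estimated-tokens": "40"}},
]

usage = defaultdict(int)
for rec in records:
    usage[rec["model"]] += int(rec["headers"].get("x-lunar-estimated-tokens", 0))

print(dict(usage))  # {'gpt-4o': 200, 'claude-3': 40}
```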


Flows Used In

LLM Routing Flow