DataSanitation Processor

Overview

The DataSanitation processor scrubs sensitive information from incoming requests. It can be configured in one of two ways: by listing the entity types to scrub (an include list), or by listing the entity types to leave untouched, in which case everything else is scrubbed (an exclude list). Use it to keep sensitive data such as credit card numbers, phone numbers, and email addresses from being passed downstream.


Input and Output

This processor operates on the request stream:

  • Input Stream: Request – It intercepts and sanitizes the request body before forwarding it to downstream processors.
  • Output Stream: Request – After sanitization, the request continues through the processing pipeline.

This ensures sensitive fields are removed or masked before any further processing or logging occurs.


Parameters

blocklisted_entities

Type: list_of_strings
Required: False
Default: [CreditCard, Email, Phone]
A list of entity types to scrub from the request. When this parameter is set, only the listed entity types are scrubbed; nothing else is touched (include-list behavior).

Example:

- key: blocklisted_entities
  value:
    - CreditCard
    - Email
    - Phone

ignored_entities

Type: list_of_strings
Required: False
A list of entity types to exclude from scrubbing. When this parameter is set, every entity type except those listed here is scrubbed (exclude-list behavior).

Example:

- key: ignored_entities
  value:
    - Phone
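
The difference between the two parameters can be sketched in a few lines of Python. This is only an illustration of the selection logic described above, not the processor's implementation: the function name `entities_to_scrub` and the `SUPPORTED_ENTITIES` set are hypothetical, and because this page does not say what happens when both parameters are set, the sketch simply rejects that case.

```python
from typing import FrozenSet, Iterable, Optional, Set

# Hypothetical set of entity types: only the names that appear in this
# page's examples. The processor's real list may be longer.
SUPPORTED_ENTITIES: FrozenSet[str] = frozenset(
    {"CreditCard", "Email", "Phone", "IPAddress"}
)

# Default taken from the blocklisted_entities parameter description.
DEFAULT_BLOCKLISTED = ("CreditCard", "Email", "Phone")


def entities_to_scrub(
    blocklisted_entities: Optional[Iterable[str]] = None,
    ignored_entities: Optional[Iterable[str]] = None,
    supported: FrozenSet[str] = SUPPORTED_ENTITIES,
) -> Set[str]:
    """Return the entity types that would be scrubbed for a given config."""
    if blocklisted_entities is not None and ignored_entities is not None:
        # Assumption: the page does not document how the two parameters
        # combine, so this sketch refuses to guess.
        raise ValueError("set only one of blocklisted_entities or ignored_entities")
    if ignored_entities is not None:
        # Exclude list: scrub everything supported except the ignored types.
        return set(supported) - set(ignored_entities)
    if blocklisted_entities is None:
        # Neither parameter set: fall back to the documented default.
        return set(DEFAULT_BLOCKLISTED)
    # Include list: scrub only the listed types.
    return set(blocklisted_entities)
```

Under these assumptions, `entities_to_scrub(ignored_entities=["Phone"])` selects CreditCard, Email, and IPAddress, while `entities_to_scrub(blocklisted_entities=["Email"])` selects Email only.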

Best Practices

  • Use blocklisted_entities to control explicitly which entity types get sanitized.
  • Use ignored_entities when you need to scrub everything except specific entity types.
  • Always verify the output to ensure no sensitive data is leaked inadvertently.
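
One way to act on the last point is a small check run against sanitized output in a test or staging environment. The sketch below is an assumption-heavy illustration: `find_leaks` and `LEAK_PATTERNS` are hypothetical names, the regular expressions are deliberately loose and are not the processor's detection rules, and the `<Email>` style placeholder in the usage note is invented for the example.

```python
import re
from typing import Dict, List

# Illustrative patterns only; they are not the processor's detection rules
# and will produce both false positives and false negatives on real data.
LEAK_PATTERNS: Dict[str, "re.Pattern[str]"] = {
    "Email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "CreditCard": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "Phone": re.compile(
        r"(?:\+?\d{1,3}[ -]?)?\(?\d{3}\)?[ -]?\d{3}[ -]?\d{4}\b"
    ),
}


def find_leaks(text: str) -> List[str]:
    """Return the entity types whose pattern still matches the text."""
    return [
        name for name, pattern in LEAK_PATTERNS.items() if pattern.search(text)
    ]
```

A check such as `assert find_leaks(sanitized_body) == []` then fails whenever one of the patterns still matches. For instance, `find_leaks("Contact <Email> for details")` finds nothing, whereas `find_leaks("Reach me at jane@example.com")` reports Email.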

DataSanitation Processor Template

DataSanitation Processor
SanitizeRequest:
  processor: DataSanitation
  parameters:
    - key: blocklisted_entities
      value:
        - CreditCard
        - Email
        - Phone

Use Case

You can use the DataSanitation processor to sanitize OpenAI requests. In this configuration, all entities except IPAddress will be sanitized before the request reaches the OpenAI provider.

OpenAISanitization:
  processor: DataSanitation
  parameters:
    - key: ignored_entities
      value:
        - IPAddress