# Detect and Redact PII

Personal data like names, emails, phone numbers, and credit cards is caught and redacted before it ever reaches the AI model.

Personal data flowing through AI systems is a real risk. PII detection scans every message for names, email addresses, phone numbers, physical addresses, credit card numbers, social security numbers, and other sensitive identifiers. Detection happens before the message reaches the AI model, so personal data never enters the processing pipeline unprotected.

When PII is detected, the system responds based on your configured policy. In "redact" mode, it strips the PII and replaces it with a placeholder. In "warn" mode, it flags the content and asks you to confirm before proceeding. In "block" mode, it stops the message entirely. You choose the strictness level that matches your compliance requirements.
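The three modes can be pictured as a small dispatch step. The sketch below is illustrative only and is not the product's implementation: `apply_policy`, `PolicyResult`, and the single email pattern are hypothetical names and simplifications chosen for this example.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical single-entity detector, for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


@dataclass
class PolicyResult:
    text: Optional[str]       # text to forward, or None when nothing is forwarded
    needs_confirmation: bool  # True when the sender must confirm before proceeding
    blocked: bool             # True when the message is stopped entirely


def apply_policy(message: str, mode: str) -> PolicyResult:
    """Apply a redact / warn / block policy to one message (assumed semantics)."""
    if mode not in ("redact", "warn", "block"):
        raise ValueError(f"unknown mode: {mode!r}")
    if not EMAIL_RE.search(message):
        # No PII found: the message passes through unchanged in every mode.
        return PolicyResult(message, False, False)
    if mode == "redact":
        return PolicyResult(EMAIL_RE.sub("[EMAIL]", message), False, False)
    if mode == "warn":
        return PolicyResult(message, True, False)
    return PolicyResult(None, False, True)
```

In this sketch, stricter modes forward less: "redact" forwards a cleaned message, "warn" holds the original until it is confirmed, and "block" forwards nothing.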

PII detection works on both input (messages you send to AI employees) and output (content AI employees generate). If an AI employee accidentally includes a customer's phone number in a marketing report, the output filter catches it. This bidirectional scanning means sensitive data does not slip through in either direction.
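As a rough picture of bidirectional scanning, the same filter can wrap both sides of a call. This is a minimal sketch under stated assumptions: `scrub`, `guarded_call`, and the simplified phone pattern are hypothetical, and `worker` stands in for whatever produces the output.

```python
import re
from typing import Callable

# Simplified, hypothetical phone pattern; real detectors are more robust.
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def scrub(text: str) -> str:
    """Replace anything that looks like a phone number with a placeholder."""
    return PHONE_RE.sub("[PHONE]", text)


def guarded_call(message: str, worker: Callable[[str], str]) -> str:
    """Filter the input, run the worker, then filter the output as well."""
    safe_input = scrub(message)  # input filter
    draft = worker(safe_input)   # the worker only ever sees filtered text
    return scrub(draft)          # output filter
```

The output pass matters because generated content can contain personal data that was never in the original message, for example data pulled in from another source.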

## Stop Personal Data Before It Reaches the AI Model

When an AI agent processes customer messages, documents, or database records, it will encounter personal information: names, email addresses, phone numbers, national IDs, credit card numbers, and more. PII Detection scans all incoming content before it reaches the language model, giving you control over what the AI actually sees.

You choose the response: redact the data automatically, warn the agent but allow it to proceed, or block the input entirely. This protects your customers' data and keeps your AI workforce compliant with privacy regulations like GDPR and CCPA without requiring manual review of every message.

## Configurable Detection Rules for Your Industry

Different industries carry different PII risk profiles. A healthcare agent working with patient records faces different sensitivity requirements than an e-commerce agent handling shipping addresses. PII Detection lets you configure which entity types are flagged, at what confidence threshold, and what action follows.
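One way to picture such a configuration is a list of rules, each pairing an entity type with a confidence threshold and an action. The sketch below is an assumption about the shape of such rules, not the product's configuration format; `Rule`, `Detection`, `action_for`, and the example thresholds are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Rule:
    entity_type: str       # e.g. "ADDRESS"
    min_confidence: float  # detections below this score are ignored
    action: str            # "redact", "warn", or "block"


@dataclass(frozen=True)
class Detection:
    entity_type: str
    confidence: float


def action_for(detection: Detection, rules: Sequence[Rule]) -> Optional[str]:
    """Return the configured action for a detection, or None if no rule applies."""
    for rule in rules:
        if (rule.entity_type == detection.entity_type
                and detection.confidence >= rule.min_confidence):
            return rule.action
    return None


# Hypothetical profiles: the same detection is handled differently per industry.
healthcare_rules = [Rule("PATIENT_ID", 0.5, "block"), Rule("ADDRESS", 0.8, "redact")]
ecommerce_rules = [Rule("ADDRESS", 0.9, "warn")]
```

With these example profiles, an address detected at 0.85 confidence would be redacted for the healthcare agent but ignored for the e-commerce agent, whose threshold is higher.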

Built-in detectors cover common PII categories: contact information, financial identifiers, government IDs, health data, and location data. You can extend these with custom patterns using regular expressions or keyword lists, so industry-specific identifiers like policy numbers or patient codes are caught as well.

Detection runs server-side before the prompt is constructed, so PII is never inadvertently logged, cached, or sent to a third-party model. Redacted content is replaced with labeled tokens (like [EMAIL] or [PHONE]) that the agent can reference contextually without accessing the actual value.
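To make labeled tokens and custom patterns concrete, here is a minimal regex-only sketch. The patterns are deliberately simplified, the `POL-` policy number format is invented for illustration, and `redact` is a hypothetical helper rather than the product's API. Keyword lists are not covered here.

```python
import re

# Simplified stand-ins for built-in detectors (illustrative only).
BUILT_IN = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

# Hypothetical custom pattern for an industry-specific identifier.
CUSTOM = {
    "POLICY_NUMBER": re.compile(r"\bPOL-\d{6}\b"),
}


def redact(text: str, patterns: dict) -> str:
    """Replace every match with a labeled token such as [EMAIL]."""
    for label, pattern in patterns.items():
        text = pattern.sub(f"[{label}]", text)
    return text


message = "Email a.b@example.com or call +1 415 555 0100 about POL-123456"
print(redact(message, {**BUILT_IN, **CUSTOM}))
# prints: Email [EMAIL] or call [PHONE] about [POLICY_NUMBER]
```

The labels keep the sentence readable: the agent can still tell that an email address, a phone number, and a policy number were present, without seeing any of the values.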

## Audit Trails for Privacy Compliance

Every PII detection event is logged with the entity type detected, the action taken, and the timestamp. These logs are available in the monitoring dashboard and can be exported for compliance audits. You always have a record of what was detected and how it was handled.
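A detection event with these three fields could be modeled as follows. This is a sketch under assumptions: the `DetectionEvent` record, the field names, and the JSON Lines export are hypothetical choices for illustration, not the product's actual log schema. The record stores the entity type only, never the detected value.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional


@dataclass(frozen=True)
class DetectionEvent:
    entity_type: str  # what kind of PII was found, e.g. "EMAIL"
    action: str       # what was done about it, e.g. "redact"
    timestamp: str    # ISO 8601 timestamp in UTC


def record_event(entity_type: str, action: str,
                 now: Optional[datetime] = None) -> DetectionEvent:
    """Build an event; `now` can be injected to make tests deterministic."""
    moment = now if now is not None else datetime.now(timezone.utc)
    return DetectionEvent(entity_type, action, moment.isoformat())


def export_jsonl(events: Iterable[DetectionEvent]) -> str:
    """Serialize events as JSON Lines, one event per line, for an audit export."""
    return "\n".join(json.dumps(asdict(e), sort_keys=True) for e in events)
```

Keeping raw values out of the event record means the audit trail itself does not become a second store of personal data.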

For teams operating under strict data governance, this creates a verifiable boundary: the AI agent operated on anonymized data, and the system has proof. This is especially valuable when demonstrating compliance to customers, auditors, or regulators who need assurance that AI systems do not mishandle sensitive information.

## Use Cases

### Support team prevents AI from leaking customer data

When a customer sends a message containing a credit card number or SSN, PII detection redacts it before the AI agent processes or logs the content.

### HR team protects employee data in AI workflows

Any AI agent handling HR documents automatically detects names, addresses, and ID numbers, redacting them before passing data to external tools.

### Legal team ensures AI outputs are privacy-safe

Before the AI employee sends any document externally, PII detection scans for personal data and flags or redacts it based on policy.

### Healthcare team keeps patient data out of AI context

Patient identifiers are detected and stripped from AI inputs automatically, keeping protected health information (PHI) out of model context and audit logs.

## Comparison

| Before | After |
|---|---|
| AI agents process and log personal data without any filter. | PII is detected and redacted automatically before the agent acts on it. |
| Compliance requires manual data review on every AI output. | Automated PII detection handles the review layer, with no manual step needed. |
| A data breach from AI logging is discovered after the damage is done. | PII never enters logs or external calls, so the risk is eliminated at the source. |
| Teams build custom regex filters to catch personal data. | Built-in PII detection covers names, emails, IDs, cards, and more. |

## FAQ

### What types of PII are detected out of the box?

Built-in detectors cover names, email addresses, phone numbers, physical addresses, credit card numbers, social security numbers, passport numbers, IP addresses, and dates of birth. Custom patterns can be added for industry-specific identifiers.

### Does PII detection slow down the agent?

Detection runs in milliseconds server-side before prompt construction. For typical messages and documents, latency is negligible. For very large batch document processing, detection adds a small preprocessing step that is dwarfed by the model inference time.

### Can the agent still do its job after PII is redacted?

Yes. Redacted tokens like [EMAIL] or [CREDIT_CARD] preserve the structure of the message so the agent understands context without accessing the raw value. For most tasks, this is sufficient. For tasks that genuinely require the PII, you can configure a warning-only mode instead of hard redaction.

### Is this compliant with GDPR and CCPA?

PII Detection is a technical control that supports compliance, but compliance depends on your full data handling stack. The feature helps you demonstrate that AI processing is PII-aware and logged, which is a meaningful control for most regulatory frameworks.

### Does Sistava automatically detect and redact sensitive personal data?

Yes. The PII detection layer scans inputs and outputs and redacts personal data such as names, emails, and phone numbers before it is stored or sent anywhere. You can configure sensitivity levels to match your compliance requirements.

> We process thousands of support tickets a day. PII redaction runs automatically on every one before the agent sees it. Our DPO stopped worrying about AI compliance the day we turned it on.
> — Lena F., Head of Support · SaaS company