How-to, by Mahmoud Zalt
Redact PII before any public LLM call by masking names, emails, IDs, and payment data at the boundary, then restoring the values locally after the model responds.
Every prompt your AI Employee sends to a hosted model (OpenAI, Anthropic, Google, OpenRouter) leaves your perimeter and lands in a vendor's logging pipeline. Most providers promise short retention windows and no training on API data, but promises are not the same as architecture. If a customer email, a payment reference, or a private health note ends up in a prompt, you have just exported regulated data to a third party, and your GDPR record-of-processing now needs a new row. Redaction at the boundary fixes that by replacing every detected value with an opaque placeholder token before the request leaves. The AI Employee still reasons about the structure of the conversation (this is a refund request from a customer with two prior tickets), but the public LLM never sees the customer's name, their email, or the order ID. The risk reduction is large and the quality cost is usually small, because most tasks depend on the structure of the request rather than on the literal values.
PII is broader than most teams assume when they first audit an AI Employee. The obvious bucket is direct identifiers: full names, email addresses, phone numbers, government IDs, and home addresses. The second bucket, which trips up most product teams, is indirect identifiers: customer IDs, order numbers, internal user UUIDs, IP addresses, device fingerprints, and any free-text note that could re-identify a person when combined with another record. The third bucket is sensitive PII: payment data, health information, religion, sexual orientation, and political views. Health, religion, sexual orientation, and political opinions are special categories under GDPR Article 9, several of these fields also count as sensitive personal information under the CCPA, and payment card data brings PCI DSS obligations of its own. A good redaction layer recognises all three buckets and treats them with different swap strategies, because a customer ID needs a consistent token across a session while a credit card number should never be stored anywhere downstream.
Full names, emails, phone numbers, postal addresses, and government IDs that point to one person.
Customer IDs, order numbers, UUIDs, IP addresses, and device IDs that re-identify when combined.
Card numbers and CVVs, which fall under PCI DSS scope, plus IBANs and Stripe customer IDs, which link to a person's payment account. None of these should reach a public LLM.
Diagnoses, medication, religion, and other GDPR Article 9 fields that need explicit lawful basis to process.
Customer notes, support transcripts, and meeting summaries where PII hides inside prose, not structured fields.
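One way to express "different swap strategies per bucket" is a small lookup from category to strategy. The sketch below is illustrative only: the category names, the `Strategy` enum, and the `apply_strategy` helper are hypothetical, and it assumes detection has already labelled each value with a category.

```python
from enum import Enum


class Strategy(Enum):
    SESSION_TOKEN = "session_token"  # consistent placeholder, restorable locally
    DROP = "drop"                    # fixed marker, original value is never stored


# Hypothetical category names and assignments; adjust to your own audit.
STRATEGY_BY_CATEGORY = {
    "direct": Strategy.SESSION_TOKEN,
    "indirect": Strategy.SESSION_TOKEN,
    "payment": Strategy.DROP,
    "special_category": Strategy.DROP,
}


def apply_strategy(category: str, value: str, vault: dict) -> str:
    """Return the placeholder for a value; `vault` maps token -> original value."""
    # Unknown categories fall back to the strictest strategy.
    strategy = STRATEGY_BY_CATEGORY.get(category, Strategy.DROP)
    if strategy is Strategy.DROP:
        return f"[{category.upper()}_REMOVED]"
    # Reuse the existing token so repeated mentions stay consistent in a session.
    for token, stored in vault.items():
        if stored == value:
            return token
    token = f"[{category.upper()}_{len(vault) + 1}]"
    vault[token] = value
    return token
```

With this shape, a customer ID keeps the same token for the whole session, while a card number is replaced by a marker and never enters the vault at all.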
The pattern that actually works in production is a five-stage pipeline that sits between your AI Employee and the public LLM. It runs on every outbound call, takes a few milliseconds, and is invisible to the user. The same pipeline runs in reverse on the response, so the AI Employee can show the user a complete answer with real names and IDs without the public LLM ever holding them. The key design choice is determinism within a session: the same email should always map to the same token, so the model can reason about repeated mentions. Across sessions the tokens reset, so a leaked mapping does not unlock historical conversations. If you build this once, every employee, channel, and tool benefits at the same time.
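A minimal sketch of the round trip may help here. The text does not enumerate the five stages, so the split used below (detect, tokenise, call, restore, return) is an assumption, as are the regex detectors, the `ORD-` order format, and every function name. A real deployment would swap the regexes for a proper PII detector.

```python
import re
from typing import Callable

# Illustrative detectors only; the ORD- format is a hypothetical order ID.
DETECTORS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "ORDER": re.compile(r"\bORD-\d{6}\b"),
}
TOKEN = re.compile(r"\[(?:EMAIL|ORDER)_\d+\]")


class SessionVault:
    """In-memory token <-> value mapping that lives for one conversation."""

    def __init__(self) -> None:
        self._by_value: dict = {}
        self._by_token: dict = {}

    def token_for(self, kind: str, value: str) -> str:
        key = (kind, value)
        if key not in self._by_value:
            token = f"[{kind}_{len(self._by_value) + 1}]"
            self._by_value[key] = token
            self._by_token[token] = value
        return self._by_value[key]  # same value, same token, within the session

    def value_for(self, token: str) -> str:
        return self._by_token[token]  # raises KeyError for an unknown token


def redact(text: str, vault: SessionVault) -> str:
    for kind, pattern in DETECTORS.items():
        text = pattern.sub(lambda m, k=kind: vault.token_for(k, m.group()), text)
    return text


def rehydrate(text: str, vault: SessionVault) -> str:
    return TOKEN.sub(lambda m: vault.value_for(m.group()), text)


def guarded_call(prompt: str, llm: Callable[[str], str], vault: SessionVault) -> str:
    safe_prompt = redact(prompt, vault)  # outbound: real values become tokens
    reply = llm(safe_prompt)             # the public LLM only sees tokens
    return rehydrate(reply, vault)       # inbound: tokens become real values
```

Creating a fresh `SessionVault` per conversation gives the reset across sessions described above: the same email maps to the same token inside one conversation, and an old token means nothing to a new vault.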
Two implementation details matter more than people expect. First, the vault must be ephemeral and tenant-scoped. A leaked token is only useful to an attacker who can pair it with the vault that maps it back to a value, so the vault should live in memory for the duration of one conversation, never on disk, and never cross tenant boundaries. Second, you need a strict allow-list of which fields the AI Employee is allowed to read in the first place. Redaction is the last line of defence, not the first. If a sales tool dumps the entire customer record into the prompt, you are redacting a thousand fields when you only needed three.
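Both details can be sketched in a few lines. The tool name, field names, and the `VaultRegistry` class below are hypothetical; the point is only that unknown tools get no fields, and that a vault is keyed by tenant and conversation and held in process memory.

```python
# Hypothetical tool and field names; the allow-list is the first line of defence.
ALLOWED_FIELDS = {
    "refund_lookup": {"order_status", "refund_amount", "ticket_count"},
}


def project_record(tool: str, record: dict) -> dict:
    """Keep only the fields this tool is allowed to put into a prompt."""
    allowed = ALLOWED_FIELDS.get(tool, set())  # unknown tool: nothing passes
    return {key: value for key, value in record.items() if key in allowed}


class VaultRegistry:
    """Holds token vaults in process memory only, keyed by (tenant, conversation)."""

    def __init__(self) -> None:
        self._vaults: dict = {}

    def get(self, tenant_id: str, conversation_id: str) -> dict:
        # A vault is never shared across tenants because the tenant is part of the key.
        return self._vaults.setdefault((tenant_id, conversation_id), {})

    def close(self, tenant_id: str, conversation_id: str) -> None:
        # Drop the mapping when the conversation ends; nothing is written to disk.
        self._vaults.pop((tenant_id, conversation_id), None)
```

Projecting the record before redaction also shrinks the detector's workload: fields that never enter the prompt cannot be missed.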
Once the pipeline is running, the next thing that bites you is policy. Different roles need different redaction strictness. A finance AI Employee handling refunds genuinely needs to see the last four digits of a card to confirm with the customer, while a marketing copywriter has no business seeing any payment data at all. The cleanest way to model this is per-role redaction policies, applied on top of the global pipeline, so each AI Employee gets exactly the visibility its job requires and nothing more.
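A per-role policy can be as small as a frozen dataclass layered over a conservative default. The role names, the `RolePolicy` fields, and the `mask_card` helper are assumptions made for illustration, not a description of any particular product's policy model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RolePolicy:
    card_visible_digits: int = 0  # trailing card digits left visible to the model
    show_names: bool = False      # whether personal names pass through unmasked


GLOBAL_DEFAULT = RolePolicy()  # conservative: mask everything

# Hypothetical roles; each one only loosens what its job requires.
ROLE_POLICIES = {
    "finance": RolePolicy(card_visible_digits=4),
    "support": RolePolicy(show_names=True),
    "marketing": RolePolicy(),
}


def policy_for(role: str) -> RolePolicy:
    # Roles without an explicit policy inherit the global default.
    return ROLE_POLICIES.get(role, GLOBAL_DEFAULT)


def mask_card(number: str, policy: RolePolicy) -> str:
    digits = "".join(ch for ch in number if ch.isdigit())
    keep = policy.card_visible_digits
    if keep <= 0:
        return "[CARD]"
    return f"[CARD ending {digits[-keep:]}]"
```

For example, `mask_card("4111 1111 1111 1234", policy_for("finance"))` keeps the last four digits for the refund confirmation, while the same call with the marketing policy yields only `[CARD]`.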
Four credible options exist for teams who do not want to build the detector from scratch. Microsoft Presidio is the open-source benchmark, free, well-maintained, and battle-tested for the common PII categories with both regex and NER detectors. Google Cloud DLP and AWS Comprehend offer hosted equivalents with strong detection accuracy and per-call billing that adds up fast at high volume. Skyflow and Nightfall sell the redaction-as-a-service shape with vaulting included, which removes the in-memory store concern but adds vendor risk. Inside Sistava we use a tuned Presidio pipeline plus a custom classifier for product-specific fields, because Presidio handles the universal categories well but every business has its own ID format that the open detector misses on day one. The honest take: start with Presidio, measure recall on your real prompts, then decide whether to layer a hosted service or stay self-hosted.
Free, self-hosted, strong on universal categories, customisable with your own recognisers. Best starting point.
Hosted, accurate, pay-per-call. Adds vendor dependency but offloads detector maintenance.
Redaction plus vaulting as a managed service. Removes in-memory vault risk at the cost of a second perimeter.
A small model trained on your product-specific IDs catches what universal detectors miss. Mandatory in practice.
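Measuring recall on real prompts needs little more than a hand-labelled sample and a set comparison. The sketch below assumes spans are represented as `(prompt_id, entity_type, start, end)` tuples and counts only exact matches; both choices are assumptions, and the entity type names are hypothetical.

```python
from collections import defaultdict


def recall_by_type(labelled: set, detected: set) -> dict:
    """Per-entity-type recall: share of hand-labelled spans the detector found.

    Both arguments are sets of (prompt_id, entity_type, start, end) tuples.
    A labelled span counts as found only on an exact match.
    """
    found = defaultdict(int)
    total = defaultdict(int)
    for span in labelled:
        entity_type = span[1]
        total[entity_type] += 1
        if span in detected:
            found[entity_type] += 1
    return {entity_type: found[entity_type] / total[entity_type] for entity_type in total}
```

A low score on one entity type, typically a product-specific ID, is the signal that a custom recogniser or classifier is needed for that field.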
Redaction is not free, and pretending it is leads to bad output. Three failure modes show up in practice. First, over-tokenisation: if you mask too many fields, the model loses context and starts producing generic, unhelpful answers. The fix is to mask only what is actually PII, not every string that looks suspicious. Second, broken cross-references: when [PERSON_1] and [PERSON_2] are mentioned, the model needs to know they are different people, and a sloppy tokeniser that reuses tokens for similar names destroys that signal. The fix is deterministic, content-hashed token assignment per session. Third, formatting drift: if the model writes a polite reply addressed to [PERSON_1] and your rehydration step misses the token, your customer receives an email addressed to a bracketed placeholder, which is worse than no redaction at all. The fix is a strict post-call validator that fails closed if any token survives the response.
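The second and third fixes can be sketched together. The version below derives tokens with an HMAC keyed by a per-session secret, which is one way to get content-hashed, session-scoped tokens, and it raises if any token is left after rehydration. The token format, the 8-character digest prefix, and all names are assumptions; a short prefix keeps prompts readable but makes collisions possible in principle.

```python
import hashlib
import hmac
import re
import secrets

TOKEN_RE = re.compile(r"\[[A-Z]+_[0-9a-f]{8}\]")


class UnresolvedTokenError(Exception):
    """Raised when a placeholder survives rehydration."""


def new_session_key() -> bytes:
    return secrets.token_bytes(32)  # fresh key per session, so tokens reset


def make_token(kind: str, value: str, session_key: bytes) -> str:
    # Same (kind, value, key) always yields the same token; any difference in the
    # value, however small, changes the digest, so similar names stay distinct.
    message = f"{kind}:{value}".encode()
    digest = hmac.new(session_key, message, hashlib.sha256).hexdigest()
    return f"[{kind}_{digest[:8]}]"


def rehydrate_strict(reply: str, vault: dict) -> str:
    """Restore values from `vault` (token -> value) and fail closed on leftovers."""
    restored = TOKEN_RE.sub(lambda m: vault.get(m.group(), m.group()), reply)
    leftover = TOKEN_RE.findall(restored)
    if leftover:
        # Do not send the reply; a bracketed placeholder must never reach a customer.
        raise UnresolvedTokenError(f"unresolved tokens: {leftover}")
    return restored
```

Failing closed means the caller has to decide what happens next, for example retrying the call or routing the reply to a human, instead of letting a half-restored message go out.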
Hosted LLM providers do not redact PII for you. They offer retention and training-opt-out policies, but they do not strip PII from your prompts. Anything you send arrives intact, is typically logged on their side, and may be accessible to support staff or reachable by subpoena. Redaction is your job, not theirs.
Prompt instructions are not a safe substitute for redaction. They reduce the chance of the model echoing PII in the response, but they do nothing about the data sitting in the request payload, in vendor logs, and in any cache. Redaction is a data-handling control, not a behavioural one.
Redaction is a strong technical safeguard, but compliance also requires a lawful basis, a record of processing, a data processing agreement with each LLM vendor, and a clear policy on retention. Redaction makes the picture much better, especially for special category data, but it is one control among several.
Voice and uploaded documents use the same pipeline with a different entry point. Voice transcripts go through redaction before they hit the LLM, and uploaded documents (PDFs, spreadsheets, emails) are parsed, redacted, then summarised. The detector needs OCR and table handling for documents, but the five-stage pattern is identical.
The redaction pipeline runs on every outbound LLM call across every employee, channel, and tool. You can tighten it per role (stricter for marketing, looser for support agents who need names), but the default is conservative: mask first, expose deliberately.
If you are still picking a platform and want to understand the broader safety story (where the data lives, who can access it, how tenant isolation works, what happens if a model misbehaves), the next read covers the full picture beyond just the redaction layer. It is the companion to this how-to and answers the questions security reviewers ask before they sign off on an AI Employee in a real business.
The honest framing for this whole topic: PII redaction is the cheapest large safety win you can ship for an AI Employee. The pipeline is well understood, the open-source detector gets you 80% of the way in an afternoon, and the in-memory vault is a small lift compared to the regulatory exposure of sending raw customer data to a third-party LLM. Build it once at the agent boundary, apply it consistently across every employee, channel, and tool, and measure recall against real prompts so you know where the detector misses. Almost everything else about responsible AI deployment becomes easier once the public LLM stops seeing real names, real emails, and real payment data. Start with Presidio, add a small custom classifier for your product-specific IDs, and you will be ahead of most production AI deployments running today.