# Built-in Content Safety

Built-in content filters block harmful or off-brand output, and custom rules let you enforce your own standards.

Content safety policies evaluate everything your AI employees say and produce. Built-in filters powered by NVIDIA NeMo detect harmful, inappropriate, or unsafe content and block it before it reaches you or your customers. If an employee generates something that violates safety standards, the policy catches it and prevents delivery.

Custom rules layer on top of the built-in safety. Add your own policies for brand voice, legal compliance, industry-specific requirements, or any other standard your content must meet. A healthcare company might add a policy that flags unverified medical claims. An e-commerce company might add one that prevents pricing below cost. Custom rules are written in plain language and enforced automatically.

Policies evaluate both sides of the conversation: input (what users send to employees) and output (what employees generate). This bidirectional enforcement means your employees neither receive nor produce content that violates your standards. Violations are logged, so you can review patterns and adjust policies over time.

## Built-In Content Safety With Industrial-Grade Filtering

Sistava ships with content safety filtering powered by NVIDIA NeMo Guardrails, one of the most widely used open frameworks for guiding language-model behavior. This handles the baseline: blocking harmful, abusive, or off-topic outputs before they reach users or downstream systems.

Out of the box, filters cover toxicity, hate speech, prompt injection attempts, and off-topic deflection. These run at the model output layer, so even if an external input attempts to manipulate the agent, the response is filtered before it leaves the system. Your AI employees do not produce content you would be embarrassed to put your name on.
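Sistava's policy engine is internal, but as a rough mental model, the rule structure described here (keyword-based rules with block, flag, or log actions, applied to both input and output) could be sketched like this. All names in the snippet are hypothetical illustrations, not Sistava's actual API:

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical sketch of a custom content-policy layer.
# None of these names come from Sistava's real interface.

@dataclass
class Rule:
    name: str
    matches: Callable[[str], bool]   # predicate over a message
    action: str                      # "block", "flag", or "log"
    fallback: Optional[str] = None   # safe replacement when blocked

def keyword_rule(name: str, words: list[str], action: str,
                 fallback: Optional[str] = None) -> Rule:
    """Build a rule that fires when any keyword appears in the text."""
    pattern = re.compile("|".join(re.escape(w) for w in words), re.IGNORECASE)
    return Rule(name, lambda text: bool(pattern.search(text)), action, fallback)

def evaluate(text: str, rules: list[Rule]) -> tuple[str, list[str]]:
    """Apply rules to a message (user input or agent output) and return
    the possibly-replaced text plus the names of violated rules."""
    violations = []
    for rule in rules:
        if rule.matches(text):
            violations.append(rule.name)   # every violation is recorded
            if rule.action == "block":
                return rule.fallback or "[removed by content policy]", violations
    return text, violations

rules = [
    keyword_rule("no-medical-claims", ["cures", "guaranteed treatment"],
                 action="block", fallback="I can't make medical claims."),
    keyword_rule("competitor-mention", ["AcmeCorp"], action="flag"),
]

# Bidirectional enforcement: the same evaluation runs on what users
# send in and on what the agent generates before delivery.
safe_output, hits = evaluate("This supplement cures insomnia.", rules)
print(safe_output)  # -> I can't make medical claims.
print(hits)         # -> ['no-medical-claims']
```

The key design point this sketch illustrates is that "flag" and "log" rules record a violation without altering the text, while "block" rules short-circuit with a safe fallback, which matches the block / flag-for-review / log-silently actions described below.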
## Custom Policies for Your Brand, Legal Team, and Industry

Every organization has content requirements that generic filters do not cover. A financial services firm cannot have agents giving investment advice. A healthcare company needs agents to avoid diagnostic claims. A brand may have a list of competitor names or product claims that must never appear in agent outputs. Custom Content Policies let you define these rules precisely.

Policies are defined as rules: if output contains X, do Y. Y can be block and replace with a safe fallback, flag for review, or log silently. Rules support keyword lists, regex patterns, topic classifiers, and semantic similarity checks for nuanced restrictions that keyword matching alone would miss.

Custom policies stack on top of the built-in NeMo filters, not instead of them. You always get the baseline safety layer plus whatever is specific to your organization. Changes to policies apply to all current and future AI employees in your workspace.

## Content Safety Across the Entire AI Workforce

In a multi-agent environment, content safety cannot be agent-specific. When agents collaborate, delegate tasks, or pass outputs to one another, a gap in one agent's policy is a gap in the whole system. Content Policies apply at the platform level, covering every AI employee regardless of role or configuration.

This is especially important for customer-facing agents, where brand consistency and legal compliance are non-negotiable. A sales agent, a support agent, and an onboarding agent all operate under the same content rules, so your customers get a consistent and safe experience regardless of which agent they interact with.

## Use Cases

### Brand team enforces tone and messaging standards

Content policies block the AI employee from producing outputs that violate brand guidelines, ensuring every piece of content stays on-brand.
### Platform team prevents misuse of AI capabilities

Policies define what topics, actions, and output types are off-limits, so the AI agent cannot be directed to produce harmful or inappropriate content.

### Enterprise team meets regulatory content requirements

Industry-specific content rules are encoded as policies, and the AI employee enforces them automatically on every output it generates.

### Education platform keeps AI outputs age-appropriate

Content safety policies filter AI agent outputs based on audience, ensuring all generated content meets the platform's standards for students.

## Comparison

| Before | After |
|---|---|
| AI agents can generate anything; policy is manual and reactive. | Content policies are enforced automatically on every output. |
| Brand voice violations require human review of every piece. | Policies catch violations before content leaves the agent. |
| Compliance with content regulations is a manual process. | Regulatory content rules are encoded once and applied everywhere. |
| Misuse of AI capabilities is discovered after the fact. | Policies prevent out-of-bounds content at the point of generation. |

## FAQ

### What is NVIDIA NeMo Guardrails and why does it matter?

NeMo Guardrails is an open-source framework developed by NVIDIA for adding programmable safety and topical constraints to LLM applications. It is one of the most widely adopted tools in production AI systems for its reliability and flexibility. We use it as the foundational layer so you benefit from battle-tested safety infrastructure.

### Can I prevent agents from discussing specific topics?

Yes. Topic-based restrictions are a first-class policy type. You can define off-limits topics using keywords, semantic classifiers, or both, and choose whether the agent deflects politely or simply declines to engage. This is useful for legal, competitive, or regulatory reasons.

### Do content policies apply to agent-to-agent communication?

Yes.
Policies apply to all outputs from AI employees, including messages passed between agents in multi-agent workflows. This closes the loophole where internal agent communication could bypass user-facing filters.

### How quickly do policy changes take effect?

Policy changes apply immediately to new conversations. Ongoing conversations pick up the new policy at the next agent response. There is no deployment step or container restart required.

### Can I block my AI agent from producing certain types of content?

Yes. Sistava includes built-in content safety powered by NeMo Guardrails, and you can add custom rules on top to block specific topics, outputs, or behaviors. Policies are enforced at runtime before any response is delivered.

> We have strict rules about what our agents can and cannot say to clients. Custom content policies enforce all of them without us reviewing every response.
> — Mei C., Legal Operations Manager · regulated industry