PII Redaction for Traces

Overview

PII Redaction is a CrewAI AMP feature that automatically detects and masks Personally Identifiable Information (PII) in your crew and flow execution traces. This ensures sensitive data like credit card numbers, social security numbers, email addresses, and names are not exposed in your CrewAI AMP traces. You can also create custom recognizers to protect organization-specific data.

<Info>
  PII Redaction is available on the Enterprise plan. Deployment must be version 1.8.0 or higher.
</Info>

<Frame>
  ![PII Redaction Overview](/images/enterprise/pii_mask_recognizer_trace_example.png)
</Frame>

Why PII Redaction Matters

When running AI agents in production, sensitive information often flows through your crews:

  • Customer data from CRM integrations
  • Financial information from payment processors
  • Personal details from form submissions
  • Internal employee data

Without proper redaction, this data appears in traces, making compliance with regulations like GDPR, HIPAA, and PCI-DSS challenging. PII Redaction solves this by automatically masking sensitive data before it's stored in traces.

How It Works

  1. Detect - Scan trace event data for known PII patterns
  2. Classify - Identify the type of sensitive data (credit card, SSN, email, etc.)
  3. Mask/Redact - Replace the sensitive data with masked values based on your configuration

Original: "Contact john.doe@example.com or call 555-123-4567"
Redacted: "Contact <EMAIL_ADDRESS> or call <PHONE_NUMBER>"
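
The three steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not CrewAI AMP's implementation: the two regex patterns, the `detect` and `mask` helpers, and the sample address are assumptions made up for this example.

```python
import re

# Hypothetical patterns for illustration only; the product's real
# detectors are not described in this document.
PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE_NUMBER": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}

def detect(text):
    """Steps 1 and 2: find matching spans and classify each by entity type."""
    findings = []
    for entity_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((match.start(), match.end(), entity_type))
    return sorted(findings)

def mask(text, findings):
    """Step 3: replace each span with its label, right to left so offsets stay valid."""
    for start, end, entity_type in sorted(findings, reverse=True):
        text = text[:start] + f"<{entity_type}>" + text[end:]
    return text

sample = "Contact john.doe@example.com or call 555-123-4567"
redacted = mask(sample, detect(sample))
print(redacted)
```

Replacing spans from right to left keeps the earlier offsets valid even though each replacement changes the length of the string.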

Enabling PII Redaction

<Info>
  You must be on the Enterprise plan and your deployment must be version 1.8.0 or higher to use this feature.
</Info>

<Steps>
<Step title="Navigate to Crew Settings">
In the CrewAI AMP dashboard, select your deployed crew, open one of its deployments (automations), then navigate to **Settings** → **PII Protection**.
</Step>
<Step title="Enable PII Protection">
Toggle on **PII Redaction for Traces**. This enables automatic scanning and redaction of trace data.
<Info>
  You need to manually enable PII Redaction for each deployment.
</Info>

<Frame>
  ![Enable PII Redaction](/images/enterprise/pii_mask_recognizer_enable.png)
</Frame>
</Step>
<Step title="Configure Entity Types">
Select which types of PII to detect and redact. Each entity can be individually enabled or disabled.
<Frame>
  ![Configure Entities](/images/enterprise/pii_mask_recognizer_supported_entities.png)
</Frame>
</Step>
<Step title="Save">
Save your configuration. PII redaction will be active on all subsequent crew executions; no redeployment is needed.
</Step>
</Steps>

Supported Entity Types

CrewAI supports the following PII entity types, organized by category.

Global Entities

| Entity | Description | Example |
|--------|-------------|---------|
| `CREDIT_CARD` | Credit/debit card numbers | "4111-1111-1111-1111" |
| `CRYPTO` | Cryptocurrency wallet addresses | "bc1qxy2kgd..." |
| `DATE_TIME` | Dates and times | "January 15, 2024" |
| `EMAIL_ADDRESS` | Email addresses | "john.doe@example.com" |
| `IBAN_CODE` | International bank account numbers | "DE89 3704 0044 0532 0130 00" |
| `IP_ADDRESS` | IPv4 and IPv6 addresses | "192.168.1.1" |
| `LOCATION` | Geographic locations | "New York City" |
| `MEDICAL_LICENSE` | Medical license numbers | "MD12345" |
| `NRP` | Nationalities, religious, or political groups | - |
| `PERSON` | Personal names | "John Doe" |
| `PHONE_NUMBER` | Phone numbers in various formats | "+1 (555) 123-4567" |
| `URL` | Web URLs | "https://example.com" |

US-Specific Entities

| Entity | Description | Example |
|--------|-------------|---------|
| `US_BANK_NUMBER` | US bank account numbers | "1234567890" |
| `US_DRIVER_LICENSE` | US driver's license numbers | "D1234567" |
| `US_ITIN` | Individual Taxpayer ID | "900-70-0000" |
| `US_PASSPORT` | US passport numbers | "123456789" |
| `US_SSN` | Social Security Numbers | "123-45-6789" |

Redaction Actions

For each enabled entity, you can configure how the data is redacted:

| Action | Description | Example Output |
|--------|-------------|----------------|
| `mask` | Replace with the entity type label | `<CREDIT_CARD>` |
| `redact` | Completely remove the text | (empty) |
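
The difference between the two actions can be sketched as follows. The `apply_action` helper is hypothetical and only mirrors the behavior described in the table; in particular, how the product treats the whitespace around removed text is not stated in this document.

```python
def apply_action(text, start, end, entity_type, action):
    """Apply one redaction action to the span text[start:end]."""
    if action == "mask":
        replacement = f"<{entity_type}>"  # keep a label in place of the value
    elif action == "redact":
        replacement = ""                  # remove the value entirely
    else:
        raise ValueError(f"unknown action: {action!r}")
    return text[:start] + replacement + text[end:]

text = "Card 4111-1111-1111-1111 on file"
card = "4111-1111-1111-1111"
start = text.index(card)
end = start + len(card)

masked = apply_action(text, start, end, "CREDIT_CARD", "mask")
removed = apply_action(text, start, end, "CREDIT_CARD", "redact")
print(masked)
print(removed)
```

In this sketch, `mask` keeps a hint about what was removed, which helps when debugging traces, while `redact` leaves no indication that a value was ever present.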

Custom Recognizers

In addition to built-in entities, you can create custom recognizers to detect organization-specific PII patterns.

<Frame> ![Custom Recognizers](/images/enterprise/pii_mask_recognizer.png) </Frame>

Recognizer Types

You have two options for custom recognizers:

| Type | Best For | Example Use Case |
|------|----------|------------------|
| Pattern-based (Regex) | Structured data with predictable formats | Salary amounts, employee IDs, project codes |
| Deny-list | Exact string matches | Company names, internal codenames, specific terms |
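
To make the distinction concrete, here is a rough Python sketch of both recognizer styles. The helper names, the `EMP-` ID format, and the sample codenames are invented for illustration, and the case-sensitive matching shown here is an assumption rather than documented product behavior.

```python
import re

def regex_recognizer(pattern):
    """Pattern-based: matches any text that fits a predictable format."""
    compiled = re.compile(pattern)
    def find(text):
        return [m.group() for m in compiled.finditer(text)]
    return find

def deny_list_recognizer(values):
    """Deny-list: matches only the exact strings supplied."""
    # Escape each entry so it is matched literally; try longer entries first
    # so they win over shorter entries that share a prefix.
    ordered = sorted(values, key=len, reverse=True)
    compiled = re.compile("|".join(re.escape(v) for v in ordered))
    def find(text):
        return [m.group() for m in compiled.finditer(text)]
    return find

find_employee_ids = regex_recognizer(r"\bEMP-\d{5}\b")   # hypothetical ID format
find_codenames = deny_list_recognizer(["Project Falcon", "Acme Corp"])

text = "EMP-00421 joined Project Falcon at Acme Corp."
print(find_employee_ids(text))
print(find_codenames(text))
```

The regex recognizer would also catch IDs it has never seen, as long as they follow the format. The deny-list recognizer only catches the strings you list.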

Creating a Custom Recognizer

<Steps>
<Step title="Navigate to Custom Recognizers">
Go to your Organization **Settings** → **Organization** → **Add Recognizer**.
</Step>
<Step title="Configure the Recognizer">
<Frame>
  ![Configure Recognizer](/images/enterprise/pii_mask_recognizer_create.png)
</Frame>
Configure the following fields:
- **Name**: A descriptive name for the recognizer
- **Entity Type**: The entity label that will appear in redacted output (e.g., `EMPLOYEE_ID`, `SALARY`)
- **Type**: Choose between Regex Pattern or Deny List
- **Pattern/Values**: Regex pattern or list of strings to match
- **Confidence Threshold**: Minimum score (0.0-1.0) required for a match to trigger redaction. Higher values (e.g., 0.8) reduce false positives but may miss some matches. Lower values (e.g., 0.5) catch more matches but may over-redact. Default is 0.8.
- **Context Words** (optional): Words that increase detection confidence when found nearby
</Step>
<Step title="Save">
Save the recognizer. It will be available to enable on your deployments.
</Step>
</Steps>

Understanding Entity Types

The Entity Type determines how matched content appears in redacted traces:

Entity Type: SALARY
Pattern: salary:\s*\$\s*\d{1,3}(,\d{3})*
Input: "Employee salary: $50,000"
Output: "Employee <SALARY>"
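
A quick way to see how the entity type label ends up in the output is to apply a pattern with Python's `re.sub`. This is only a sketch: the `mask_with_label` helper is hypothetical, and Python's regex engine is assumed to behave like the one used by the product. The example also shows why the numeric part of a pattern should cover thousands separators: a pattern that stops at the first comma leaves part of the value behind.

```python
import re

def mask_with_label(text, pattern, entity_type):
    """Replace every match of `pattern` with the entity type label."""
    return re.sub(pattern, f"<{entity_type}>", text)

# Covers comma-grouped thousands and optional cents.
full_pattern = r"salary:\s*\$\s*\d{1,3}(,\d{3})*(\.\d{2})?"
# Stops at the first non-digit, so it cannot consume ",000".
digits_only_pattern = r"salary:\s*\$\s*\d+"

text = "Employee salary: $50,000"
print(mask_with_label(text, full_pattern, "SALARY"))
print(mask_with_label(text, digits_only_pattern, "SALARY"))
```

The first call masks the whole value. The second leaves `,000` visible after the label, which is the kind of partial match worth checking for before saving a recognizer.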

Using Context Words

Context words improve accuracy by increasing confidence when specific terms appear near the matched pattern:

Context Words: "project", "code", "internal"
Entity Type: PROJECT_CODE
Pattern: PRJ-\d{4}

When "project" or "code" appears near "PRJ-1234", the recognizer has higher confidence it's a true match, reducing false positives.
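
The interaction between context words and the confidence threshold can be sketched like this. The scoring model is an assumption made for illustration: the base score, the boost, and the character window are hypothetical numbers, and only the 0.8 default threshold comes from this page.

```python
import re

def score_matches(text, pattern, context_words,
                  base_score=0.5, boost=0.4, window=40):
    """Score each match; add a boost when a context word appears nearby.

    base_score, boost, and window are hypothetical values for illustration.
    """
    results = []
    for match in re.finditer(pattern, text):
        nearby = text[max(0, match.start() - window):match.end() + window].lower()
        has_context = any(
            re.search(rf"\b{re.escape(word.lower())}\b", nearby)
            for word in context_words
        )
        score = base_score + boost if has_context else base_score
        results.append((match.group(), round(min(1.0, score), 2)))
    return results

THRESHOLD = 0.8  # default confidence threshold stated above
context = ["project", "code", "internal"]

with_context = score_matches("Internal project PRJ-1234 kicked off", r"PRJ-\d{4}", context)
without_context = score_matches("Part number PRJ-1234 shipped", r"PRJ-\d{4}", context)

for value, score in with_context + without_context:
    print(value, score, "redact" if score >= THRESHOLD else "keep")
```

Under these assumed numbers, the same string is redacted when it appears next to "project" and left alone when it appears in an unrelated sentence.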

Viewing Redacted Traces

Once PII redaction is enabled, your traces will show redacted values in place of sensitive data:

Task Output: "Customer <PERSON> placed order #12345.
Contact email: <EMAIL_ADDRESS>, phone: <PHONE_NUMBER>.
Payment processed for card ending in <CREDIT_CARD>."

Redacted values are clearly marked with angle brackets and the entity type label (e.g., <EMAIL_ADDRESS>), making it easy to understand what data was protected while still allowing you to debug and monitor crew behavior.

Best Practices

Performance Considerations

<Steps>
<Step title="Enable Only Needed Entities">
Each enabled entity adds processing overhead. Only enable entities relevant to your data.
</Step>
<Step title="Use Specific Patterns">
For custom recognizers, use specific patterns to reduce false positives and improve performance. Regex patterns work best for structured values in traces, such as salaries, employee IDs, and project codes. Deny-list recognizers work best for exact strings, such as company names and internal codenames.
</Step>
<Step title="Leverage Context Words">
Context words improve accuracy by raising detection confidence when related terms appear near a match, which reduces false positives.
</Step>
</Steps>

Troubleshooting

<Accordion title="PII Not Being Redacted">
**Possible Causes:**
- Entity type not enabled in configuration
- Pattern doesn't match the data format
- Custom recognizer has syntax errors

**Solutions:**
- Verify the entity is enabled in **Settings** → **PII Protection**
- Test regex patterns with sample data
- Check logs for configuration errors
</Accordion>
<Accordion title="Too Much Data Being Redacted">
**Possible Causes:**
- Overly broad entity types enabled (e.g., `DATE_TIME` catches dates everywhere)
- Custom recognizer patterns are too general

**Solutions:**
- Disable entities that cause false positives
- Make custom patterns more specific
- Add context words to improve accuracy
</Accordion>
<Accordion title="Performance Issues">
**Possible Causes:**
- Too many entities enabled
- NLP-based entities (`PERSON`, `LOCATION`, `NRP`) are computationally expensive as they use machine learning models

**Solutions:**
- Only enable entities you actually need
- Consider using pattern-based alternatives where possible
- Monitor trace processing times in the dashboard
</Accordion>
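
For the "test regex patterns with sample data" step, a small local check can catch syntax errors, missed matches, and over-broad patterns before you save a recognizer. The `check_pattern` helper below is a hypothetical sketch that uses Python's `re` module; the product's regex engine may differ in edge cases, so treat a passing check as a first filter rather than a guarantee.

```python
import re

def check_pattern(pattern, should_match, should_not_match):
    """Return a list of problems found; an empty list means all samples passed."""
    try:
        compiled = re.compile(pattern)
    except re.error as exc:
        return [f"invalid regex: {exc}"]

    problems = []
    for sample in should_match:
        if not compiled.search(sample):
            problems.append(f"missed: {sample!r}")
    for sample in should_not_match:
        if compiled.search(sample):
            problems.append(f"false positive: {sample!r}")
    return problems

# A specific pattern passes both lists.
print(check_pattern(r"PRJ-\d{4}", ["Ticket PRJ-1234"], ["PRJ-12"]))
# An overly general pattern is flagged as a false positive.
print(check_pattern(r"\d+", ["id 42"], ["order 12345"]))
# A syntax error (unclosed group) is reported instead of raising.
print(check_pattern(r"PRJ-(\d{4}", [], []))
```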

Practical Example: Salary Pattern Matching

This example demonstrates how to create a custom recognizer to detect and mask salary information in your traces.

Use Case

Your crew processes employee or financial data that includes salary information in formats like:

  • salary: $50,000
  • salary: $125,000.00
  • salary:$1,500.50

You want to automatically mask these values to protect sensitive compensation data.

Configuration

<Frame>
  ![Salary Recognizer Configuration](/images/enterprise/pii_mask_custom_recognizer_salary.png)
</Frame>

| Field | Value |
|-------|-------|
| Name | `SALARY` |
| Entity Type | `SALARY` |
| Type | Regex Pattern |
| Regex Pattern | `salary:\s*\$\s*\d{1,3}(,\d{3})*(\.\d{2})?` |
| Action | Mask |
| Confidence Threshold | 0.8 |
| Context Words | salary, compensation, pay, wage, income |

Regex Pattern Breakdown

| Pattern Component | Meaning |
|-------------------|---------|
| `salary:` | Matches the literal text "salary:" |
| `\s*` | Matches zero or more whitespace characters |
| `\$` | Matches the dollar sign (escaped) |
| `\s*` | Matches zero or more whitespace characters after $ |
| `\d{1,3}` | Matches 1-3 digits (e.g., "1", "50", "125") |
| `(,\d{3})*` | Matches comma-separated thousands (e.g., ",000", ",500,000") |
| `(\.\d{2})?` | Optionally matches cents (e.g., ".00", ".50") |

Example Results

Original: "Employee record shows salary: $125,000.00 annually"
Redacted: "Employee record shows <SALARY> annually"

Original: "Base salary:$50,000 with bonus potential"
Redacted: "Base <SALARY> with bonus potential"

<Tip>
  Adding context words like "salary", "compensation", "pay", "wage", and "income" helps increase detection confidence when these terms appear near the matched pattern, reducing false positives.
</Tip>
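
You can reproduce the example results locally before saving the recognizer. The sketch below applies the configured pattern with Python's `re` module, on the assumption that the product's regex engine behaves the same way for this pattern. It also probes two inputs the pattern does not fully cover.

```python
import re

SALARY_PATTERN = re.compile(r"salary:\s*\$\s*\d{1,3}(,\d{3})*(\.\d{2})?")

def mask_salary(text):
    """Replace each salary match with the <SALARY> label."""
    return SALARY_PATTERN.sub("<SALARY>", text)

samples = [
    "Employee record shows salary: $125,000.00 annually",
    "Base salary:$50,000 with bonus potential",
    "salary:$1,500.50",
]
for sample in samples:
    print(mask_salary(sample))

# Inputs outside the expected format:
print(mask_salary("salary: $50000"))   # no thousands separator
print(mask_salary("Salary: $50,000"))  # capitalized label
```

In this sketch the amount without commas is only partially masked, because `\d{1,3}` stops after three digits, and the capitalized `Salary:` is not matched at all, since Python regexes are case-sensitive by default. If your data contains those variants, widen the pattern to cover them.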

Enable the Recognizer for Your Deployments

<Warning> Creating a custom recognizer at the organization level does not automatically enable it for your deployments. You must manually enable each recognizer for every deployment where you want it applied. </Warning>

After creating your custom recognizer, enable it for each deployment:

<Steps>
<Step title="Navigate to Your Deployment">
Go to your deployment/automation and open **Settings** → **PII Protection**.
</Step>
<Step title="Select Custom Recognizers">
Under **Mask Recognizers**, you'll see your organization-defined recognizers. Check the box next to the recognizers you want to enable.
<Frame>
  ![Enable Custom Recognizer](/images/enterprise/pii_mask_recognizers_options.png)
</Frame>
</Step>
<Step title="Save Configuration">
Save your changes. The recognizer will be active on all subsequent executions for this deployment.
</Step>
</Steps>

<Info>
  Repeat this process for each deployment where you need the custom recognizer. This gives you granular control over which recognizers are active in different environments (e.g., development vs. production).
</Info>