site/docs/red-team/strategies/meta.md
The Meta-Agent Jailbreaks strategy (`jailbreak:meta`) uses strategic decision-making to test your system's resilience against adaptive attacks.
Unlike standard iterative approaches that refine a single prompt, the meta-agent builds a custom taxonomy of approaches and adapts its strategy based on your target's responses.
Add it to your `promptfooconfig.yaml`:
strategies:
  # Basic usage
  - jailbreak:meta
  # With configuration
  - id: jailbreak:meta
    config:
      # Optional: Number of iterations to attempt (default: 10)
      numIterations: 50
You can also override the number of iterations via an environment variable:
PROMPTFOO_NUM_JAILBREAK_ITERATIONS=5
:::info Cloud Required
This strategy requires Promptfoo Cloud to maintain persistent memory and strategic reasoning across iterations. Set `PROMPTFOO_REMOTE_GENERATION_URL` or log in to Promptfoo Cloud.
:::
The meta-agent maintains memory across iterations to systematically explore different attack approaches. When one type of approach fails, it pivots to fundamentally different techniques rather than continuing to refine the same pattern.
Pivoting in this way provides broader coverage of potential vulnerabilities at the cost of more API calls. The standard jailbreak strategy refines a single approach repeatedly, while the meta-agent explores multiple distinct approaches to find weaknesses.
| Aspect | Meta-Agent | Standard Iterative |
|---|---|---|
| Approach | Explores multiple distinct attack types | Refines variations of a single approach |
| Coverage | Broad - tests different attack categories | Deep - exhausts one approach |
| Cost | Higher (more diverse attempts) | Lower (focused refinement) |
| Best For | Finding any vulnerability in robust systems | Testing specific attack patterns |
The meta-agent stops when it finds a vulnerability, determines that the target is secure, or reaches the maximum number of iterations.
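The behavior described above (memory across iterations, pivoting after repeated failures, and three stop conditions) can be illustrated with a small Python toy. This is a sketch under stated assumptions, not Promptfoo's implementation: the class `MetaAgentSketch`, the category names, the `failures_before_pivot` threshold, and the boolean `target` callable are all hypothetical and exist only to make the control flow concrete. In particular, "determines the target is secure" is modeled here simply as running out of untried categories, which may differ from how the real agent decides.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Attempt:
    """One remembered attack attempt (hypothetical record shape)."""
    category: str
    prompt: str
    succeeded: bool


@dataclass
class MetaAgentSketch:
    """Toy model of a meta-agent loop; every name here is illustrative."""
    categories: List[str]
    max_iterations: int = 10
    failures_before_pivot: int = 2  # assumed threshold, not a real option
    history: List[Attempt] = field(default_factory=list)

    def _failures(self, category: str) -> int:
        # Memory lookup: how often has this category already failed?
        return sum(
            1 for a in self.history
            if a.category == category and not a.succeeded
        )

    def _next_category(self) -> Optional[str]:
        # Pivot: skip any category that has failed too many times.
        for category in self.categories:
            if self._failures(category) < self.failures_before_pivot:
                return category
        return None  # every category is exhausted

    def run(self, target: Callable[[str], bool]) -> str:
        # `target` returns True when the attack prompt succeeds.
        for _ in range(self.max_iterations):
            category = self._next_category()
            if category is None:
                return "target_secure"
            prompt = f"[{category}] attempt {self._failures(category) + 1}"
            succeeded = target(prompt)
            self.history.append(Attempt(category, prompt, succeeded))
            if succeeded:
                return "vulnerability_found"
        return "max_iterations"


def toy_target(prompt: str) -> bool:
    # Stand-in target that is only vulnerable to "roleplay" prompts.
    return prompt.startswith("[roleplay]")


agent = MetaAgentSketch(categories=["encoding", "roleplay", "hypothetical"])
print(agent.run(toy_target))  # prints "vulnerability_found"
```

In this toy run the agent fails twice with the first category, pivots to the second, and stops as soon as that attempt succeeds. A standard iterative strategy would correspond to a single category with a large failure budget.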
Use jailbreak:meta when:
Use standard jailbreak when:
For a comprehensive overview of LLM vulnerabilities and red teaming strategies, visit our Types of LLM Vulnerabilities page.