plugins/ruflo-aidefence/skills/safety-scan/SKILL.md
Scan content for prompt injection, jailbreak attempts, and unsafe patterns.
Before processing untrusted input (user submissions, API payloads, webhook data), scan it to detect prompt injection, adversarial content, or policy violations.
mcp__claude-flow__aidefence_is_safe with the input text for a boolean safe/unsafe resultmcp__claude-flow__aidefence_analyze for detailed threat classification and confidence scoresmcp__claude-flow__aidefence_scan for comprehensive multi-layer scanningmcp__claude-flow__aidefence_learn with confirmed threats to improve detectionmcp__claude-flow__aidefence_stats for detection rates and false positive metrics