Back to Developer Roadmap

Prompt Hacking

src/data/roadmaps/ai-red-teaming/content/[email protected]

4.0759 B
Original Source

Prompt Hacking

Prompt hacking is a core technique for AI Red Teamers targeting LLMs. It involves crafting inputs (prompts) to manipulate the model into bypassing safety controls, revealing hidden information, or performing unintended actions. Red teamers systematically test various prompt hacking methods (like jailbreaking, role-playing, or instruction manipulation) to assess the LLM's resilience against adversarial user input.

Learn more from the following resources: