backend/docs/prompt_engineering_pentagi.md
A comprehensive framework for designing high-performance prompts within the PentAGI penetration testing system. This guide provides specialized principles for creating prompts that leverage the multi-agent architecture, memory systems, security tools, and specific operational context of PentAGI.
Model Processing Fundamentals
Priming and Contextual Influence
Clear Hierarchical Structure
- Use Markdown headings (#, ##, ###) for clear visual hierarchy and logical grouping of instructions. Ensure a logical flow from high-level role definition to specific protocols and requirements.
- Typical top-level sections:
  - CORE CAPABILITIES / KNOWLEDGE BASE
  - OPERATIONAL ENVIRONMENT (including <container_constraints>)
  - COMMAND & TOOL EXECUTION RULES (including <terminal_protocol>, <tool_usage_rules>)
  - MEMORY SYSTEM INTEGRATION (including <memory_protocol>)
  - TEAM COLLABORATION & DELEGATION (including <team_specialists>, <delegation_rules>)
  - SUMMARIZATION AWARENESS PROTOCOL (including <summarized_content_handling>)
  - EXECUTION CONTEXT (detailing use of {{.ExecutionContext}})
  - COMPLETION REQUIREMENTS
Semantic XML Delimiters
- Use semantic XML tags (e.g., <container_constraints>, <terminal_protocol>, <memory_protocol>, <team_specialists>, <summarized_content_handling>) to logically group related instructions, especially for complex protocols and constraints requiring precise adherence by the LLM.
- Nest tags where the content is hierarchical (e.g., <specialist> tags within <team_specialists>). Refer to existing templates like primary_agent.tmpl for examples.
Context Window Optimization
Example Structure:
# [AGENT SPECIALIST TITLE]
[Role definition, primary objective, and security focus relevant to PentAGI]
## CORE CAPABILITIES / KNOWLEDGE BASE
[Agent-specific skills, knowledge areas relevant to PentAGI tasks]
## OPERATIONAL ENVIRONMENT
<container_constraints>...</container_constraints>
## COMMAND & TOOL EXECUTION RULES
<terminal_protocol>...</terminal_protocol>
<tool_usage_rules>...</tool_usage_rules>
## MEMORY SYSTEM INTEGRATION
<memory_protocol>...</memory_protocol>
## TEAM COLLABORATION & DELEGATION
<team_specialists>...</team_specialists>
<delegation_rules>...</delegation_rules>
## SUMMARIZATION AWARENESS PROTOCOL
<summarized_content_handling>...</summarized_content_handling>
## EXECUTION CONTEXT
[Explain how to use {{.ExecutionContext}} for Flow/Task/SubTask details]
## COMPLETION REQUIREMENTS
[Numbered list: Output format, final tool usage, language, reporting needs]
{{.ToolPlaceholder}}
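
A minimal sketch of how a skeleton like the one above might be filled in, assuming the templates are rendered with Go's standard `text/template` package (the `{{.Variable}}` syntax suggests this, but the backend code is not shown in this guide). The `promptData` struct, the shortened `skeleton`, and the sample values are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// promptData is a hypothetical container for the template variables.
// The field names mirror variables named in this guide.
type promptData struct {
	ExecutionContext string
	Lang             string
	ToolPlaceholder  string
}

// skeleton is a shortened stand-in for a real agent template.
const skeleton = `# EXAMPLE SPECIALIST

## EXECUTION CONTEXT
{{.ExecutionContext}}

## COMPLETION REQUIREMENTS
1. Respond in {{.Lang}}.

{{.ToolPlaceholder}}`

// renderPrompt parses the template text and substitutes the variables.
// Referencing a field that promptData does not have yields an error.
func renderPrompt(tmplText string, data promptData) (string, error) {
	tmpl, err := template.New("prompt").Parse(tmplText)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := renderPrompt(skeleton, promptData{
		ExecutionContext: "SubTask: enumerate open ports on the target",
		Lang:             "English",
		ToolPlaceholder:  "[tool definitions injected by the backend]",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Because a misspelled variable name fails at render time rather than silently producing an empty string, rendering each template once with sample data is a cheap way to catch typos in variable names.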
Role-Based Customization
- Refer to ai-concepts.mdc for role definitions.
- Provide role-specific knowledge (e.g., security-tools.mdc for Pentester; search strategies and tool priorities for Searcher).
Security and Operational Boundaries
- Refer to security-tools.mdc for general tool security context.
- Describe the container environment in <container_constraints>, populated by template variables like {{.DockerImage}}, {{.Cwd}}, {{.ContainerPorts}}. Specify restrictions clearly (e.g., "No direct host access," "No GUI applications," "No UDP scanning").
- Limit each agent to its assigned SubTask. The agent must understand its current objective based on {{.ExecutionContext}} and not attempt actions related to other SubTasks or the overall Flow goal unless explicitly instructed within the current SubTask. Reference data-models.mdc and controller.md for task/subtask relationships.
Ethical Boundaries and Safety
Agent Persistence Protocol
Planning and Reasoning
Chain-of-Thought Engineering
Error Handling and Adaptation
Metacognitive Processes
Memory Operations Protocol (<memory_protocol>)
- Refer to ai-concepts.mdc (Memory section).
- Check memory first (e.g., {{.SearchGuideToolName}}, {{.SearchAnswerToolName}}) before performing external actions like web searches or running discovery tools.
- Store only valuable, reusable results (e.g., {{.StoreGuideToolName}}, {{.StoreAnswerToolName}}). Avoid cluttering memory with trivial or intermediate results.
- Use the exact tool name variables ({{.ToolName}}) for memory interaction.
Vector Database Awareness
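
The retrieve-before-act ordering in the memory protocol above can be pictured with a small Go sketch. This is an illustration of the decision order only, not PentAGI code: `memoryStore` is a hypothetical exact-match map standing in for the vector-backed memory, which would match by semantic similarity rather than by identical keys, and `lookupOrSearch` is an invented helper name.

```go
package main

import "fmt"

// memoryStore is a hypothetical stand-in for long-term memory.
// A real vector store would return the nearest stored entries, not exact matches.
type memoryStore map[string]string

// lookupOrSearch consults memory first and calls the external search only on a miss.
// Non-empty external results are stored so that later questions can reuse them.
func lookupOrSearch(mem memoryStore, question string, external func(string) string) (string, bool) {
	if answer, ok := mem[question]; ok {
		return answer, true // served from memory, no external action needed
	}
	answer := external(question)
	if answer != "" {
		mem[question] = answer // keep only results worth reusing
	}
	return answer, false
}

func main() {
	mem := memoryStore{"default ssh port": "22"}
	search := func(q string) string { return "external result for: " + q }

	answer, fromMemory := lookupOrSearch(mem, "default ssh port", search)
	fmt.Println(answer, fromMemory)

	answer, fromMemory = lookupOrSearch(mem, "default rdp port", search)
	fmt.Println(answer, fromMemory)
}
```

In a prompt, the same ordering is expressed as instructions to the agent: call the retrieval tool first, act externally only when nothing relevant comes back, and store the outcome only if it is reusable.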
Team Specialist Definition (<team_specialists>)
- skills: Core competencies.
- use_cases: Specific situations or types of problems that should be delegated to the specialist.
- tools: General categories of tools they utilize (not the specific invocation tool name).
- tool_name: The exact tool name variable (e.g., {{.SearchToolName}}, {{.PentesterToolName}}) used to invoke/delegate to this specialist.
Delegation Rules (<delegation_rules>)
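
Delegation rules can only refer to specialists that have been defined, so it may help to see the four attributes above rendered as a `<team_specialists>` block. The sketch below assumes Go's `text/template` and an XML layout with one child tag per attribute; the `specialist` struct, the `name` attribute, and the sample values are hypothetical, and primary_agent.tmpl remains the authoritative layout.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// specialist is a hypothetical struct mirroring the four attributes listed above.
type specialist struct {
	Name     string
	Skills   string
	UseCases string
	Tools    string
	ToolName string // in a real prompt this comes from a variable such as {{.SearchToolName}}
}

// teamTmpl emits one <specialist> element per entry; "{{-" trims the preceding newline.
const teamTmpl = `<team_specialists>
{{- range .}}
<specialist name="{{.Name}}">
<skills>{{.Skills}}</skills>
<use_cases>{{.UseCases}}</use_cases>
<tools>{{.Tools}}</tools>
<tool_name>{{.ToolName}}</tool_name>
</specialist>
{{- end}}
</team_specialists>`

var teamTemplate = template.Must(template.New("team").Parse(teamTmpl))

// renderTeam renders the specialist list into the XML block.
func renderTeam(team []specialist) (string, error) {
	var sb strings.Builder
	if err := teamTemplate.Execute(&sb, team); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := renderTeam([]specialist{
		{
			Name:     "searcher",
			Skills:   "Information gathering, technical research",
			UseCases: "Find missing facts before an exploit attempt",
			Tools:    "Search engines, browser",
			ToolName: "search",
		},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Keeping `tools` (general categories) separate from `tool_name` (the exact invocation name) matches the distinction drawn in the attribute list: the first informs the delegation decision, the second is what the agent actually calls.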
Terminal Command Protocol (<terminal_protocol>)
- State that commands run inside the Docker container ({{.DockerImage}}) and that the working directory ({{.Cwd}}) is NOT persistent between tool calls.
- Chain the directory change with the command (cd /path/to/dir && command) within a single tool call if a specific path context is required for the command.
- Redirect output to a file (> file.log 2>&1) for potentially long-running commands.
- Use non-interactive flags (-y, --assume-yes, --non-interactive) where safe and appropriate to avoid hangs.
- Use detach mode if available/applicable for background tasks.
Tool Definition and Invocation Best Practices
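
As one concrete case of tool invocation, the terminal protocol rules above amount to building a single self-contained command string per tool call. The Go sketch below shows that composition; `buildCommand` is a hypothetical helper, not part of PentAGI, and it assumes simple paths without spaces or shell metacharacters (no quoting is applied).

```go
package main

import "fmt"

// buildCommand chains the directory change with the command so both run in one
// tool call, and optionally redirects stdout and stderr to a log file.
// Assumption: dir and logFile contain no characters that need shell quoting.
func buildCommand(dir, command, logFile string) string {
	full := command
	if dir != "" {
		// The working directory does not persist between calls, so set it inline.
		full = "cd " + dir + " && " + full
	}
	if logFile != "" {
		// Capture both output streams for potentially long-running commands.
		full = full + " > " + logFile + " 2>&1"
	}
	return full
}

func main() {
	fmt.Println(buildCommand("/work/scans", "nmap -sV target.com", "nmap.log"))
	// prints: cd /work/scans && nmap -sV target.com > nmap.log 2>&1
}
```

In a prompt, the same pattern is given to the agent as an instruction plus one short example command, so that it never relies on a `cd` issued in an earlier call.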
- Use specific, descriptive tool names (e.g., SearchGuide, not just Search).
Search Tool Prioritization (<search_tools>)
- See searcher.tmpl for a good example matrix structure.
- State when each tool applies (e.g., "Use the browser tool only for accessing specific known URLs, not for general web searching," "Use tavily for in-depth technical research questions").
Mandatory Result Delivery Tools
- Name the specific tool (e.g., {{.HackResultToolName}} for Pentester, {{.SearchResultToolName}} for Searcher, {{.FinalyToolName}} for Orchestrator) that an agent MUST use to deliver its final output, report success/failure, and signify the completion of its current subtask.
- (See controller.md.)
Summarization Awareness Protocol (<summarized_content_handling>)
- The <summarized_content_handling> block, as found in primary_agent.tmpl, pentester.tmpl, etc., MUST be included verbatim in all agent prompts.
- It covers both forms of summarized content (Tool Call Summary via {{.SummarizationToolName}}, Prefixed Summary via {{.SummarizedContentPrefix}}).
- Agents must not imitate these forms themselves, whether by starting messages with {{.SummarizedContentPrefix}}, or calling the {{.SummarizationToolName}} tool.
Execution Context Awareness
- Details of the current Flow, Task, and SubTask are provided through the {{.ExecutionContext}} variable.
- See the controller package (backend/docs/controller.md).
Container Constraints (<container_constraints>)
- Populate the block with {{.DockerImage}} (image name), {{.Cwd}} (working directory), {{.ContainerPorts}} (available ports).
- See security-tools.mdc.
Available Tools (<tools>)
- See pentester.tmpl and cross-check with security-tools.mdc.
Example Selection and Structure
Example Implementation
Ambiguity Resolution Strategies
Conflict Resolution
Structured Tool Invocation is Mandatory
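- Invoke tools through structured calls, using the exact tool name variables ({{.ToolName}}).
- Commands written as plain text (e.g., "nmap -sV target.com") will not be executed by the system.
Completion Requirements Section
- End each prompt with a dedicated section (## COMPLETION REQUIREMENTS) containing a numbered list of final instructions.
- Specify the output language ({{.Lang}}).
- Name the mandatory result tool (e.g., MUST use "{{.HackResultToolName}}" to deliver the final report).
- Place the {{.ToolPlaceholder}} variable at the very end of the prompt. This allows the system backend to correctly inject tool definitions for the LLM.
Modern LLM Instruction Following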
Literal Adherence vs. Intent Inference
Essential Context Variables
- {{.ExecutionContext}}: Critical. Provides structured details (IDs, status, titles, descriptions) about the current Flow, Task, and SubTask. Essential for scope and objective understanding.
- {{.Lang}}: Specifies the preferred language for agent responses and reports.
- {{.CurrentTime}}: Provides the execution timestamp for context.
- {{.DockerImage}}: Name of the Docker image the agent operates within.
- {{.Cwd}}: Default working directory inside the Docker container.
- {{.ContainerPorts}}: Available/mapped ports within the container environment.
Standardized Tool Name Variables
- {{.SearchToolName}}
- {{.PentesterToolName}}
- {{.CoderToolName}}
- {{.AdviceToolName}}
- {{.MemoristToolName}}
- {{.MaintenanceToolName}}
- {{.SearchGuideToolName}} (Retrieve Guide)
- {{.StoreGuideToolName}} (Store Guide)
- {{.SearchAnswerToolName}} (Retrieve Answer/General)
- {{.StoreAnswerToolName}} (Store Answer/General)
- {{.SearchCodeToolName}} (Likely needed) (Retrieve Code Snippet)
- {{.StoreCodeToolName}} (Likely needed) (Store Code Snippet)
- {{.HackResultToolName}} (Pentester Final Report)
- {{.SearchResultToolName}} (Searcher Final Report)
- {{.FinalyToolName}} (Orchestrator Subtask Completion Report)
- {{.SummarizationToolName}} (System Use Only - Marker for historical summaries)
- {{.TerminalToolName}} (Assumed name for terminal function)
- {{.FileToolName}} (Assumed name for file operations function)
- {{.BrowserToolName}} (Assumed name for browser/scraping function)
Effective Patterns
Common Anti-Patterns
Systematic Diagnosis
Improvement Metrics
Text-Visual Integration
- Orchestrator (primary agent):
  - Key sections: TEAM CAPABILITIES, OPERATIONAL PROTOCOLS (esp. Task Analysis, Boundaries, Delegation Efficiency), DELEGATION PROTOCOL, SUMMARIZATION AWARENESS PROTOCOL, COMPLETION REQUIREMENTS (using {{.FinalyToolName}}).
  - Critical instruction: report subtask completion via {{.FinalyToolName}}.
- Pentester:
  - Focus: security tool usage (nmap, sqlmap, etc.), vulnerability exploitation, evidence collection and documentation.
  - Key sections: KNOWLEDGE MANAGEMENT (Memory Protocol), OPERATIONAL ENVIRONMENT (Container Constraints), COMMAND EXECUTION RULES (Terminal Protocol), PENETRATION TESTING TOOLS (list available), TEAM COLLABORATION, DELEGATION PROTOCOL, SUMMARIZATION AWARENESS PROTOCOL, COMPLETION REQUIREMENTS (using {{.HackResultToolName}}).
  - Critical instruction: deliver the final report via {{.HackResultToolName}}.
- Searcher:
  - Key sections: CORE CAPABILITIES (Action Economy, Search Optimization), SEARCH TOOL DEPLOYMENT MATRIX, OPERATIONAL PROTOCOLS (Search Efficiency, Query Engineering), SUMMARIZATION AWARENESS PROTOCOL, SEARCH RESULT DELIVERY (using {{.SearchResultToolName}}).
  - Critical instructions: check memory first ({{.SearchAnswerToolName}}), strictly limit the number of search actions, use the right tool for the query complexity (Matrix), stop searching once sufficient information is gathered, deliver concise yet comprehensive synthesized results via {{.SearchResultToolName}}.
(Guidelines for Developer, Adviser, Memorist, Installer agents should be developed following this structure, focusing on their unique roles, tools, and interactions based on their specific implementations and prompt templates).
- Store prompt templates in the backend/pkg/templates/prompts/ directory.
- Name template files <agent_role>[_optional_specifier].tmpl.
- Document the purpose, the required variables ({{.Variable}}), and the general input/output behavior for each prompt template. Ensure this documentation stays synchronized with the backend code that populates the variables.
- Use the ctester utility (backend/cmd/ctester/) for validating LLM provider compatibility and basic prompt adherence (e.g., JSON formatting, function calling capabilities) for different agent types. Reference development-workflow.mdc / README.md.
- Use the ftester utility (backend/cmd/ftester/) for in-depth testing of specific agent functions and prompt behaviors within realistic contexts (Flow/Task/SubTask). This is crucial for debugging complex interactions and prompt logic.
- Refine prompts iteratively using ctester, ftester, and Langfuse analysis. Test changes thoroughly before deployment.
(Refer to the actual, up-to-date files in backend/pkg/templates/prompts/ such as primary_agent.tmpl, pentester.tmpl, and searcher.tmpl for concrete implementation patterns that follow these guidelines.)
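
Two of the conventions in this guide, the `<agent_role>[_optional_specifier].tmpl` file name and the trailing `{{.ToolPlaceholder}}`, are mechanical enough to check before running ctester or ftester. The Go sketch below is a hypothetical static check, not part of either utility; the `checkTemplate` name and the exact character set allowed by `namePattern` (lowercase letters, digits, underscores) are assumptions.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// namePattern approximates <agent_role>[_optional_specifier].tmpl.
// The allowed characters are an assumption, not a documented rule.
var namePattern = regexp.MustCompile(`^[a-z][a-z0-9]*(_[a-z0-9]+)*\.tmpl$`)

// checkTemplate returns a list of convention violations for one template.
func checkTemplate(fileName, text string) []string {
	var problems []string
	if !namePattern.MatchString(fileName) {
		problems = append(problems, "file name does not match <agent_role>[_optional_specifier].tmpl")
	}
	// Trailing whitespace is ignored when checking the final placeholder.
	if !strings.HasSuffix(strings.TrimSpace(text), "{{.ToolPlaceholder}}") {
		problems = append(problems, "template does not end with {{.ToolPlaceholder}}")
	}
	return problems
}

func main() {
	good := "## COMPLETION REQUIREMENTS\n1. Respond in {{.Lang}}.\n\n{{.ToolPlaceholder}}\n"
	fmt.Println(len(checkTemplate("pentester.tmpl", good)), "problems")

	for _, p := range checkTemplate("Pentester.txt", "no placeholder here") {
		fmt.Println("problem:", p)
	}
}
```

A check like this only catches structural slips; whether a prompt actually steers the model as intended still has to be judged with ctester, ftester, and Langfuse traces.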