documentation/blog/2025-07-28-streamlining-detection-development-with-goose-recipes/index.md
Creating effective security detections in Panther traditionally requires deep knowledge of detection logic, testing frameworks, and development workflows. The detection engineering team at Block has streamlined this process by building Goose recipes that automate the entire detection creation lifecycle, from initial repository setup to pull request creation.
This blog post explores how to leverage Goose's recipe and subrecipe system to create new detections in Panther with minimal manual intervention, ensuring consistency, quality, and adherence to team standards.
<!-- truncate -->

Recipes are reusable, shareable configurations that package up a complete setup for a specific task. These standalone files can be used in automation workflows that orchestrate complex tasks by breaking them into manageable, specialized components. Think of them as sophisticated CI/CD pipelines for AI-assisted development, where each step has clearly defined inputs, outputs, and responsibilities.
Two notable ingredients of a recipe are `instructions` and `prompt`. In short, the `instructions` provide persistent context for the agent (e.g. scope boundaries and pointers to reference files such as `AGENTS.md`), while the `prompt` defines the concrete task the agent should carry out.
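To make this concrete, a minimal recipe sketch might look like the following (the field names follow Goose's recipe format; the title, parameter, and instruction text here are illustrative, not our production recipe):

```yaml
version: 1.0.0
title: create-panther-detection
description: Orchestrates the end-to-end detection creation workflow
parameters:
  - key: rule_description
    input_type: string
    requirement: required
    description: What the detection should identify
instructions: |
  You are a detection engineering assistant.
  SCOPE BOUNDARIES:
  - You MUST follow the standards summarized from AGENTS.md.
  - You MUST NOT push directly to the main branch.
prompt: |
  Create a new Panther detection for: {{ rule_description }}
```

The `instructions` travel with every turn of the session, while the `prompt` kicks off the actual task.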
The detection creation recipe demonstrates the power of this approach by coordinating six specialized subrecipes, each handling a specific aspect of detection development:
1. [**workflow_setup**](#1-workflow_setup-foundation-first) - Repository preparation and environment validation
2. [**similar_rule_analyzer**](#2-similar_rule_analyzer-learning-from-existing-patterns) - Finding and analyzing existing detection patterns
3. [**schema_and_sample_events_analyzer**](#3-schema_and_sample_events_analyzer-data-driven-detection-logic) - Analyzing log schemas and performing sample data collection
4. [**rule_creator**](#4-rule_creator-the-implementation-engine) - Actual detection rule implementation
5. [**testing_validator**](#5-testing_validator-quality-assurance) - Comprehensive test execution and validation
6. [**pr_creator**](#6-pr_creator-automated-pull-request-pipeline) - Pull request creation with proper formatting
### What about .goosehints?
In our [previous post](https://goose-docs.ai/blog/2025/06/02/goose-panther-mcp), we discussed using [.goosehints](/docs/guides/context-engineering/using-goosehints/) to provide persistent context to the Large Language Model (LLM). We continue to use `.goosehints` to define coding standards and universal preferences that guide LLM behavior.
However, to minimize redundancy and avoid conflicting guidance, we adopted a single reference file, `AGENTS.md`, as the source of truth for all agents. Each agent is directed to consult this file, while still supporting agent-specific instructions through their default context files (e.g. `.goosehints`, `CLAUDE.md` etc.) or rules (e.g. `.cursor/rules/`).
While these context files are important, they also come with some trade-offs and limitations:
| Aspect | Context Files | Recipes |
|--------|---------------|---------|
| **Context window pollution** | The entire file is sent with each request, cluttering the context window | Only task-relevant instructions, keeping prompts clear and focused |
| **Signal-to-noise ratio** | General preferences dilute focus and may create conflicting guidance | Every instruction is workflow-specific, eliminating noise |
| **Cost and performance impact** | May lead to higher token costs and slower processing from unnecessary context | Pay only for relevant tokens with faster response times |
| **Cognitive load on the AI** | Conflicting instructions cause decision paralysis | Clear, unified guidance enables decisive action |
| **Task-specific optimization** | Generic instructions lack specialized tools and parameters | Purpose-built with pre-configured tools for specific workflows |
This centralized approach through `AGENTS.md` becomes the foundation for our recipe architecture, which we'll explore next.
## The Architecture
### Design Principles
1. **Single Responsibility**: Each subrecipe has one clear job
2. **Explicit Data Flow**: No hidden state or implicit dependencies
3. **Fail-Fast**: Stop immediately when critical steps fail
4. **Graceful Degradation**: Continue with reduced functionality when possible
5. **Comprehensive Testing**: Validate everything before deployment
### Why Subrecipes Matter
The traditional approach to AI-assisted detection creation often involves a single, monolithic prompt (AKA “single-shot prompting”) that tries to handle everything at once. This leads to several problems:
- **Context confusion**: The AI loses focus when juggling multiple responsibilities
- **Inconsistent outputs**: Without clear boundaries, results vary significantly (e.g. one subrecipe may try to complete the task that we're expecting another subrecipe to accomplish)
- **Difficult debugging**: When something fails, it's hard to identify the specific issue
- **Poor maintainability**: Changes to one aspect affect the entire workflow
The subrecipe architecture solves these problems through strict separation of concerns, setting boundaries and providing exit criteria.
Each subrecipe operates in isolation with:
- Clearly defined inputs and outputs
- Specific scope boundaries (what it MUST and MUST NOT do)
- Standardized JSON response schemas
- Formal error handling patterns
At a high level, a (non-parallel) version would look like:
| Step | Component | Type | Description |
|------|-----------|------|-------------|
| **1** | [`workflow_setup`](#1-workflow_setup-foundation-first) | Required | Initialize workflow environment |
| **2** | [`similar_rule_analyzer`](#2-similar_rule_analyzer-learning-from-existing-patterns) | *Conditional* | Analyze existing similar rules |
| **3** | [`schema_and_sample_events_analyzer`](#3-schema_and_sample_events_analyzer-data-driven-detection-logic) | *Conditional* | Process schema and sample data |
| **4** | [`rule_creator`](#4-rule_creator-the-implementation-engine) | Required | Generate the detection rule |
| **5** | [`testing_validator`](#5-testing_validator-quality-assurance) | Required | Validate and test the rule |
| **6** | [`pr_creator`](#6-pr_creator-automated-pull-request-pipeline) | *Conditional* | Create pull request |
> 💡 **Note:** *Conditional* steps may be skipped based on workflow configuration
<details>
<summary>
Workflow visualized
</summary>

</details>
## Data Flow and State Management
Since subrecipes currently run in isolation, data must be explicitly passed between them. The main recipe orchestrates this flow:
An example of how the data flows between subrecipes:

```text
workflow_setup(rule_description)
  → { branch_name: "ai/aws-privilege-escalation",
      standards_summary: "Key requirements from AGENTS.md...",
      repo_ready: true,
      mcp_panther: { access_test_successful: true } }

similar_rule_analyzer(rule_description, standards_summary)
  → { similar_rules_found: [...],
      rule_analysis: "Analysis of existing patterns...",
      suggested_approach: "Create new rule with modifications..." }
```
This explicit data passing ensures:
- **Predictable behavior** across runs
- **Easy debugging** when issues occur
- **Clear audit trails** of what data influenced each decision
- **Modular testing** of individual components
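The hand-off can be pictured as a simple pipeline in which each step's declared outputs are threaded into later steps' inputs. A hypothetical Python sketch (in practice the main recipe performs this orchestration; the stub functions and return values below are illustrative):

```python
# Hypothetical sketch of explicit data passing between subrecipe steps.
# Stub functions stand in for the real subrecipes.

def workflow_setup(rule_description: str) -> dict:
    # Real version: prepare the repo, create a branch, summarize AGENTS.md.
    return {
        "branch_name": "ai/aws-privilege-escalation",
        "standards_summary": "Key requirements from AGENTS.md...",
        "repo_ready": True,
    }

def similar_rule_analyzer(rule_description: str, standards_summary: str) -> dict:
    # Real version: search team-owned rule directories for comparable detections.
    return {
        "similar_rules_found": [],
        "suggested_approach": "Create new rule with modifications...",
    }

def run_pipeline(rule_description: str) -> dict:
    """Thread each step's declared outputs into the next step's inputs."""
    state = {"rule_description": rule_description}
    state.update(workflow_setup(rule_description))
    # Later steps receive only explicitly passed fields -- no hidden state:
    state.update(similar_rule_analyzer(rule_description, state["standards_summary"]))
    return state
```

Because every field a later step sees was explicitly returned by an earlier one, each step can be tested in isolation with a handcrafted `state` dict.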
## Smart Optimizations: Conditional Execution
One of the most powerful features of the detection creation workflow is its intelligent optimization system that skips unnecessary steps based on both parameters and runtime conditions.
### Parameter-Based Conditions
Users can control workflow behavior through parameters:
```shell
# Fast mode - skip similar rule analysis
goose run --recipe recipe.yaml \
  --params skip_similar_rules_check=true \
  --params rule_description="What you want to detect"

# Skip Panther MCP integration
goose run --recipe recipe.yaml \
  --params skip_panther_mcp=true \
  --params rule_description="What you want to detect"

# Create PR automatically
goose run --recipe recipe.yaml \
  --params create_pr=true \
  --params rule_description="What you want to detect"
```

### Runtime Conditions

The workflow also makes intelligent decisions based on results from previous steps:

```text
# Current implementation uses both parameter-based and runtime conditions

# Parameter-based (available at recipe start):
- skip_similar_rules_check: Controls similar_rule_analyzer execution
- skip_panther_mcp: Controls schema_and_sample_events_analyzer execution
- create_pr: Controls pr_creator execution

# Runtime conditions (based on subrecipe results):
- schema_and_sample_events_analyzer runs only if:
  * skip_panther_mcp is false AND
  * (similar_rules_found is empty OR mcp_panther.access_test_successful is false)
```

This hybrid approach combines up-front user control with runtime adaptability.
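The runtime decision above reduces to a small boolean expression. A hypothetical Python sketch of that check (the parameter and field names are illustrative):

```python
def should_run_schema_analyzer(skip_panther_mcp: bool,
                               similar_rules_found: list,
                               mcp_access_ok: bool) -> bool:
    """Mirror the runtime condition: run only when MCP isn't skipped AND
    either no similar rules were found or the MCP access test failed."""
    return (not skip_panther_mcp) and (len(similar_rules_found) == 0 or not mcp_access_ok)
```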
Additionally, Jinja support enables the codification of event triggers, ensuring the agent adheres to predefined instructions rather than making independent, potentially incorrect, decisions. For instance, the agent can be directed to bypass a step, depending on a parameter's value:
```jinja
{% if create_pr %}
6. `pr_creator(rule_files_created, rule_description, branch_name, create_pr={{ create_pr }}, panther_mcp_usage)` → Returns:
   {
     "success": true,
     "data": {
       "pr_created": true,
       "pr_url": "https://github.com/<org>/<team>-panther-content/pull/123",
       "pr_number": 123,
       "summary": "Summary of the completed work"
     }
   }
{% else %}
6. **SKIPPED** `pr_creator` - create_pr parameter is false
   - Provide final summary of completed work instead
{% endif %}
```
### 1. workflow_setup: Foundation First

| Input | Output |
|---|---|
| rule_description | branch_name, standards_summary, repo_ready, mcp_panther |
This subrecipe handles the foundational work: preparing the repository, creating a working branch, summarizing the team standards defined in `AGENTS.md`, and verifying Panther MCP access.

Output example:
```json
{
  "status": { "success": true },
  "data": {
    "branch_name": "ai/okta-suspicious-login",
    "standards_summary": "Rules must use ai_ prefix, implement required functions...",
    "repo_ready": true,
    "mcp_panther": { "access_test_successful": true }
  }
}
```
### 2. similar_rule_analyzer: Learning from Existing Patterns

| Input | Output |
|---|---|
| rule_description, standards_summary, rule_type | similar_rules_found, rule_analysis, suggested_approach |
This subrecipe searches the repository for similar detection patterns:

```text
# Search strategy by rule type:
- streaming rules: Search rules/<team>_rules/
- correlation rules: Search correlation_rules/<team>_correlation_rules/
- scheduled rules: Search queries/<team>_queries/
```
**Key insight:** it prioritizes team-created rules (`<team>_*` directories) over upstream rules, ensuring consistency with established patterns.
Even without direct access to the detection engine, users can develop new detections by leveraging existing ones, along with our established standards and test suite.
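A hypothetical sketch of the kind of keyword search this step performs over the rule directories (the directory names, scoring, and threshold below are illustrative, not the subrecipe's actual logic):

```python
from pathlib import Path

# Illustrative mapping from rule type to the directory searched first;
# team-created directories are preferred over upstream content.
SEARCH_PATHS = {
    "streaming": "rules/team_rules",
    "correlation": "correlation_rules/team_correlation_rules",
    "scheduled": "queries/team_queries",
}

def find_similar_rules(rule_description: str, rule_type: str, repo_root: str = ".") -> list:
    """Return rule files whose contents share keywords with the description."""
    keywords = {w.lower() for w in rule_description.split() if len(w) > 3}
    matches = []
    base = Path(repo_root) / SEARCH_PATHS[rule_type]
    for path in sorted(base.glob("**/*.py")):
        text = path.read_text(errors="ignore").lower()
        score = sum(1 for kw in keywords if kw in text)
        if score >= 2:  # require at least two keyword hits
            matches.append(str(path))
    return matches
```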
### 3. schema_and_sample_events_analyzer: Data-Driven Detection Logic

| Input | Output |
|---|---|
| rule_description, similar_rules_found | log_schemas, example_logs, field_mapping, panther_mcp_usage |
This subrecipe bridges the gap between detection requirements and implementation by leveraging Panther's MCP integration to retrieve relevant log schemas and sample events.

Output example:
```json
{
  "status": { "success": true },
  "data": {
    "log_schemas": [{
      "log_type": "AWS.CloudTrail",
      "schema_summary": "Contains eventName, sourceIPAddress, userIdentity fields",
      "relevance": "Essential for detecting privilege escalation patterns"
    }],
    "example_logs": [{
      "log_type": "AWS.CloudTrail",
      "event_summary": "AssumeRole events with cross-account access",
      "key_fields": ["eventName", "sourceIPAddress", "userIdentity.type"]
    }],
    "panther_mcp_usage": {
      "mcp_used": true,
      "log_schemas_referenced": true,
      "data_lake_queries_performed": true
    }
  }
}
```
Fallback handling: When Panther MCP is unavailable, it intelligently uses similar rule analysis to infer schema structure, ensuring the workflow continues with reduced but functional capability.
### 4. rule_creator: The Implementation Engine

| Input | Output |
|---|---|
| rule_description, similar_rules_found, rule_analysis, log_schemas, standards_summary | rule_files_created, rule_implementation, test_cases_created |
This is where the magic happens: this subrecipe generates the required files containing the detection logic, metadata, and unit tests. It also performs smart log source validation.

Among the key principles it enforces: prefer `any()` and `all()` over multiple `return` statements. To illustrate, the following example provides guidance for this principle:
💡 **Code Quality Tip: Simplify Conditional Logic**

❌ **Avoid: Too Many Return Statements**

```python
# multiple returns make logic hard to follow
def rule(event) -> bool:
    if event.deep_get("eventType", default="") != "user.session.start":
        return False
    if event.deep_get("outcome", "result", default="") != "SUCCESS":
        return False
    if event.deep_get("actor", "alternateId", default="").lower() == TARGET_USER.lower():
        return True
    return False
```

✅ **Preferred: Clear Structure with `any()` and `all()`**

```python
def rule(event) -> bool:
    return all([
        event.deep_get("eventType", default="") == "user.session.start",
        event.deep_get("outcome", "result", default="") == "SUCCESS",
        event.deep_get("actor", "alternateId", default="").lower() == TARGET_USER.lower(),
    ])
```
### 5. testing_validator: Quality Assurance

| Input | Output |
|---|---|
| rule_files_created | test_results, validation_status, issues_found |
This subrecipe serves as the critical quality gate, executing the mandatory testing pipeline that ensures every detection meets production standards.
Key responsibilities: execute the mandatory checks defined in `AGENTS.md` (e.g. linting, formatting, and both unit tests and pytests).

These checks ensure detections meet our standards, preventing subpar code from being merged. Should a check fail, the LLM iterates, identifying and implementing the necessary changes until compliance is achieved within the same recipe run.
Intelligent failure analysis: The subrecipe doesn't just run tests - it analyzes failures and provides specific guidance:
```json
{
  "test_results": {
    "tests_passed": 3,
    "tests_failed": 1,
    "test_details": [{
      "test_name": "make lint",
      "status": "failed",
      "message": "pylint: missing default value in deep_get() call"
    }]
  },
  "recommendations": [
    "Add default values to all deep_get() calls per AGENTS.md standards",
    "Reference 'Core Coding Standards' section for proper error handling"
  ]
}
```
Output example:
```json
{
  "status": { "success": true },
  "data": {
    "test_results": {
      "tests_passed": 4,
      "tests_failed": 0,
      "test_details": [
        { "test_name": "make fmt", "status": "passed", "message": "All files formatted correctly" },
        { "test_name": "make lint", "status": "passed", "message": "No linting issues found" },
        { "test_name": "make test", "status": "passed", "message": "Rule tests passed: 2/2" },
        { "test_name": "make pytest-all", "status": "passed", "message": "All unit tests passed" }
      ]
    },
    "validation_summary": "All mandatory tests passed. Rule ready for PR creation.",
    "recommendations": []
  }
}
```
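In orchestrator terms, this quality gate reduces to a fail-fast check on the validator's response. A hypothetical sketch against the JSON shape shown above (the function name and policy are illustrative):

```python
def can_proceed_to_pr(validator_response: dict) -> bool:
    """Fail fast: only continue to pr_creator when every mandatory test passed."""
    results = validator_response["data"]["test_results"]
    return (validator_response["status"]["success"]
            and results["tests_failed"] == 0
            and results["tests_passed"] > 0)

# Minimal illustrative responses in the validator's output shape:
passing = {
    "status": {"success": True},
    "data": {"test_results": {"tests_passed": 4, "tests_failed": 0}},
}
failing = {
    "status": {"success": True},
    "data": {"test_results": {"tests_passed": 3, "tests_failed": 1}},
}
```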
### 6. pr_creator: Automated Pull Request Pipeline

| Input | Output |
|---|---|
| rule_files_created, rule_description, branch_name, create_pr, panther_mcp_usage | pr_created, pr_url, pr_number, summary |
This subrecipe handles the final workflow step with full adherence to team standards.

Intelligent PR creation:

- Creates a pull request only when `create_pr=true`; otherwise provides a final summary
- Populates the PR template per `AGENTS.md` standards

Git operation standards:

- Never uses `--no-verify` flags; fixes issues rather than bypassing them

Output example:
```json
{
  "status": { "success": true },
  "data": {
    "pr_url": "https://github.com/<org>/<team>-panther-content/pull/123",
    "pr_number": 123,
    "branch_name": "ai/aws-privilege-escalation",
    "commit_hash": "abc123def",
    "files_committed": [
      "rules/<team>_rules/ai_aws_privilege_escalation.py",
      "rules/<team>_rules/ai_aws_privilege_escalation.yml"
    ]
  }
}
```
Quality assurance: This subrecipe includes comprehensive error handling for git failures, PR creation issues, and template population problems, providing clear fallback instructions when automation fails.
## Error Handling

This workflow implements sophisticated error handling with intelligent stopping points. Every subrecipe uses a consistent JSON response format:
```json
{
  "status": {
    "success": boolean,
    "error": "Error message if failed",
    "error_type": "categorized_error_type"
  },
  "data": { /* Actual response data */ },
  "partial_results": { /* Optional partial data */ }
}
```
This workflow distinguishes between different types of failures. Each subrecipe's response has an `error_type` field: when a failure occurs, the LLM categorizes the error encountered and surfaces this information to the main recipe so it can determine what to do next.
As an example, rule_creator is configured with these error categories:
```yaml
response:
  json_schema:
    type: object
    properties:
      status:
        type: object
        properties:
          ...
          error_type:
            type: string
            enum: ["git_operation_failed", "pr_creation_failed", "template_population_failed", "validation_failed"]
            description: "Category of error for debugging purposes"
          ...
```
If this subrecipe returns `file_creation_failed`, we shouldn't move on to the `testing_validator` or `pr_creator` steps. This fail-fast approach prevents wasted effort on subsequent steps that are bound to fail.
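The main recipe can then dispatch on the categorized error. A hypothetical sketch (the category names mirror the enum above plus the `file_creation_failed` case mentioned in the text; the halt/continue policy itself is illustrative):

```python
# Errors that should halt the workflow immediately (fail-fast): nothing
# downstream can succeed without the files or a valid rule in place.
FATAL_ERRORS = {"file_creation_failed", "validation_failed", "git_operation_failed"}

# Errors where we can still hand the user a summary and manual next steps.
RECOVERABLE_ERRORS = {"pr_creation_failed", "template_population_failed"}

def next_action(response: dict) -> str:
    """Decide what the main recipe does after a subrecipe returns."""
    status = response["status"]
    if status["success"]:
        return "continue"
    if status.get("error_type") in FATAL_ERRORS:
        return "halt"
    return "summarize_and_exit"
```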
Putting it all together, here are some example invocations:

```shell
# Create a detection without creating a PR or similar rule/Panther MCP analysis
goose run --recipe recipe.yaml \
  --params skip_similar_rules_check=true \
  --params skip_panther_mcp=true \
  --params rule_description="Create an AWS CloudTrail detection to identify new regions being enabled without any associated errorCodes"

# Full workflow with schema/event sampling and automatic PR creation
goose run --recipe recipe.yaml --interactive \
  --params skip_similar_rules_check=true \
  --params skip_panther_mcp=false \
  --params create_pr=true \
  --params rule_description="Create a Panther rule that will detect when the user [email protected] successfully logs in to Okta from a Windows system"
```
The recipe system ensures compliance with team standards:

- The `workflow_setup` subrecipe extracts key requirements from `AGENTS.md` (e.g. the `ai_` prefix for AI-created rules).
- The `pr_creator` subrecipe follows team conventions (e.g. branch names of the form `ai/<description>`).
- The workflow integrates with Panther's Model Context Protocol (MCP) for schema analysis and sample event collection.
These benefits extend to security teams, AI development practice, and organizations as a whole.
Goose's recipe and subrecipe system represents a significant advancement in AI-assisted security detection development. By breaking complex workflows into specialized, composable components, teams can achieve consistent, high-quality detections with far less manual effort.
The detection creation recipe demonstrates how thoughtful architecture and clear separation of concerns can transform a complex, error-prone manual process into a reliable, automated workflow.
Whether you're building your first Goose recipe or looking to optimize existing workflows, the patterns and principles outlined here provide a solid foundation for successful automation.
A few practices worth highlighting:

- Never bypass failing checks with `--no-verify` on git commands; fix the underlying issues instead.
- Use a single reference file (`AGENTS.md`) for all AI agents: point agent-specific context files (`.goosehints`, `CLAUDE.md`, `.cursor/rules/*`, etc.) back to this file.
- Structure the content of `AGENTS.md` for easier parsing and reuse across agents.