Configuration Format Documentation

This document provides comprehensive documentation for the workflow comparison configuration format.

Overview
Configuration File Format
Top-Level Fields
Cost Configuration
Similarity Groups
Ignore Rules
Parameter Comparison Rules
Exemptions
Connection Rules
Output Configuration
Examples

Overview

Configuration files can be written in either YAML or JSON format. YAML is recommended for readability and easier maintenance. The configuration controls:

Cost weights for different types of graph edits
Similarity groups to treat similar node types as equivalent
Ignore rules to exclude certain nodes or parameters
Parameter comparison rules for flexible matching
Exemptions for optional nodes
Output formatting preferences

Configuration File Format

YAML Example

yaml

version: "1.0"
name: "my-config"
description: "My custom configuration"

costs:
  nodes:
    insertion: 10.0
    deletion: 10.0
    # ... more costs

similarity_groups:
  triggers:
    - "n8n-nodes-base.webhook"
    - "n8n-nodes-base.manualTrigger"

ignore:
  node_types:
    - "n8n-nodes-base.stickyNote"

parameter_comparison:
  numeric_tolerance:
    - parameter: "options.temperature"
      tolerance: 0.1

output:
  max_edits: 15

JSON Example

json

{
  "version": "1.0",
  "name": "my-config",
  "description": "My custom configuration",
  "costs": {
    "nodes": {
      "insertion": 10.0,
      "deletion": 10.0
    }
  },
  "similarity_groups": {
    "triggers": [
      "n8n-nodes-base.webhook",
      "n8n-nodes-base.manualTrigger"
    ]
  }
}

Top-Level Fields

`version` (string, required)

Configuration format version. Currently "1.0".

yaml

version: "1.0"

`name` (string, optional)

A unique identifier for this configuration.

yaml

name: "my-custom-config"

`description` (string, optional)

Human-readable description of what this configuration does.

yaml

description: "Strict comparison for production workflows"

Cost Configuration

The costs section defines penalties for different graph edit operations. These costs directly impact the similarity score.

Structure

yaml

costs:
  nodes:
    insertion: <float>
    deletion: <float>
    substitution:
      same_type: <float>
      similar_type: <float>
      different_type: <float>
      trigger_mismatch: <float>

  edges:
    insertion: <float>
    deletion: <float>
    substitution: <float>

  parameters:
    mismatch_weight: <float>
    nested_weight: <float>

Node Costs

`costs.nodes.insertion` (float, default: 10.0)

Cost penalty when a node exists in the ground truth but is missing from the generated workflow.

Use case: Set higher for stricter matching (e.g., 15.0), lower for lenient matching (e.g., 5.0).

yaml

costs:
  nodes:
    insertion: 10.0

`costs.nodes.deletion` (float, default: 10.0)

Cost penalty when a node exists in the generated workflow but not in the ground truth.

Use case: Set higher to penalize extra nodes more severely.

yaml

costs:
  nodes:
    deletion: 15.0

`costs.nodes.substitution.same_type` (float, default: 1.0)

Cost when two nodes have the same type but different parameters.

Use case:

Low values (0.5-1.0): Allow parameter variations
High values (2.0-5.0): Require exact parameter matches

yaml

costs:
  nodes:
    substitution:
      same_type: 1.0

`costs.nodes.substitution.similar_type` (float, default: 5.0)

Cost when two nodes are in the same similarity group (see Similarity Groups).

Example: Replacing lmChatOpenAi with lmChatAnthropic (both are LLMs).

yaml

costs:
  nodes:
    substitution:
      similar_type: 5.0

`costs.nodes.substitution.different_type` (float, default: 15.0)

Cost when replacing a node with a completely different type.

Example: Replacing httpRequest with webhook.

yaml

costs:
  nodes:
    substitution:
      different_type: 15.0

`costs.nodes.substitution.trigger_mismatch` (float, default: 50.0)

Special high-cost penalty for trigger node mismatches. Triggers are critical to workflow functionality.

Use case: Keep this high (50.0-100.0) to ensure trigger correctness.

yaml

costs:
  nodes:
    substitution:
      trigger_mismatch: 50.0

Edge Costs

`costs.edges.insertion` (float, default: 5.0)

Cost for a missing connection between nodes.

yaml

costs:
  edges:
    insertion: 5.0

`costs.edges.deletion` (float, default: 5.0)

Cost for an extra connection that shouldn't exist.

yaml

costs:
  edges:
    deletion: 5.0

`costs.edges.substitution` (float, default: 3.0)

Cost for changing the type or properties of a connection.

yaml

costs:
  edges:
    substitution: 3.0

Parameter Costs

`costs.parameters.mismatch_weight` (float, default: 0.5)

Weight multiplier for parameter mismatches within a node.

Formula: parameter_cost = base_cost * mismatch_weight * num_mismatches

yaml

costs:
  parameters:
    mismatch_weight: 0.5

`costs.parameters.nested_weight` (float, default: 0.3)

Weight multiplier for nested/deep parameter differences.

Use case: Set lower to be more forgiving about deep configuration differences.

yaml

costs:
  parameters:
    nested_weight: 0.3

Similarity Groups

Similarity groups define sets of node types that should be considered "similar" rather than "different" when substituted. Nodes within the same group incur the similar_type cost instead of different_type.

Structure

yaml

similarity_groups:
  <group_name>:
    - "<node_type_1>"
    - "<node_type_2>"
    - "<node_type_3>"

Example

yaml

similarity_groups:
  triggers:
    - "n8n-nodes-base.webhook"
    - "n8n-nodes-base.manualTrigger"
    - "n8n-nodes-base.scheduleTrigger"

  ai_llms:
    - "@n8n/n8n-nodes-langchain.lmChatOpenAi"
    - "@n8n/n8n-nodes-langchain.lmChatAnthropic"
    - "@n8n/n8n-nodes-langchain.lmChatOllama"
    - "@n8n/n8n-nodes-langchain.lmChatMistralCloud"

  http_requests:
    - "n8n-nodes-base.httpRequest"
    - "@n8n/n8n-nodes-langchain.toolHttpRequest"

Common Similarity Groups

AI Agents

yaml

ai_agents:
  - "n8n-nodes-langchain.agent"
  - "@n8n/n8n-nodes-langchain.agent"
  - "n8n-nodes-langchain.basicAgent"

AI Tools

yaml

ai_tools:
  - "@n8n/n8n-nodes-langchain.toolHttpRequest"
  - "@n8n/n8n-nodes-langchain.toolCalculator"
  - "@n8n/n8n-nodes-langchain.toolCode"
  - "@n8n/n8n-nodes-langchain.toolWorkflow"

Ignore Rules

Ignore rules allow you to exclude certain nodes or parameters from comparison. This is useful for:

UI-only elements that don't affect workflow execution
Metadata fields like IDs and positions
Parameters that vary legitimately across implementations

Structure

yaml

ignore:
  node_types: [...]
  nodes: [...]
  global_parameters: [...]
  node_type_parameters: {...}
  parameter_paths: [...]

`ignore.node_types` (list of strings)

Completely ignore nodes of specific types.

Use case: Ignore decorative nodes like sticky notes.

yaml

ignore:
  node_types:
    - "n8n-nodes-base.stickyNote"
    - "n8n-nodes-base.comment"

`ignore.nodes` (list of objects)

Flexible rules for ignoring nodes based on name patterns or other criteria.

Structure:

yaml

ignore:
  nodes:
    - pattern: "<regex_pattern>"
      reason: "Why this is ignored"
    - name: "<exact_node_name>"
      reason: "Why this is ignored"
    - node_type: "<node_type>"
      reason: "Why this is ignored"

Example:

yaml

ignore:
  nodes:
    - pattern: "^Temp.*"
      reason: "Temporary debugging nodes"
    - name: "Development Only"
      reason: "Used only in development"

`ignore.global_parameters` (list of strings)

Parameter names to ignore across all node types.

Common use case: Ignore UI-specific metadata.

yaml

ignore:
  global_parameters:
    - "position"
    - "id"
    - "notes"
    - "notesInFlow"
    - "color"
    - "disabled"

`ignore.node_type_parameters` (object)

Parameters to ignore for specific node types.

Structure:

yaml

ignore:
  node_type_parameters:
    "<node_type>":
      - "<parameter_path_1>"
      - "<parameter_path_2>"

Example:

yaml

ignore:
  node_type_parameters:
    "@n8n/n8n-nodes-langchain.agent":
      - "options.systemMessage"  # Allow different prompts
      - "options.maxIterations"   # Allow iteration variance

    "n8n-nodes-base.httpRequest":
      - "options.timeout"  # Timeout can vary by environment

`ignore.parameter_paths` (list of strings)

Ignore parameters using path patterns. Supports wildcards:

* - matches any single path segment
** - matches any number of path segments

Example:

yaml

ignore:
  parameter_paths:
    - "options.*.timeout"      # Ignore timeout in any option
    - "**.temperature"         # Ignore temperature at any nesting level
    - "options.advanced.**"    # Ignore all advanced options

Parameter Comparison Rules

Parameter comparison rules allow for flexible matching of specific parameters, such as numeric tolerance or semantic similarity.

Structure

yaml

parameter_comparison:
  fuzzy_match: [...]
  numeric_tolerance: [...]

Fuzzy Match Rules

For semantic or approximate text matching.

Structure:

yaml

parameter_comparison:
  fuzzy_match:
    - parameter: "<parameter_path>"
      type: "semantic"
      threshold: <float>
      cost_if_below: <float>
      options:
        <key>: <value>

Example:

yaml

parameter_comparison:
  fuzzy_match:
    - parameter: "options.systemMessage"
      type: "semantic"
      threshold: 0.8
      cost_if_below: 3.0
      options:
        model: "sentence-transformers"

Numeric Tolerance Rules

For numeric parameters that should be "close enough" rather than exact.

Structure:

yaml

parameter_comparison:
  numeric_tolerance:
    - parameter: "<parameter_path>"
      tolerance: <float>
      cost_if_exceeded: <float>

Example:

yaml

parameter_comparison:
  numeric_tolerance:
    - parameter: "options.temperature"
      tolerance: 0.1
      cost_if_exceeded: 2.0

    - parameter: "options.maxTokens"
      tolerance: 100
      cost_if_exceeded: 1.0

    - parameter: "options.topP"
      tolerance: 0.05
      cost_if_exceeded: 1.5

How it works:

If |value1 - value2| <= tolerance, parameters are considered equal (no cost)
If |value1 - value2| > tolerance, cost_if_exceeded is added to the edit cost

Wildcard Support

Parameter paths support wildcards:

yaml

parameter_comparison:
  numeric_tolerance:
    - parameter: "options.*.temperature"
      tolerance: 0.1
      cost_if_exceeded: 2.0

This applies to options.llm.temperature, options.model.temperature, etc.

Exemptions

Exemptions reduce penalties for certain nodes that are optional or conditionally required.

Structure

yaml

exemptions:
  optional_in_generated: [...]
  optional_in_ground_truth: [...]

`exemptions.optional_in_generated` (list of objects)

Nodes that can be missing from the generated workflow without full penalty.

Use case: Ground truth has optional nodes that aren't critical.

Structure:

yaml

exemptions:
  optional_in_generated:
    - name_pattern: "<regex>"
      penalty: <float>
      reason: "Why this is optional"
    - node_type: "<node_type>"
      penalty: <float>
      when:
        <condition_key>: <condition_value>

Example:

yaml

exemptions:
  optional_in_generated:
    - node_type: "@n8n/n8n-nodes-langchain.memoryBufferWindow"
      penalty: 2.0
      reason: "Memory is optional for simple workflows"

    - name_pattern: ".*Debug.*"
      penalty: 1.0
      reason: "Debug nodes are optional in production"

`exemptions.optional_in_ground_truth` (list of objects)

Nodes that can exist in the generated workflow as extras without full penalty.

Use case: Generated workflow includes helpful but non-essential nodes.

Example:

yaml

exemptions:
  optional_in_ground_truth:
    - node_type: "n8n-nodes-base.set"
      penalty: 3.0
      reason: "Set nodes for data transformation are okay to add"

    - node_type: "@n8n/n8n-nodes-langchain.toolCalculator"
      penalty: 2.0
      reason: "Extra tools are acceptable"

Conditional Exemptions

Use the when clause to apply exemptions conditionally:

yaml

exemptions:
  optional_in_generated:
    - node_type: "n8n-nodes-base.errorTrigger"
      penalty: 1.0
      when:
        disabled: true
      reason: "Disabled error handlers are optional"

Connection Rules

Rules for handling workflow connections (edges).

Structure

yaml

connections:
  ignore_connection_types: [...]
  equivalent_types: [...]

`connections.ignore_connection_types` (list of strings)

Connection types to completely ignore during comparison.

yaml

connections:
  ignore_connection_types:
    - "main"  # Ignore main data flow connections

`connections.equivalent_types` (list of lists)

Define groups of connection types that should be treated as equivalent.

Example:

yaml

connections:
  equivalent_types:
    - ["main", "ai"]
    - ["error", "fallback"]

This means:

main and ai connections are interchangeable
error and fallback connections are interchangeable

Output Configuration

Controls how results are formatted and presented.

Structure

yaml

output:
  max_edits: <integer>
  group_by: "<grouping_strategy>"
  include_explanations: <boolean>
  include_suggestions: <boolean>

`output.max_edits` (integer, default: 15)

Maximum number of edit operations to return in the results.

Use case:

Set higher (e.g., 20-50) for detailed debugging
Set lower (e.g., 5-10) for quick summaries

yaml

output:
  max_edits: 15

`output.group_by` (string, default: "priority")

How to group edit operations in the output.

Options:

"priority": Group by priority (critical, major, minor)
"type": Group by edit type (node, edge, parameter)
"cost": Order by cost (highest first)

yaml

output:
  group_by: "priority"

`output.include_explanations` (boolean, default: true)

Include detailed explanations for each edit operation.

yaml

output:
  include_explanations: true

`output.include_suggestions` (boolean, default: true)

Include suggestions for how to fix issues.

yaml

output:
  include_suggestions: true

Examples

Example 1: Strict Production Configuration

For production workflows where exact matching is critical:

yaml

version: "1.0"
name: "production-strict"
description: "Strict matching for production workflows"

costs:
  nodes:
    insertion: 20.0
    deletion: 20.0
    substitution:
      same_type: 0.5
      similar_type: 10.0
      different_type: 30.0
      trigger_mismatch: 100.0

  edges:
    insertion: 10.0
    deletion: 10.0
    substitution: 5.0

  parameters:
    mismatch_weight: 1.0
    nested_weight: 0.8

similarity_groups:
  triggers:
    - "n8n-nodes-base.webhook"
    - "n8n-nodes-base.scheduleTrigger"

ignore:
  node_types:
    - "n8n-nodes-base.stickyNote"

  global_parameters:
    - "position"
    - "id"

parameter_comparison:
  numeric_tolerance:
    - parameter: "options.temperature"
      tolerance: 0.05
      cost_if_exceeded: 5.0

output:
  max_edits: 20
  group_by: "priority"
  include_explanations: true
  include_suggestions: true

Example 2: Lenient Development Configuration

For development workflows where flexibility is needed:

yaml

version: "1.0"
name: "development-lenient"
description: "Lenient matching for development and testing"

costs:
  nodes:
    insertion: 5.0
    deletion: 5.0
    substitution:
      same_type: 1.0
      similar_type: 3.0
      different_type: 8.0
      trigger_mismatch: 20.0

  edges:
    insertion: 2.0
    deletion: 2.0
    substitution: 1.0

  parameters:
    mismatch_weight: 0.3
    nested_weight: 0.1

similarity_groups:
  ai_llms:
    - "@n8n/n8n-nodes-langchain.lmChatOpenAi"
    - "@n8n/n8n-nodes-langchain.lmChatAnthropic"
    - "@n8n/n8n-nodes-langchain.lmChatOllama"

  ai_tools:
    - "@n8n/n8n-nodes-langchain.toolHttpRequest"
    - "@n8n/n8n-nodes-langchain.toolCalculator"
    - "@n8n/n8n-nodes-langchain.toolCode"

ignore:
  node_types:
    - "n8n-nodes-base.stickyNote"

  global_parameters:
    - "position"
    - "id"
    - "notes"
    - "notesInFlow"
    - "color"
    - "disabled"

  node_type_parameters:
    "@n8n/n8n-nodes-langchain.agent":
      - "options.systemMessage"
      - "options.maxIterations"

parameter_comparison:
  numeric_tolerance:
    - parameter: "options.temperature"
      tolerance: 0.2
      cost_if_exceeded: 1.0

    - parameter: "options.maxTokens"
      tolerance: 500
      cost_if_exceeded: 0.5

exemptions:
  optional_in_generated:
    - node_type: "@n8n/n8n-nodes-langchain.memoryBufferWindow"
      penalty: 1.0
      reason: "Memory is optional"

  optional_in_ground_truth:
    - node_type: "n8n-nodes-base.set"
      penalty: 2.0
      reason: "Data transformation nodes are okay to add"

output:
  max_edits: 10
  group_by: "priority"
  include_explanations: true
  include_suggestions: false

Example 3: AI-Specific Configuration

Optimized for AI workflow comparisons:

yaml

version: "1.0"
name: "ai-workflows"
description: "Specialized configuration for AI agent workflows"

costs:
  nodes:
    insertion: 10.0
    deletion: 10.0
    substitution:
      same_type: 1.0
      similar_type: 4.0
      different_type: 15.0
      trigger_mismatch: 50.0

  edges:
    insertion: 5.0
    deletion: 5.0
    substitution: 3.0

  parameters:
    mismatch_weight: 0.4
    nested_weight: 0.2

similarity_groups:
  ai_agents:
    - "n8n-nodes-langchain.agent"
    - "@n8n/n8n-nodes-langchain.agent"
    - "n8n-nodes-langchain.basicAgent"

  ai_llms:
    - "@n8n/n8n-nodes-langchain.lmChatOpenAi"
    - "@n8n/n8n-nodes-langchain.lmChatAnthropic"
    - "@n8n/n8n-nodes-langchain.lmChatOllama"
    - "@n8n/n8n-nodes-langchain.lmChatMistralCloud"
    - "@n8n/n8n-nodes-langchain.lmChatAws"

  ai_tools:
    - "@n8n/n8n-nodes-langchain.toolHttpRequest"
    - "@n8n/n8n-nodes-langchain.toolCalculator"
    - "@n8n/n8n-nodes-langchain.toolCode"
    - "@n8n/n8n-nodes-langchain.toolWorkflow"

  memory_types:
    - "@n8n/n8n-nodes-langchain.memoryBufferWindow"
    - "@n8n/n8n-nodes-langchain.memoryConversation"

ignore:
  node_types:
    - "n8n-nodes-base.stickyNote"

  global_parameters:
    - "position"
    - "id"
    - "notes"
    - "color"

  node_type_parameters:
    "@n8n/n8n-nodes-langchain.agent":
      - "options.systemMessage"  # Prompts can legitimately vary

    "@n8n/n8n-nodes-langchain.lmChatOpenAi":
      - "options.modelName"  # Different models okay

    "@n8n/n8n-nodes-langchain.lmChatAnthropic":
      - "options.modelName"

parameter_comparison:
  numeric_tolerance:
    - parameter: "**.temperature"
      tolerance: 0.15
      cost_if_exceeded: 2.0

    - parameter: "**.maxTokens"
      tolerance: 200
      cost_if_exceeded: 1.0

    - parameter: "**.topP"
      tolerance: 0.1
      cost_if_exceeded: 1.5

exemptions:
  optional_in_generated:
    - node_type: "@n8n/n8n-nodes-langchain.memoryBufferWindow"
      penalty: 2.0
      reason: "Memory is optional for stateless workflows"

    - node_type: "@n8n/n8n-nodes-langchain.toolCalculator"
      penalty: 3.0
      reason: "Calculator tool is optional"

  optional_in_ground_truth:
    - node_type: "@n8n/n8n-nodes-langchain.toolCode"
      penalty: 2.0
      reason: "Additional code tools are acceptable"

output:
  max_edits: 15
  group_by: "priority"
  include_explanations: true
  include_suggestions: true

Loading Configuration

From Preset

bash

# Python CLI
uvx --from . python -m compare_workflows workflow1.json workflow2.json --preset standard

# Python API
from config_loader import load_config
config = load_config("preset:standard")

From File

bash

# Python CLI
uvx --from . python -m compare_workflows workflow1.json workflow2.json --config my-config.yaml

# Python API
from config_loader import load_config
config = load_config("/path/to/my-config.yaml")

Programmatically

python

from config_loader import WorkflowComparisonConfig

# Create from dictionary
config_dict = {
    "version": "1.0",
    "name": "custom",
    "costs": {
        "nodes": {
            "insertion": 12.0
        }
    }
}

config = WorkflowComparisonConfig._from_dict(config_dict)

Best Practices

Start with a preset: Begin with standard, strict, or lenient and customize from there.
Test iteratively: Make small changes and test to understand the impact on similarity scores.
Use similarity groups: Group related node types to avoid harsh penalties for equivalent substitutions.
Ignore UI elements: Always ignore cosmetic parameters like position, id, color, etc.
Set appropriate tolerances: Use numeric tolerances for parameters that shouldn't need exact matches (e.g., temperature, maxTokens).
Document your changes: Use the description field and comments to explain why you made specific choices.
Version control: Keep configuration files in version control alongside your workflows.
Environment-specific configs: Create different configurations for development, testing, and production environments.

Configuration Format Documentation

Configuration Format Documentation

Table of Contents

Overview

Configuration File Format

YAML Example

JSON Example

Top-Level Fields

version (string, required)

name (string, optional)

description (string, optional)

Cost Configuration

Structure

Node Costs

costs.nodes.insertion (float, default: 10.0)

costs.nodes.deletion (float, default: 10.0)

costs.nodes.substitution.same_type (float, default: 1.0)

costs.nodes.substitution.similar_type (float, default: 5.0)

costs.nodes.substitution.different_type (float, default: 15.0)

costs.nodes.substitution.trigger_mismatch (float, default: 50.0)

Edge Costs

costs.edges.insertion (float, default: 5.0)

costs.edges.deletion (float, default: 5.0)

costs.edges.substitution (float, default: 3.0)

Parameter Costs

costs.parameters.mismatch_weight (float, default: 0.5)

costs.parameters.nested_weight (float, default: 0.3)

Similarity Groups

Structure

Example

Common Similarity Groups

AI Agents

AI Tools

Ignore Rules

Structure

ignore.node_types (list of strings)

ignore.nodes (list of objects)

ignore.global_parameters (list of strings)

ignore.node_type_parameters (object)

ignore.parameter_paths (list of strings)

Parameter Comparison Rules

Structure

Fuzzy Match Rules

Numeric Tolerance Rules

Wildcard Support

Exemptions

Structure

exemptions.optional_in_generated (list of objects)

exemptions.optional_in_ground_truth (list of objects)

Conditional Exemptions

Connection Rules

Structure

connections.ignore_connection_types (list of strings)

connections.equivalent_types (list of lists)

Output Configuration

Structure

output.max_edits (integer, default: 15)

output.group_by (string, default: "priority")

output.include_explanations (boolean, default: true)

output.include_suggestions (boolean, default: true)

Examples

Example 1: Strict Production Configuration

Example 2: Lenient Development Configuration

Example 3: AI-Specific Configuration

Loading Configuration

From Preset

From File

Programmatically

Best Practices

Further Reading

`version` (string, required)

`name` (string, optional)

`description` (string, optional)

`costs.nodes.insertion` (float, default: 10.0)

`costs.nodes.deletion` (float, default: 10.0)

`costs.nodes.substitution.same_type` (float, default: 1.0)

`costs.nodes.substitution.similar_type` (float, default: 5.0)

`costs.nodes.substitution.different_type` (float, default: 15.0)

`costs.nodes.substitution.trigger_mismatch` (float, default: 50.0)

`costs.edges.insertion` (float, default: 5.0)

`costs.edges.deletion` (float, default: 5.0)

`costs.edges.substitution` (float, default: 3.0)

`costs.parameters.mismatch_weight` (float, default: 0.5)

`costs.parameters.nested_weight` (float, default: 0.3)

`ignore.node_types` (list of strings)

`ignore.nodes` (list of objects)

`ignore.global_parameters` (list of strings)

`ignore.node_type_parameters` (object)

`ignore.parameter_paths` (list of strings)

`exemptions.optional_in_generated` (list of objects)

`exemptions.optional_in_ground_truth` (list of objects)

`connections.ignore_connection_types` (list of strings)

`connections.equivalent_types` (list of lists)

`output.max_edits` (integer, default: 15)

`output.group_by` (string, default: "priority")

`output.include_explanations` (boolean, default: true)

`output.include_suggestions` (boolean, default: true)