n8n Workflow Comparison

Graph-based workflow similarity comparison using NetworkX and graph edit distance.

Features

Graph Edit Distance: Uses NetworkX's graph edit distance algorithm for accurate structural comparison
Configurable Cost Functions: Customize costs for different types of edits (node/edge insertion, deletion, substitution)
Special Case Handling: Higher penalties for trigger mismatches, similar node types grouped together
Parameter Comparison: Deep comparison of node parameters with configurable ignore rules
External Configuration: YAML/JSON config files for easy customization without code changes (see CONFIGURATION.md)
Built-in Presets: Strict, standard, and lenient comparison modes
Detailed Output: Returns similarity score and top edit operations needed

Installation

This module uses uv for dependency management. No separate installation step is needed; uvx manages dependencies automatically.

Prerequisites

Install uv:

bash

# macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh

# Windows
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

Install just:

bash

# on macOS via homebrew
brew install just
# or global install via npm
npm install -g rust-just
# or cross platform via curl to DEST
curl --proto '=https' --tlsv1.2 -sSf https://just.systems/install.sh | bash -s -- --to DEST

Usage

CLI Usage

bash

# Using default (standard) configuration
uvx --from . python -m src.compare_workflows generated.json ground_truth.json

# Using a preset
uvx --from . python -m src.compare_workflows generated.json ground_truth.json --preset strict

# Using custom configuration
uvx --from . python -m src.compare_workflows generated.json ground_truth.json --config my-config.yaml

# Output as human-readable summary
uvx --from . python -m src.compare_workflows generated.json ground_truth.json --output-format summary

Python API Usage

python

from config_loader import load_config
from graph_builder import build_workflow_graph
from similarity import calculate_graph_edit_distance
import json

# Load workflows
with open('generated.json') as f:
    generated = json.load(f)
with open('ground_truth.json') as f:
    ground_truth = json.load(f)

# Load configuration
config = load_config('preset:standard')

# Build graphs
g1 = build_workflow_graph(generated, config)
g2 = build_workflow_graph(ground_truth, config)

# Calculate similarity
result = calculate_graph_edit_distance(g1, g2, config)

print(f"Similarity: {result['similarity_score']:.2%}")
print(f"Edit cost: {result['edit_cost']:.1f}")
print(f"Top edits: {len(result['top_edits'])}")

Configuration

📖 For detailed configuration documentation, see CONFIGURATION.md

Built-in Presets

strict: High penalties, exact matching required
standard: Balanced comparison (default)
lenient: Low penalties, focus on structure over details

Quick Start

Create a YAML or JSON file with your custom rules:

yaml

version: "1.0"
name: "my-custom-config"
description: "Custom configuration for my use case"

costs:
  nodes:
    insertion: 10.0
    deletion: 10.0
    substitution:
      same_type: 1.0
      similar_type: 5.0
      different_type: 15.0
      trigger_mismatch: 50.0

  edges:
    insertion: 5.0
    deletion: 5.0
    substitution: 3.0

similarity_groups:
  triggers:
    - "n8n-nodes-base.webhook"
    - "n8n-nodes-base.manualTrigger"

ignore:
  node_types:
    - "n8n-nodes-base.stickyNote"
  global_parameters:
    - "position"
    - "id"

parameter_comparison:
  numeric_tolerance:
    - parameter: "options.temperature"
      tolerance: 0.1
      cost_if_exceeded: 2.0
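
To make the numeric_tolerance rule above concrete, here is a minimal sketch of how such a rule might be evaluated. The helpers get_by_path and numeric_param_cost are hypothetical names, not part of this module, and the handling of missing or non-numeric values is an assumption; the actual implementation may differ.

```python
from typing import Any


def get_by_path(params: dict, path: str) -> Any:
    """Resolve a dotted path such as 'options.temperature'; return None if absent."""
    current: Any = params
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def numeric_param_cost(params_a: dict, params_b: dict, rule: dict) -> float:
    """Return 0.0 when both values are within tolerance, else cost_if_exceeded.

    Assumption: if the parameter is missing on both sides there is nothing to
    compare (cost 0.0); if only one side has a numeric value, the rule counts
    as exceeded.
    """
    a = get_by_path(params_a, rule["parameter"])
    b = get_by_path(params_b, rule["parameter"])
    if a is None and b is None:
        return 0.0
    if not all(isinstance(v, (int, float)) for v in (a, b)):
        return rule["cost_if_exceeded"]
    return 0.0 if abs(a - b) <= rule["tolerance"] else rule["cost_if_exceeded"]


# Same shape as one entry under parameter_comparison.numeric_tolerance.
rule = {"parameter": "options.temperature", "tolerance": 0.1, "cost_if_exceeded": 2.0}

close = numeric_param_cost(
    {"options": {"temperature": 0.7}}, {"options": {"temperature": 0.75}}, rule
)
far = numeric_param_cost(
    {"options": {"temperature": 0.2}}, {"options": {"temperature": 0.9}}, rule
)
print(close, far)
```

In this sketch a difference of 0.05 stays within the 0.1 tolerance and adds no cost, while a difference of 0.7 adds the configured 2.0.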

CONFIGURATION.md contains the comprehensive documentation, including:

Complete field reference
Cost configuration strategies
Advanced ignore rules and wildcards
Parameter comparison rules
Exemptions and conditional logic
Real-world examples

Output Format

JSON Output

json

{
  "similarity_score": 0.78,
  "similarity_percentage": "78.0%",
  "edit_cost": 45.0,
  "max_possible_cost": 205.0,
  "top_edits": [
    {
      "type": "node_substitute",
      "description": "Replace 'Manual Trigger' with 'Webhook Trigger'",
      "cost": 25.0,
      "priority": "critical"
    }
  ],
  "metadata": {
    "generated_nodes": 5,
    "ground_truth_nodes": 6
  }
}

Summary Output

============================================================
WORKFLOW COMPARISON SUMMARY
============================================================

Overall Similarity: 78.0%
Edit Cost:          45.0 / 205.0

Configuration: standard
  Standard balanced comparison configuration

Top 3 Required Edits:
------------------------------------------------------------
1. 🔴 [CRITICAL] Cost: 25.0
   Replace 'Manual Trigger' with 'Webhook Trigger'

2. 🟠 [MAJOR] Cost: 10.0
   Add missing 'HTTP Request' tool node

3. 🟡 [MINOR] Cost: 5.0
   Remove connection from 'Agent' to 'Extra Node'

============================================================
✅ PASS - Workflows are sufficiently similar
============================================================

Testing

Run the test suite:

bash

# Install dev dependencies
uv sync --dev

# Run tests
uv run pytest

# Run with coverage
uv run pytest --cov

Algorithm Details

Graph Representation

Each workflow node becomes a graph node with attributes (type, parameters, etc.)
Each node and edge gets a generated ID based on its position in the workflow
Each workflow connection becomes a directed edge with connection type
Nodes and edges are filtered based on configuration rules
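
The mapping described above can be sketched roughly as follows. This is an illustration only: build_graph_sketch and IGNORED_TYPES are hypothetical names, not the module's build_workflow_graph, and the sketch assumes the usual n8n export layout in which connections are keyed by source node name, then by connection type, then by output branch.

```python
import networkx as nx

# Hypothetical ignore list, mirroring the ignore.node_types example.
IGNORED_TYPES = {"n8n-nodes-base.stickyNote"}


def build_graph_sketch(workflow: dict) -> nx.DiGraph:
    """Turn an n8n workflow dict into a directed graph (illustrative only)."""
    graph = nx.DiGraph()
    name_to_id = {}
    for index, node in enumerate(workflow.get("nodes", [])):
        if node["type"] in IGNORED_TYPES:
            continue  # filtered out by configuration
        node_id = f"node_{index}"  # ID derived from position in the workflow
        name_to_id[node["name"]] = node_id
        graph.add_node(
            node_id,
            name=node["name"],
            type=node["type"],
            parameters=node.get("parameters", {}),
        )
    for source, outputs in workflow.get("connections", {}).items():
        for conn_type, branches in outputs.items():
            for branch in branches:
                for target in branch or []:  # a branch may be empty or null
                    src = name_to_id.get(source)
                    dst = name_to_id.get(target["node"])
                    if src is not None and dst is not None:
                        graph.add_edge(src, dst, connection_type=conn_type)
    return graph


workflow = {
    "nodes": [
        {"name": "Start", "type": "n8n-nodes-base.manualTrigger", "parameters": {}},
        {"name": "Note", "type": "n8n-nodes-base.stickyNote", "parameters": {}},
        {"name": "Fetch", "type": "n8n-nodes-base.httpRequest", "parameters": {}},
    ],
    "connections": {
        "Start": {"main": [[{"node": "Fetch", "type": "main", "index": 0}]]}
    },
}
graph = build_graph_sketch(workflow)
print(graph.number_of_nodes(), graph.number_of_edges())
```

Note that a plain DiGraph keeps at most one edge per ordered node pair, so this sketch would collapse parallel connections between the same two nodes.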

Graph Edit Distance

Uses NetworkX's optimize_graph_edit_distance, which yields successively better approximations of the edit distance, with custom cost functions:

Node operations: insertion, deletion, substitution
Edge operations: insertion, deletion, substitution
Cost functions consider node types, parameters, and configuration rules
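
As a rough illustration of how custom cost functions plug into NetworkX, the sketch below compares two toy graphs. The cost values are taken from the Quick Start example; the functions node_subst_cost and edge_subst_cost, the SIMILAR_TYPES set, and the way same_type is applied are assumptions for illustration, and the trigger_mismatch penalty is left out for brevity.

```python
import networkx as nx

# Hypothetical similarity group, mirroring similarity_groups.triggers.
SIMILAR_TYPES = {"n8n-nodes-base.webhook", "n8n-nodes-base.manualTrigger"}


def node_subst_cost(a: dict, b: dict) -> float:
    """Cost of turning node a into node b, given their attribute dicts."""
    if a["type"] == b["type"]:
        # Assumption: identical nodes are free; same type with different
        # parameters costs same_type (1.0).
        return 0.0 if a.get("parameters") == b.get("parameters") else 1.0
    if a["type"] in SIMILAR_TYPES and b["type"] in SIMILAR_TYPES:
        return 5.0  # similar_type
    return 15.0  # different_type


def edge_subst_cost(a: dict, b: dict) -> float:
    """Edges with the same connection type match for free."""
    return 0.0 if a.get("connection_type") == b.get("connection_type") else 3.0


g1 = nx.DiGraph()
g1.add_node("n0", type="n8n-nodes-base.manualTrigger", parameters={})
g1.add_node("n1", type="n8n-nodes-base.httpRequest", parameters={})
g1.add_edge("n0", "n1", connection_type="main")

g2 = nx.DiGraph()
g2.add_node("n0", type="n8n-nodes-base.webhook", parameters={})
g2.add_node("n1", type="n8n-nodes-base.httpRequest", parameters={})
g2.add_node("n2", type="n8n-nodes-base.set", parameters={})
g2.add_edge("n0", "n1", connection_type="main")
g2.add_edge("n1", "n2", connection_type="main")

best_cost = None
for cost in nx.optimize_graph_edit_distance(
    g1,
    g2,
    node_subst_cost=node_subst_cost,
    node_del_cost=lambda attrs: 10.0,
    node_ins_cost=lambda attrs: 10.0,
    edge_subst_cost=edge_subst_cost,
    edge_del_cost=lambda attrs: 5.0,
    edge_ins_cost=lambda attrs: 5.0,
):
    best_cost = cost  # later values are never worse than earlier ones

print(best_cost)
```

For these two graphs the cheapest edit path substitutes the trigger (5.0), inserts the extra node (10.0), and inserts its incoming edge (5.0), so the loop ends at 20.0.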

Similarity Score

similarity = 1 - (edit_cost / max_possible_cost)

Where max_possible_cost is the cost of deleting all nodes/edges from g1 and inserting all from g2.
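
The formula can be worked through in a few lines. This is a sketch under stated assumptions: the function names are hypothetical, uniform per-node and per-edge costs are taken from the Quick Start example, and clamping the score to the range 0 to 1 is an assumption rather than documented behavior.

```python
def max_possible_cost(
    g1_nodes: int,
    g1_edges: int,
    g2_nodes: int,
    g2_edges: int,
    node_del: float = 10.0,
    node_ins: float = 10.0,
    edge_del: float = 5.0,
    edge_ins: float = 5.0,
) -> float:
    """Cost of deleting everything in g1 and inserting everything in g2."""
    return (
        g1_nodes * node_del
        + g1_edges * edge_del
        + g2_nodes * node_ins
        + g2_edges * edge_ins
    )


def similarity(edit_cost: float, max_cost: float) -> float:
    """similarity = 1 - (edit_cost / max_possible_cost), clamped to [0, 1]."""
    if max_cost == 0:
        return 1.0  # assumption: two empty graphs are treated as identical
    return min(1.0, max(0.0, 1.0 - edit_cost / max_cost))


# Figures from the JSON output example: 45.0 out of 205.0.
score = similarity(45.0, 205.0)
print(f"{score:.1%}")

# Toy graphs: 2 nodes / 1 edge versus 3 nodes / 2 edges.
toy_max = max_possible_cost(2, 1, 3, 2)
print(toy_max)
```

With the example figures, 1 - 45.0 / 205.0 is about 0.78, which matches the 78.0% shown in the output samples.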

Troubleshooting

Timeout errors

For very large or complex workflows, the comparison may time out. Consider:

Using a lenient preset to reduce computation
Simplifying the workflow structure
Increasing the timeout in the TypeScript wrapper
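
Because optimize_graph_edit_distance yields progressively better approximations, one possible way to bound runtime is to stop consuming them once a time budget is spent and keep the best value seen so far. The helper below is a hypothetical sketch, not something this module is known to provide; it works on any iterable of costs, and it only checks the clock between yielded values, so a single slow step can still overrun the budget.

```python
import time
from typing import Iterable, Optional


def best_before_deadline(
    approximations: Iterable[float], budget_seconds: float
) -> Optional[float]:
    """Return the last (best) cost yielded before the time budget runs out."""
    deadline = time.monotonic() + budget_seconds
    best: Optional[float] = None
    for cost in approximations:
        best = cost  # assumes each value is at least as good as the previous
        if time.monotonic() >= deadline:
            break  # keep what we have instead of failing with a timeout
    return best


# Stand-in for a generator of successively better edit costs.
result = best_before_deadline(iter([30.0, 25.0, 20.0]), budget_seconds=5.0)
print(result)
```

A result obtained this way is an upper bound on the true edit distance, so the similarity score derived from it can only be lower than or equal to the exact one.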

Configuration errors

Ensure YAML/JSON syntax is valid
Check that node types and parameter paths are correct
Use the --verbose flag to see detailed configuration info