ComfyUI Workflow JSON Format

skills/creative/comfyui/references/workflow-format.md

Two Formats — Only API Format Is Executable

API format is required for /api/prompt and every script in this skill. The web UI also produces an "editor format" used for visual editing, which cannot be submitted directly.

API Format

Top-level keys are string node IDs. Each node has class_type and inputs:

json
{
  "3": {
    "class_type": "KSampler",
    "inputs": {
      "seed": 156680208700286,
      "steps": 20,
      "cfg": 8,
      "sampler_name": "euler",
      "scheduler": "normal",
      "denoise": 1.0,
      "model": ["4", 0],
      "positive": ["6", 0],
      "negative": ["7", 0],
      "latent_image": ["5", 0]
    },
    "_meta": {"title": "KSampler"}
  },
  "4": {
    "class_type": "CheckpointLoaderSimple",
    "inputs": {"ckpt_name": "v1-5-pruned-emaonly.safetensors"}
  }
}

Detection: every top-level value has class_type. The skill's _common.is_api_format() does this check.

Editor Format (not directly executable)

Has nodes[] and links[] arrays — the visual graph. To convert: open in ComfyUI's web UI and use Workflow → Export (API) (newer UI) or the "Save (API Format)" button (older UI).

Detection: top-level has "nodes" and "links" keys.
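
The two detection rules can be combined into one classifier. The following is a minimal sketch, not the skill's actual _common.is_api_format() implementation; the function name detect_format and its return labels are hypothetical.

```python
def detect_format(data):
    """Classify parsed workflow JSON as 'api', 'editor', or 'unknown'."""
    if not isinstance(data, dict):
        return "unknown"
    # Editor format: the visual graph with nodes[] and links[] arrays.
    if "nodes" in data and "links" in data:
        return "editor"
    # API format: every top-level value is a node dict with class_type.
    if data and all(
        isinstance(node, dict) and "class_type" in node
        for node in data.values()
    ):
        return "api"
    return "unknown"


api_example = {
    "4": {
        "class_type": "CheckpointLoaderSimple",
        "inputs": {"ckpt_name": "v1-5-pruned-emaonly.safetensors"},
    }
}
editor_example = {"nodes": [], "links": []}

print(detect_format(api_example))
print(detect_format(editor_example))
```

Checking for the editor keys first keeps the more specific rule from being shadowed by the generic "all values are nodes" test.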

Within a node's inputs, each value is either a literal or a link to another node's output:

json
"inputs": {
  "text": "a cat",         // literal — modifiable
  "seed": 42,              // literal — modifiable
  "clip": ["4", 1]         // link — wiring; do NOT overwrite
}

Links are length-2 arrays of [upstream_node_id, output_slot]. The skill's parameter injector refuses to overwrite a link with a literal (logs a warning and skips).
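
The literal/link distinction and the "refuse to overwrite a link" rule can be sketched as follows. This is an illustration under stated assumptions, not the skill's injector: is_link and set_literal are hypothetical names, and the check assumes string node IDs and integer output slots as in the API-format example.

```python
import logging


def is_link(value):
    """A link is a length-2 list: [upstream_node_id, output_slot]."""
    return (
        isinstance(value, list)
        and len(value) == 2
        and isinstance(value[0], str)
        and isinstance(value[1], int)
    )


def set_literal(workflow, node_id, field, new_value):
    """Set a literal input. Returns False and leaves wiring intact for links."""
    inputs = workflow[node_id]["inputs"]
    if is_link(inputs.get(field)):
        logging.warning("skipping %s.%s: input is a link", node_id, field)
        return False
    inputs[field] = new_value
    return True


workflow = {
    "6": {
        "class_type": "CLIPTextEncode",
        "inputs": {"text": "a cat", "clip": ["4", 1]},
    }
}
changed_text = set_literal(workflow, "6", "text", "a dog")
changed_clip = set_literal(workflow, "6", "clip", "oops")
print(changed_text, changed_clip, workflow["6"]["inputs"])
```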

Common Node Types and Their Controllable Parameters

The full catalog lives in scripts/_common.py (PARAM_PATTERNS and MODEL_LOADERS). Highlights:

Text Prompts

| Node Class | Key Fields |
|---|---|
| CLIPTextEncode | text |
| CLIPTextEncodeSDXL | text_g, text_l, width, height |
| CLIPTextEncodeFlux | clip_l, t5xxl, guidance |

To distinguish positive from negative prompts, the skill traces KSampler.negative back through Reroute / Primitive nodes to the source CLIPTextEncode. If tracing does not resolve, it falls back to _meta.title heuristics ("negative", "neg", "anti").
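
The tracing step can be sketched as a bounded walk upstream along links. This is a simplified illustration, not the skill's code: trace_to_text_encoder, PASS_THROUGH_CLASSES, and the Reroute input name "value" are assumptions, and nodes that merge conditioning are not handled.

```python
# Assumption: node classes that simply forward their single linked input.
PASS_THROUGH_CLASSES = {"Reroute"}


def is_link(value):
    return isinstance(value, list) and len(value) == 2


def trace_to_text_encoder(workflow, link, max_hops=32):
    """Follow a link upstream until a CLIPTextEncode* node is reached."""
    for _ in range(max_hops):  # hop limit guards against cycles
        if not is_link(link):
            return None
        node_id = str(link[0])
        node = workflow.get(node_id)
        if node is None:
            return None
        class_type = node.get("class_type", "")
        if class_type.startswith("CLIPTextEncode"):
            return node_id
        if class_type not in PASS_THROUGH_CLASSES:
            return None
        # Continue through the first linked input of the pass-through node.
        link = next(
            (v for v in node.get("inputs", {}).values() if is_link(v)), None
        )
    return None


workflow = {
    "3": {"class_type": "KSampler",
          "inputs": {"positive": ["6", 0], "negative": ["10", 0]}},
    "6": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "a cat", "clip": ["4", 1]}},
    "7": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "blurry", "clip": ["4", 1]}},
    "10": {"class_type": "Reroute", "inputs": {"value": ["7", 0]}},
}
sampler_inputs = workflow["3"]["inputs"]
positive_id = trace_to_text_encoder(workflow, sampler_inputs["positive"])
negative_id = trace_to_text_encoder(workflow, sampler_inputs["negative"])
print(positive_id, negative_id)
```

When the walk returns None, a caller would fall back to the title heuristics described above.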

Sampling

| Node Class | Key Fields |
|---|---|
| KSampler | seed, steps, cfg, sampler_name, scheduler, denoise |
| KSamplerAdvanced | noise_seed, steps, cfg, start_at_step, end_at_step |
| SamplerCustom | noise_seed, cfg, sampler, sigmas |
| SamplerCustomAdvanced | noise_seed (via RandomNoise input) |
| RandomNoise | noise_seed |
| BasicScheduler | steps, scheduler, denoise |
| KSamplerSelect | sampler_name |
| BasicGuider / CFGGuider | cfg |
| ModelSamplingFlux | max_shift, base_shift, width, height |
| SDTurboScheduler | steps, denoise |

Latent / Dimensions

| Node Class | Key Fields |
|---|---|
| EmptyLatentImage | width, height, batch_size |
| EmptySD3LatentImage | width, height, batch_size |
| EmptyHunyuanLatentVideo | width, height, length, batch_size |
| EmptyMochiLatentVideo | width, height, length, batch_size |
| EmptyLTXVLatentVideo | width, height, length, batch_size |

Model Loading

| Node Class | Key Fields | Folder |
|---|---|---|
| CheckpointLoaderSimple | ckpt_name | checkpoints |
| LoraLoader | lora_name, strength_model, strength_clip | loras |
| LoraLoaderModelOnly | lora_name, strength_model | loras |
| VAELoader | vae_name | vae |
| ControlNetLoader | control_net_name | controlnet |
| CLIPLoader | clip_name | clip |
| DualCLIPLoader | clip_name1, clip_name2 | clip |
| TripleCLIPLoader | clip_name1, clip_name2, clip_name3 | clip |
| UNETLoader | unet_name | unet |
| DiffusionModelLoader | model_name | diffusion_models |
| UpscaleModelLoader | model_name | upscale_models |
| IPAdapterModelLoader | ipadapter_file | ipadapter |
| ADE_AnimateDiffLoaderWithContext | model_name, motion_scale | animatediff_models |
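
A loader table like this one can drive dependency discovery: walk the workflow, and for each known loader class collect the filename fields together with the model folder they resolve against. The sketch below uses a small subset of the table; LOADER_FIELDS and model_dependencies are hypothetical names, and the real MODEL_LOADERS structure in scripts/_common.py may be shaped differently.

```python
# Subset of the loader table: class -> (filename fields, model folder).
LOADER_FIELDS = {
    "CheckpointLoaderSimple": (("ckpt_name",), "checkpoints"),
    "LoraLoader": (("lora_name",), "loras"),
    "VAELoader": (("vae_name",), "vae"),
    "DualCLIPLoader": (("clip_name1", "clip_name2"), "clip"),
}


def model_dependencies(workflow):
    """Return (folder, filename) pairs for every known loader node."""
    deps = []
    for node in workflow.values():
        entry = LOADER_FIELDS.get(node.get("class_type"))
        if entry is None:
            continue
        fields, folder = entry
        for field in fields:
            value = node.get("inputs", {}).get(field)
            if isinstance(value, str):  # skip links and missing fields
                deps.append((folder, value))
    return deps


workflow = {
    "4": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "v1-5-pruned-emaonly.safetensors"}},
    "8": {"class_type": "LoraLoader",
          "inputs": {"lora_name": "style.safetensors",
                     "strength_model": 0.8, "strength_clip": 0.8,
                     "model": ["4", 0], "clip": ["4", 1]}},
    "3": {"class_type": "KSampler", "inputs": {"seed": 42}},
}
deps = model_dependencies(workflow)
print(deps)
```

The file name style.safetensors is made up for the example. Numeric fields such as strength_model are controllable parameters but not file dependencies, so they are left out of the mapping.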

Image Input/Output

| Node Class | Key Fields |
|---|---|
| LoadImage | image (server-side filename, after upload) |
| LoadImageMask | image, channel (red / green / blue / alpha) |
| VAEEncode / VAEDecode | (no controllable fields) |
| VAEEncodeForInpaint | grow_mask_by |
| SaveImage | filename_prefix |
| VHS_VideoCombine | frame_rate, format, filename_prefix, loop_count, pingpong |

ControlNet

| Node Class | Key Fields |
|---|---|
| ControlNetApply | strength |
| ControlNetApplyAdvanced | strength, start_percent, end_percent |

IPAdapter (community pack comfyui_ipadapter_plus)

| Node Class | Key Fields |
|---|---|
| IPAdapterAdvanced | weight, start_at, end_at |
| IPAdapter | weight |

Embeddings (referenced inside prompt strings)

ComfyUI scans prompt text for embedding:NAME syntax. The skill's _common.iter_embedding_refs() extracts these as model dependencies.

text
"a beautiful cat, embedding:goodvibes:1.2, embedding:art-style"

extract_schema.py and check_deps.py surface these in embedding_dependencies / missing_embeddings.
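
Extracting embedding references amounts to a pattern scan over every string input. The sketch below is an approximation and not the skill's _common.iter_embedding_refs(): the regular expression and the function names are assumptions. It treats the name as everything after embedding: up to whitespace, a comma, a colon, or a parenthesis, so a trailing weight such as :1.2 is dropped.

```python
import re

# Assumption: names end at whitespace, comma, colon, or parenthesis.
EMBEDDING_RE = re.compile(r"embedding:([^\s,:()]+)")


def find_embedding_refs(text):
    """Return embedding names referenced in one prompt string."""
    return EMBEDDING_RE.findall(text)


def workflow_embedding_refs(workflow):
    """Collect unique embedding names from all string inputs, sorted."""
    names = set()
    for node in workflow.values():
        for value in node.get("inputs", {}).values():
            if isinstance(value, str):
                names.update(find_embedding_refs(value))
    return sorted(names)


prompt = "a beautiful cat, embedding:goodvibes:1.2, embedding:art-style"
print(find_embedding_refs(prompt))  # ['goodvibes', 'art-style']
```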

Parameter Injection Pattern

python
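import copy
import json

# Load a workflow that was exported in API format.
with open("workflow_api.json") as f:
    workflow = json.load(f)

# Work on a copy so the loaded template stays reusable.
wf = copy.deepcopy(workflow)

# Node IDs follow the example graph above; they differ per workflow.
wf["6"]["inputs"]["text"] = "a beautiful sunset"  # positive prompt
wf["7"]["inputs"]["text"] = "ugly, blurry"        # negative prompt
wf["3"]["inputs"]["seed"] = 42                    # KSampler
wf["3"]["inputs"]["steps"] = 30
wf["5"]["inputs"]["width"] = 1024                 # latent image size
wf["5"]["inputs"]["height"] = 1024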

scripts/extract_schema.py automates discovering which node IDs/fields correspond to which user-facing parameters. It returns a parameters dict that run_workflow.py reads to inject values from --args.
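
To make the schema-driven flow concrete, here is a minimal sketch of applying user arguments through a parameters mapping. The shape used for parameters (name to node_id/field) is an assumption made for illustration; the dict actually returned by extract_schema.py may carry different keys, and apply_parameters is a hypothetical helper.

```python
import copy


def apply_parameters(workflow, parameters, args):
    """Return a copy of workflow with user args written to mapped fields."""
    wf = copy.deepcopy(workflow)
    for name, value in args.items():
        target = parameters.get(name)
        if target is None:
            raise KeyError(f"unknown parameter: {name}")
        wf[target["node_id"]]["inputs"][target["field"]] = value
    return wf


workflow = {
    "3": {"class_type": "KSampler", "inputs": {"seed": 1, "steps": 20}},
    "6": {"class_type": "CLIPTextEncode", "inputs": {"text": "a cat"}},
}
# Assumed schema shape: user-facing name -> where the value lives.
parameters = {
    "prompt": {"node_id": "6", "field": "text"},
    "seed": {"node_id": "3", "field": "seed"},
}
patched = apply_parameters(
    workflow, parameters, {"prompt": "a beautiful sunset", "seed": 42}
)
print(patched["6"]["inputs"]["text"], patched["3"]["inputs"]["seed"])
```

Returning a patched copy leaves the template untouched, so the same loaded workflow can be reused across runs.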

Identifying Controllable Parameters (Heuristics)

For unknown workflows:

  1. Prompt text: any CLIPTextEncode.text. Trace connections back from KSampler.positive / .negative to disambiguate (don't trust _meta.title alone).
  2. Seed: KSampler.seed / KSamplerAdvanced.noise_seed / RandomNoise.noise_seed.
  3. Dimensions: Empty*LatentImage.width / .height (must be multiples of 8).
  4. Steps / CFG: KSampler.steps, KSampler.cfg. Steps 20–50 typical. CFG 5–15 typical (Flux uses guidance, not CFG).
  5. Model / checkpoint: CheckpointLoaderSimple.ckpt_name. Filename must match an installed file exactly.
  6. LoRA: LoraLoader.lora_name, .strength_model.
  7. Images for img2img / inpaint: LoadImage.image. Server-side filename after upload.
  8. Denoise: KSampler.denoise. Range 0.0–1.0; 1.0 = ignore input image, 0.0 = pass through. Typical range for img2img: 0.4–0.7.
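
Two of these heuristics are easy to sketch in code: locating literal seed fields and snapping a requested dimension to a multiple of 8. The helper names and the choice to round down are illustrative assumptions, not the skill's behavior.

```python
# Seed field per sampler-related class, taken from the list above.
SEED_FIELDS = {
    "KSampler": "seed",
    "KSamplerAdvanced": "noise_seed",
    "RandomNoise": "noise_seed",
}


def find_seed_fields(workflow):
    """Return (node_id, field) for every seed stored as a literal int."""
    found = []
    for node_id, node in workflow.items():
        field = SEED_FIELDS.get(node.get("class_type"))
        if field and isinstance(node.get("inputs", {}).get(field), int):
            found.append((node_id, field))
    return found


def snap_down_to_multiple_of_8(value):
    """Round a dimension down to a multiple of 8, never below 8."""
    return max(8, (int(value) // 8) * 8)


workflow = {
    "3": {"class_type": "KSampler", "inputs": {"seed": 42}},
    # This seed is wired from another node, so it is not a literal.
    "11": {"class_type": "RandomNoise", "inputs": {"noise_seed": ["9", 0]}},
}
seeds = find_seed_fields(workflow)
print(seeds, snap_down_to_multiple_of_8(1030))
```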

Output Nodes

Output is produced by these node types. The skill's OUTPUT_NODES set extends to common community packs.

| Node | Output Key | Content |
|---|---|---|
| SaveImage | images | List of {filename, subfolder, type} |
| PreviewImage | images | Temporary preview (not saved) |
| VHS_VideoCombine | gifs (older) or videos/video (newer cloud) | Video file refs |
| SaveAudio | audio | Audio file refs |
| SaveAnimatedWEBP / SaveAnimatedPNG | images | Animated images |
| Save3D | 3d | 3D asset refs |

After execution, fetch the result from /history/{prompt_id} (local) or /api/jobs/{prompt_id} (cloud), then read outputs[node_id][key] from the response.
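
Reading outputs from a local server can be sketched as below. Assumptions to note: the /history/{prompt_id} response is taken to be an object keyed by prompt_id, file refs are taken to be dicts with filename, subfolder, and type, and download URLs are built against the local /view endpoint. The function names are hypothetical, and the cloud /api/jobs response shape is not covered.

```python
import json
import urllib.parse
import urllib.request


def fetch_history_entry(base_url, prompt_id):
    """GET /history/{prompt_id} and return the entry for that prompt."""
    with urllib.request.urlopen(f"{base_url}/history/{prompt_id}") as resp:
        history = json.load(resp)
    return history[prompt_id]  # assumption: response is keyed by prompt_id


def collect_output_files(history_entry):
    """Flatten outputs[node_id][key] into (node_id, key, file_ref) tuples."""
    results = []
    for node_id, node_outputs in history_entry.get("outputs", {}).items():
        for key, items in node_outputs.items():
            if not isinstance(items, list):
                continue
            for item in items:
                if isinstance(item, dict) and "filename" in item:
                    results.append((node_id, key, item))
    return results


def view_url(base_url, file_ref):
    """Build a download URL for one file ref via the /view endpoint."""
    query = urllib.parse.urlencode({
        "filename": file_ref["filename"],
        "subfolder": file_ref.get("subfolder", ""),
        "type": file_ref.get("type", "output"),
    })
    return f"{base_url}/view?{query}"


# Offline example entry; filename and node ID are made up.
entry = {"outputs": {"9": {"images": [
    {"filename": "ComfyUI_00001_.png", "subfolder": "", "type": "output"}
]}}}
files = collect_output_files(entry)
url = view_url("http://127.0.0.1:8188", files[0][2])
print(files[0][:2], url)
```

Iterating over every key, not only images, lets the same loop pick up gifs, videos, audio, or 3d refs from the table above.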

Wrapper Variants

Some saved JSON files wrap the workflow under a "prompt" key (matching the /api/prompt payload shape). The skill's _common.unwrap_workflow() handles this — pass any of:

  • raw API format: {"3": {...}, "4": {...}}
  • wrapped: {"prompt": {"3": {...}}, "client_id": "..."}

It rejects editor format with a clear error and a re-export instruction.
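
The unwrapping behavior described here can be approximated as follows. This is a sketch, not the body of _common.unwrap_workflow(); the function name unwrap and the error wording are assumptions.

```python
def unwrap(data):
    """Return the API-format node map from raw or wrapped workflow JSON."""
    if not isinstance(data, dict):
        raise ValueError("workflow JSON must be an object")
    # Editor format cannot be executed; ask for a re-export instead.
    if "nodes" in data and "links" in data:
        raise ValueError(
            "editor-format workflow: re-export from the ComfyUI web UI "
            "in API format"
        )
    # Wrapped variant: the /api/prompt payload shape.
    if isinstance(data.get("prompt"), dict):
        return data["prompt"]
    return data


raw = {"3": {"class_type": "KSampler", "inputs": {}}}
wrapped = {"prompt": raw, "client_id": "example-client"}

print(unwrap(raw) == unwrap(wrapped))
```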