optional-skills/creative/kanban-video-orchestrator/references/examples.md
Six concrete pipelines covering different video styles. Each shows the team composition, task graph, and skill/tool choices the orchestrator would make for that brief. These are illustrative, not templates — adapt to the actual brief.
Brief: A 90-second noir-style short. A detective walks through a rainy city. Voiceover narration. AI-generated visuals.
Team:
director — vision, decomposition, approvalwriter — script + voiceover copy (loads humanizer for natural voice)storyboarder — beat-by-beat shot list (loads excalidraw)image-generator — generates each shot's still via local ComfyUI workflows
(loads comfyui)image-to-video-generator — animates each still (Runway/Kling, OR
ComfyUI's AnimateDiff/WAN workflows via comfyui)voice-talent — narration via ElevenLabsaudio-mixer — VO + ambient padeditor — assembly + transitionsreviewer — final QATask graph:
T0 director decompose
T1 writer script + voiceover.md (parent: T0)
T2 storyboarder shot list with framing per beat (parent: T1)
T3 image-generator one still per shot (~12 shots) (parent: T2)
T4 image-to-video animate each still (parent: T3)
T5 voice-talent generate narration audio (parent: T1)
T6 audio-mixer mix VO + ambient (parent: T5)
T7 editor cut + transitions + audio mux (parents: T4, T6)
T8 reviewer final QA (parent: T7)
Key choices:
comfyui skill is preferred over external API for
cost/control — but external APIs are fine if ComfyUI isn't installededitor profile is ffmpeg-only, no Hermes skill required beyond
kanban-workerstoryboard.excalidraw alongside the markdownBrief: A 30-second product teaser for a developer tool. Shows code + terminal + UI screen recordings, voiceover, CTA at end. Square 1:1.
Team:
directorcopywriter — taglines, voiceover script, CTA (loads humanizer)concept-artist — style frames (loads claude-design for UI mockups)renderer-motion-graphics — animated UI sequences (Remotion CLI)renderer-ascii — terminal-style demo scenes (loads ascii-video)voice-talent — VO via ElevenLabseditor — assembly + brand-color treatmentaudio-mixer — VO + light music bedcaptioner — burned subtitles for muted-autoplay platformsmasterer — produces 1:1 + 9:16 + 16:9 variantsTask graph:
T0 director decompose
T1 copywriter copy.md + cta + vo script (parent: T0)
T2 concept-artist visual-spec.md + style frames (parent: T1)
T3a renderer-motion-graphics scene 1: UI sequence (parent: T2)
T3b renderer-ascii scene 2: terminal demo (parent: T2)
T3c renderer-motion-graphics scene 3: feature highlight (parent: T2)
T3d renderer-motion-graphics scene 4: CTA card (parent: T2)
T4 voice-talent narration (parent: T1)
T5 audio-mixer VO + music bed (parent: T4)
T6 editor cut + transitions (parents: T3*, T5)
T7 captioner SRT + burned subtitles (parent: T6)
T8 masterer 1:1, 9:16, 16:9 variants (parent: T7)
Key choices:
claude-design skill for UI mockups maps directly to the product video idiomBrief: A 3-minute music video for a provided lo-fi hip-hop track. Visuals should pulse with the beat. Generative + ASCII hybrid. Vertical 9:16.
Team:
directormusic-supervisor — analyze track, emit audio/beats.json (loads songsee)storyboarder — beat-aligned shot list (loads excalidraw)renderer-ascii — ASCII scenes synced to bass kicks (loads ascii-video)renderer-p5js — generative particle scenes synced to highs (loads p5js)editor — beat-cut assembly using beats.jsonreviewer — sync QATask graph:
T0 director decompose
T1 music-supervisor analyze track → beats.json + spectrogram (parent: T0)
T2 storyboarder shot list aligned to beats (parents: T1, T0)
T3a renderer-ascii scene 1: bass-driven ASCII (parent: T2)
T3b renderer-p5js scene 2: high-end particle field (parent: T2)
... (more scenes)
T4 editor cut to beats + mux track (parents: T3*, T1)
T5 reviewer sync QA + final approval (parent: T4)
Key choices:
music-supervisor runs FIRST — beats.json gates the rendererseditor uses beats.json directly to align cuts to bass kicksascii-video + p5js) for visual varietyBrief: A 2-minute explainer of an algorithm. 3Blue1Brown-style. Animated diagrams, equations, narration. Square 1:1.
Team:
directorwriter — narration script (loads humanizer)cinematographer — visual spec (loads manim-video)renderer-manim — all animated scenes (loads manim-video)voice-talent — narration via ElevenLabseditor — assembly + audio muxcaptioner — burned subtitlesTask graph:
T0 director decompose
T1 writer script + narration (parent: T0)
T2 cinematographer visual spec for all scenes (parent: T1)
T3a-Tn renderer-manim scenes 1..N (parents: T2)
T4 voice-talent narration audio (parent: T1)
T5 editor cut + mux (parents: T3*, T4)
T6 captioner SRT + burn (parent: T5)
Key choices:
manim-video skill drives both the cinematographer (visual language) and
the renderer (actual scene production)manim-video skill's reference docs (animation-design-thinking,
scene-planning, equations) auto-load when needed via the renderer's pinned skillBrief: A 60-second pure-ASCII video reactive to an existing track. No voiceover, no other tools. Square 1:1.
Team:
directormusic-supervisor — track analysis (loads songsee)renderer-ascii — all visuals (loads ascii-video)editor — assembly + audio muxTask graph:
T0 director decompose
T1 music-supervisor analyze track (parent: T0)
T2a renderer-ascii scene 1 (parents: T1, T0)
T2b renderer-ascii scene 2 (parents: T1, T0)
T2c renderer-ascii scene 3 (parents: T1, T0)
T3 editor stitch + mux audio (parents: T2*)
Key choices:
renderer-ascii profile because the ascii-video
skill covers everythingThis example illustrates the rule: don't over-decompose. Three scenes through one renderer is fine. Don't spawn three renderer profiles.
Brief: A 2-minute audio-reactive visual for a gallery installation. Driven by an audio input feed. TouchDesigner-based. 16:9 4K.
Team:
directorcinematographer — visual language spec (loads touchdesigner-mcp)renderer-touchdesigner — all visuals + record-to-disk
(loads touchdesigner-mcp)audio-mixer — final loudness pass on the captured audio (optional if
pre-mixed source)editor — assemble final clip from TouchDesigner recordingreviewer — visual QATask graph:
T0 director decompose
T1 cinematographer TD operator graph spec (parent: T0)
T2 renderer-touchdesigner build TD network + record output (parent: T1)
T3 editor trim + audio mux (parent: T2)
T4 reviewer final QA (parent: T3)
Key choices:
touchdesigner-mcp controls a running TouchDesigner instance — the
cinematographer designs the operator graph, renderer builds itWhen the user describes a video, look for these signals to map to an example:
renderer-comic (baoyu-comic skill)renderer-pixel (pixel-art skill)renderer-3d (blender-mcp)renderer-p5js (p5js)renderer-comfyui
(comfyui) for both stills and image-to-videoThe actual team should be derived from the specific brief — these examples are starting points, not endpoints.