optional-skills/creative/baoyu-article-illustrator/references/workflow.md
If the user provides reference images (local path or URL), the goal is to produce textual descriptions that can be embedded in prompts — image_generate doesn't accept reference-image inputs, and Hermes' text file tools can't read or write binaries.
Tool rules:
| Task | Tool | Notes |
|---|---|---|
| Analyze a reference image | vision_analyze | Accepts URL or local path. Ask for style, palette, composition, subject. |
| Write the text description | write_file | Sidecar .md files only — never try to write_file a PNG/JPG. |
| (Optional) Keep a local copy of the binary | terminal | cp "$src" "{output-dir}/references/NN-ref-{slug}.{ext}" — purely for the record; the skill itself doesn't read the binary. |
| Input Type | Action |
|---|---|
| Image file path provided | vision_analyze → write sidecar .md. Optional terminal cp for a local record. |
| Image URL provided | vision_analyze with the URL → write sidecar .md. |
| Image in conversation (no path, no URL) | Ask via clarify for a path or URL, or for a verbal description. |
| User can't provide either | Extract style/palette verbally from the user → write references/extracted-style.md. Do NOT add references: to prompt frontmatter. |
Procedure (when a path/URL is available):
1. Call vision_analyze(image_url=..., question="Describe the style, color palette (with hex approximations), composition, and subject so this can be used as a style/palette reference for another illustration.").
2. Save the description to {output-dir}/references/NN-ref-{slug}.md via write_file.
3. (Optional) Use terminal with cp (or curl -sSL -o ... for URLs) to keep a local binary copy. Not required by the skill.
4. Record the usage type: direct / style / palette. In Step 5.1 the description gets appended to the prompt body.

Sidecar File Format:
---
ref_id: NN
source: "<original path or URL>"
local_copy: "NN-ref-{slug}.png" # omit if no copy made
usage_hint: style # direct | style | palette
---
[vision_analyze description — colors, style, composition, subject]
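As a minimal sketch of the sidecar format above, the following Python helper assembles the frontmatter and description into the text that would be passed to write_file. The function name `render_sidecar` and its parameters are hypothetical, and it assumes the source string contains no double quotes.

```python
def render_sidecar(ref_id, source, usage_hint, description, local_copy=None):
    """Build sidecar .md text: frontmatter first, then the description."""
    if usage_hint not in ("direct", "style", "palette"):
        raise ValueError(f"unknown usage_hint: {usage_hint!r}")
    lines = ["---", f"ref_id: {ref_id:02d}", f'source: "{source}"']
    if local_copy:
        # Left out entirely when no binary copy was made.
        lines.append(f'local_copy: "{local_copy}"')
    lines += [f"usage_hint: {usage_hint}", "---", description.strip()]
    return "\n".join(lines) + "\n"
```

The returned string is what the agent would hand to write_file; the helper itself never touches the binary.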
| Input | Output Directory | Source-save path |
|---|---|---|
| Article file path | {article-dir}/imgs/ (default) | — (read article via read_file) |
| Pasted content | illustrations/{topic-slug}/ (cwd) | source-{slug}.{ext} (save via write_file) |
If the user explicitly asked for a different layout (e.g., images in the article's folder, or an illustrations/ subdirectory), honor that.
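The default rows of the table can be sketched as a small path helper. This is an illustration only: `resolve_output_dir` and its parameters are hypothetical names, and it covers the defaults, not a layout the user explicitly requested.

```python
from pathlib import Path


def resolve_output_dir(article_path=None, topic_slug=None, cwd="."):
    """Return the default output directory for the two input kinds."""
    if article_path is not None:
        # Article file: images go in an imgs/ folder next to the article.
        return Path(article_path).parent / "imgs"
    if topic_slug is None:
        raise ValueError("pasted content needs a topic slug")
    # Pasted content: illustrations/{topic-slug}/ under the working directory.
    return Path(cwd) / "illustrations" / topic_slug
```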
| Analysis | Description |
|---|---|
| Content type | Technical / Tutorial / Methodology / Narrative |
| Illustration purpose | information / visualization / imagination |
| Core arguments | 2-5 main points to visualize |
| Visual opportunities | Positions where illustrations add value |
| Recommended type | Based on content signals and purpose |
| Recommended density | Based on length and complexity |
Save analysis to {output-dir}/analysis.md using write_file.
CRITICAL: If the article uses metaphors (e.g., "cutting a watermelon with a chainsaw"), do NOT illustrate literally. Visualize the underlying concept.
Illustrate:
Do NOT Illustrate:
For each reference image (use the vision_analyze description from Step 1):
| Analysis | Description |
|---|---|
| Visual characteristics | Style, colors, composition |
| Content/subject | What the reference depicts |
| Suitable positions | Which sections match this reference |
| Style match | Which illustration types/styles align |
| Usage recommendation | direct / style / palette |
| Usage | When to Use | How it's applied in Step 5.1 |
|---|---|---|
| direct | Reference matches desired output closely | Paste the description (composition + subject + style + palette) into the prompt body |
| style | Extract visual style characteristics only | Append style traits to prompt body |
| palette | Extract color scheme only | Append extracted hex colors to prompt body |
Note: image_generate does not accept reference-image inputs under any usage type. Everything is mediated through the vision_analyze description.
Use the clarify tool. Since clarify handles one question at a time, ask the most important question first. Skip any question the user already answered in their request.
Based on Step 2 content analysis, recommend a preset first (sets both type & style). Look up style-presets.md "Content Type → Preset Recommendations" table.
If user picks a preset → skip Q3 (type & style both resolved). If user picks a type → Q3 is required.
Present Core Styles first:
Core Styles (simplified selection):
| Core Style | Maps To | Best For |
|---|---|---|
| minimal-flat | notion | General, knowledge sharing, SaaS |
| sci-fi | blueprint | AI, frontier tech, system design |
| hand-drawn | sketch/warm | Relaxed, reflective, casual |
| editorial | editorial | Processes, data, journalism |
| scene | warm/watercolor | Narratives, emotional, lifestyle |
| poster | screen-print | Opinion, editorial, cultural, cinematic |
Style selection is based on the Type × Style compatibility matrix (styles.md).
In Step 5, read styles/<style>.md for visual elements and rendering rules.
If the preset did not specify a palette, offer:
- macaron: soft pastel blocks on warm cream
- warm: warm earth tones, no cool colors
- neon: vibrant neon on dark backgrounds

Skip if: preset already resolved palette, or user specified a palette in the request.
See Palette Gallery in styles.md and full specs in palettes/<palette>.md.
If the article language is different from the user's conversational language, ask which to use:
Skip if: languages match, or the user already specified in the request.
When presenting the outline preview to the user, show reference assignments:
Reference Images:
| Ref | Filename | Recommended Usage |
|-----|----------|-------------------|
| 01 | 01-ref-diagram.png | direct → Illustration 1, 3 |
| 02 | 02-ref-chart.png | palette → Illustration 2 |
Save as {output-dir}/outline.md using write_file:
---
type: infographic
density: balanced
style: blueprint
image_count: 4
references: # Only if references provided
  - ref_id: 01
    filename: 01-ref-diagram.png
    description: "Technical diagram showing system architecture"
  - ref_id: 02
    filename: 02-ref-chart.png
    description: "Color chart with brand palette"
---
## Illustration 1
**Position**: [section] / [paragraph]
**Purpose**: [why this helps]
**Visual Content**: [what to show]
**Type Application**: [how type applies]
**References**: [01] # Optional: list ref_ids used
**Reference Usage**: direct # direct | style | palette
**Filename**: 01-infographic-concept-name.png
## Illustration 2
...
Backup rule: If outline.md exists, rename to outline-backup-YYYYMMDD-HHMMSS.md before writing.
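The backup naming can be sketched in Python as follows. In the workflow itself the rename would be done through the agent's tools; `backup_path` and `backup_if_exists` are hypothetical helper names used only to show the timestamp pattern.

```python
from datetime import datetime
from pathlib import Path


def backup_path(path, now=None):
    """Return the sibling name <stem>-backup-YYYYMMDD-HHMMSS<suffix>."""
    path = Path(path)
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    return path.with_name(f"{path.stem}-backup-{stamp}{path.suffix}")


def backup_if_exists(path, now=None):
    """Rename an existing file out of the way; return the new path or None."""
    path = Path(path)
    if not path.exists():
        return None
    target = backup_path(path, now)
    path.rename(target)
    return target
```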
Requirements:
BLOCKING: Every illustration must have a saved prompt file before any image is generated.
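One way to enforce this gate is to check for the expected prompt files before any generation call. The sketch below assumes the prompts/NN-{type}-{slug}.md naming used in this step; the function name and the tuple form of the outline entries are hypothetical.

```python
from pathlib import Path


def missing_prompt_files(output_dir, illustrations):
    """Return expected prompt paths that are absent or empty.

    `illustrations` is a list of (number, type, slug) tuples, a hypothetical
    in-memory form of the outline entries.
    """
    prompts_dir = Path(output_dir) / "prompts"
    missing = []
    for number, kind, slug in illustrations:
        prompt = prompts_dir / f"{number:02d}-{kind}-{slug}.md"
        if not prompt.is_file() or prompt.stat().st_size == 0:
            missing.append(prompt)
    return missing
```

Generation would proceed only when the returned list is empty.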
For each illustration in the outline:
- Create the prompt file {output-dir}/prompts/NN-{type}-{slug}.md via write_file, opening with frontmatter:
---
illustration_id: 01
type: infographic
style: custom-flat-vector
---
- Read styles/<style>.md (via read_file) for visual elements, style rules, and rendering instructions.
- Read palettes/<palette>.md for colors and background. Palette colors replace the style's default Color Palette. If no palette is specified, use the style's built-in colors.
- Write the prompt body with these sections:
  - Layout: Describe overall composition (grid / radial / hierarchical / left-right / top-down)
  - ZONES: Describe each visual area with specific content, not vague descriptions
  - LABELS: Use actual numbers, terms, metrics, quotes from the article, NOT generic placeholders
  - COLORS: Specify hex codes from palette (or style default) with semantic meaning
  - STYLE: Describe line treatment, texture, mood, character rendering per style rules
  - ASPECT: Specify ratio (e.g., 16:9)
- Backup rule: if the prompt file exists, rename it to prompts/NN-{type}-{slug}-backup-YYYYMMDD-HHMMSS.md before writing.

CRITICAL - References in Frontmatter:
- Only add the references field if a sidecar .md description exists in {output-dir}/references/.
- Verify that the sidecar exists (read_file on the .md).

Read the vision_analyze description from the sidecar references/NN-ref-{slug}.md (via read_file) and embed it in the prompt body. image_generate never receives the binary.
| Usage | Action |
|---|---|
| direct | Paste the full reference description (composition, subject, style, palette) into the prompt body |
| style | Append only the style traits: "Style: clean lines, gradient backgrounds..." |
| palette | Append only the hex colors: "Colors: #E8756D coral, #7ECFC0 mint..." |
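The three usage types can be sketched as one function that appends reference-derived text to a prompt body. The name `apply_reference` and its parameters are hypothetical. It also assumes the style traits and hex colors have already been picked out of the sidecar description.

```python
def apply_reference(prompt_body, usage, description="", style_traits=(), hex_colors=()):
    """Append reference-derived text to a prompt body for one usage type."""
    if usage == "direct":
        # Full description: composition, subject, style, palette.
        addition = description.strip()
    elif usage == "style":
        addition = "Style: " + ", ".join(style_traits) if style_traits else ""
    elif usage == "palette":
        pairs = [f"{code} {name}" for code, name in hex_colors]
        addition = "Colors: " + ", ".join(pairs) if pairs else ""
    else:
        raise ValueError(f"unknown usage type: {usage!r}")
    if not addition:
        raise ValueError(f"nothing to append for usage {usage!r}")
    return prompt_body.rstrip() + "\n\n" + addition + "\n"
```

Whatever the usage type, only text reaches the prompt; the reference binary is never involved.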
image_generate returns a JSON blob with a URL ({"success": true, "image": "<url>"}). It does NOT save a local file, does NOT accept an output path, and does NOT let the agent pick a backend/model. Treat the URL as a temporary artifact and download it explicitly.
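A minimal sketch of the glue around image_generate: pulling the URL out of the returned JSON, and mapping a W:H ratio onto the named aspects landscape (16:9), portrait (9:16), and square (1:1). The helper names are hypothetical, and measuring "nearest" as distance in log-ratio space is an assumption; the workflow does not define the metric.

```python
import json
import math

# Named aspects and the ratios they stand for.
NAMED_ASPECTS = {"landscape": 16 / 9, "portrait": 9 / 16, "square": 1.0}


def to_aspect_enum(aspect):
    """Map 'W:H' to the nearest named aspect (assumed metric: log-ratio)."""
    width, height = (float(part) for part in aspect.split(":"))
    ratio = width / height
    # Log space treats 2:1 and 1:2 symmetrically around square.
    return min(NAMED_ASPECTS, key=lambda name: abs(math.log(ratio / NAMED_ASPECTS[name])))


def extract_image_url(response_text):
    """Return the URL from {"success": true, "image": "<url>"} or raise."""
    payload = json.loads(response_text)
    if not payload.get("success") or not payload.get("image"):
        raise RuntimeError(f"image_generate failed: {payload}")
    return payload["image"]
```

The URL returned by `extract_image_url` still has to be downloaded explicitly, since nothing is saved locally by the call itself.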
For each prompt file:
1. Read the prompt file (via read_file) and extract the assembled prompt.
2. Map the prompt's ASPECT to image_generate's enum: 16:9 → landscape, 9:16 → portrait, 1:1 → square. Custom ratios → nearest named aspect.
3. Call image_generate(prompt=<assembled>, aspect_ratio=<enum>) and extract the image URL from the returned JSON.
4. Backup rule: if {output-dir}/NN-{type}-{slug}.png already exists, rename it via terminal (mv "{output-dir}/NN-{type}-{slug}.png" "{output-dir}/NN-{type}-{slug}-backup-YYYYMMDD-HHMMSS.png") before writing.
5. Download the image via terminal:
   curl -sSL -o "{output-dir}/NN-{type}-{slug}.png" "{image_url}"
   If curl is unavailable, fall back to wget -qO "{output-dir}/NN-{type}-{slug}.png" "{image_url}".
6. Verify that the file exists and is non-empty (terminal: test -s "{path}" && echo ok).
7. On generation failure, retry image_generate once. On download failure, retry curl once with a longer timeout. Then log and continue.

Insert after the corresponding paragraph, using the path relative to the article file:
| Input | Insert Path |
|---|---|
| Article file path (default imgs-subdir) |  |
| Article file path (images alongside) |  |
| Article file path (illustrations/ subdirectory) |  |
| Pasted content |  (relative to cwd) |
Alt text: concise description in the article's language.
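Computing the article-relative path can be sketched with the standard library. The helper name `image_markdown` is hypothetical, and the Markdown image syntax in the return value is an assumption about how the illustration is embedded in the article.

```python
import os


def image_markdown(article_path, image_path, alt_text):
    """Return a Markdown image line whose path is relative to the article."""
    article_dir = os.path.dirname(article_path) or "."
    rel = os.path.relpath(image_path, start=article_dir)
    # Use forward slashes so the link also works when built on Windows.
    return f"![{alt_text}]({rel.replace(os.sep, '/')})"
```

Because the path is computed from the article's own directory, the same helper covers the imgs/ default, images placed alongside the article, and an illustrations/ subdirectory.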
Article Illustration Complete!
Article: [path]
Type: [type] | Density: [level] | Style: [style]
Location: [directory]
Images: X/N generated
Positions:
- 01-xxx.png → After "[Section]"
- 02-yyy.png → After "[Section]"
[If failures]
Failed:
- NN-zzz.png: [reason]
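The summary template above can be filled mechanically once the per-image results are known. A rough sketch, with the hypothetical function `completion_report` taking lists of (filename, section) and (filename, reason) pairs:

```python
def completion_report(article, kind, density, style, location, placed, failed=()):
    """Render the completion summary from generated and failed images."""
    total = len(placed) + len(failed)
    lines = [
        "Article Illustration Complete!",
        f"Article: {article}",
        f"Type: {kind} | Density: {density} | Style: {style}",
        f"Location: {location}",
        f"Images: {len(placed)}/{total} generated",
        "Positions:",
    ]
    lines += [f'- {name} → After "{section}"' for name, section in placed]
    if failed:
        # The failure block appears only when something went wrong.
        lines.append("Failed:")
        lines += [f"- {name}: {reason}" for name, reason in failed]
    return "\n".join(lines)
```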