image-generation - Image Generation - Chatgpt On Wechat

A general-purpose image generation and editing skill supporting six providers: OpenAI, Gemini, Seedream (Volcengine Ark), Qwen (DashScope), MiniMax, and LinkAI. Configure any one provider's key to start using it; configure multiple to enable automatic fallback.

Supported Models

Provider	Models / Aliases	Notes
OpenAI	`gpt-image-2`, `gpt-image-1`	General-purpose, high quality, supports `quality` parameter
Gemini Nano Banana	`nano-banana-2`, `nano-banana-pro`, `nano-banana`	Correspond, in that order, to the image variants of `gemini-3.1-flash`, `gemini-3-pro`, `gemini-2.5-flash`
Seedream (Volcengine Ark)	`seedream-5.0-lite`, `seedream-4.5`	Native 2K–4K, up to 14 reference images for fusion
Qwen (DashScope)	`qwen-image-2.0`, `qwen-image-2.0-pro`	Strong with Chinese text rendering and text-image layouts
MiniMax	`image-01`	Fast and simple
LinkAI	Any model	Universal gateway, used as fallback

Model Selection

By default, "auto routing + automatic fallback" is used:

Pick the first configured provider in the order OpenAI → Gemini → Seedream → Qwen → MiniMax → LinkAI
On errors such as 401, model not enabled, or network issues, automatically switch to the next provider
If the user specifies a model in the conversation (e.g. "use seedream to draw a cat"), the corresponding provider is promoted to the front
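
The routing rules above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the lowercase provider identifiers, `ProviderError`, `build_route`, and `generate_with_fallback` are hypothetical names introduced here, and the `call` argument stands in for whatever function actually talks to a provider.

```python
from typing import Callable, Optional

# Default priority order, as listed above (identifiers are illustrative).
PROVIDER_ORDER = ["openai", "gemini", "seedream", "qwen", "minimax", "linkai"]


class ProviderError(Exception):
    """Hypothetical error for a failed call (401, model not enabled, network issue)."""


def build_route(configured: set[str], requested: Optional[str] = None) -> list[str]:
    """Return the configured providers in priority order, requested provider first."""
    route = [p for p in PROVIDER_ORDER if p in configured]
    if requested in route:
        # A provider named by the user is promoted to the front of the route.
        route.remove(requested)
        route.insert(0, requested)
    return route


def generate_with_fallback(route: list[str], call: Callable[[str], str]) -> str:
    """Try each provider in turn and fall through to the next one on failure."""
    errors = []
    for provider in route:
        try:
            return call(provider)
        except ProviderError as exc:
            errors.append(f"{provider}: {exc}")
    raise ProviderError("all providers failed: " + "; ".join(errors))
```

With OpenAI, Seedream, and Qwen configured, `build_route({"openai", "seedream", "qwen"}, "seedream")` puts Seedream first and keeps the remaining providers in their default order as fallbacks.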

To pin a specific model:

json

{
  "skills": {
    "image-generation": {
      "model": "seedream-5.0-lite"
    }
  }
}

Configuring API Keys

<Tip> It is recommended to configure providers from the "Model Management" page in the [Web Console](/channels/web). Chat model keys configured there are automatically reused by the image generation skill — no need to set them twice. You can also edit the configuration file manually or temporarily set keys in a conversation using the `env_config` tool. </Tip>

Credentials are shared with the main model providers:

Field	Provider
`openai_api_key`	OpenAI
`gemini_api_key`	Gemini
`ark_api_key`	Volcengine Ark (Seedream)
`dashscope_api_key`	Alibaba DashScope (Qwen)
`minimax_api_key`	MiniMax
`linkai_api_key`	LinkAI
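
As a rough sketch of how the table above maps onto provider availability, the snippet below lists the providers whose key field is set. It assumes the configuration has already been loaded into a flat dictionary and that an empty or whitespace-only value counts as "not configured"; `configured_providers` is a hypothetical helper, not part of the project.

```python
# Config field -> provider, taken from the table above.
KEY_FIELDS = {
    "openai_api_key": "OpenAI",
    "gemini_api_key": "Gemini",
    "ark_api_key": "Seedream",
    "dashscope_api_key": "Qwen",
    "minimax_api_key": "MiniMax",
    "linkai_api_key": "LinkAI",
}


def configured_providers(config: dict) -> list[str]:
    """Return the providers whose key field is present and non-empty."""
    return [
        provider
        for field, provider in KEY_FIELDS.items()
        # Assumption: blank strings and None both mean "no key".
        if str(config.get(field) or "").strip()
    ]
```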

Enabling and Disabling

The skill automatically adjusts its status based on API keys:

Key configured: the Agent calls the skill directly when it receives a drawing request
Key not configured: the skill still appears in context (marked as "needs configuration") — the Agent will guide the user to set up a key

To control it manually:

text

/skill disable image-generation    # Disable
/skill enable image-generation     # Re-enable

Equivalent terminal commands: `cow skill disable image-generation` / `cow skill enable image-generation`.

Parameters

Parameter	Type	Required	Default	Description
`prompt`	string	Yes	—	Image description
`image_url`	string / list	No	null	Input image for editing — local path or URL; pass a list for multi-image fusion
`quality`	string	No	auto	`low` / `medium` / `high`, supported only by some providers
`size`	string	No	auto	`512` / `1K` / `2K` / `3K` / `4K`, or an explicit pixel size such as `1024x1024`
`aspect_ratio`	string	No	null	`1:1` / `3:2` / `2:3` / `16:9` / `9:16` / `21:9`; Gemini also supports `1:4` / `4:1` / `1:8` / `8:1`
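
The constraints in the table can be checked before a request is sent. The following is a minimal sketch under a few assumptions: values are matched case-sensitively, `auto` is accepted as an explicit value for `quality` and `size`, and the provider is identified by a lowercase string such as `"gemini"`. `validate_params` is a hypothetical helper, not a function exposed by the skill.

```python
import re

QUALITIES = {"auto", "low", "medium", "high"}
SIZE_PRESETS = {"auto", "512", "1K", "2K", "3K", "4K"}
ASPECT_RATIOS = {"1:1", "3:2", "2:3", "16:9", "9:16", "21:9"}
GEMINI_EXTRA_RATIOS = {"1:4", "4:1", "1:8", "8:1"}
PIXEL_SIZE = re.compile(r"^[1-9]\d*x[1-9]\d*$")  # e.g. 1024x1024


def validate_params(params: dict, provider: str = "") -> list[str]:
    """Return a list of problems; an empty list means the parameters look valid."""
    problems = []

    prompt = params.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt is required")

    image_url = params.get("image_url")
    if image_url is not None:
        # A single string and a list of strings are both accepted.
        urls = image_url if isinstance(image_url, list) else [image_url]
        if not urls or not all(isinstance(u, str) and u for u in urls):
            problems.append("image_url must be a string or a non-empty list of strings")

    quality = str(params.get("quality", "auto"))
    if quality not in QUALITIES:
        problems.append(f"unsupported quality: {quality}")

    size = str(params.get("size", "auto"))
    if size not in SIZE_PRESETS and not PIXEL_SIZE.match(size):
        problems.append(f"unsupported size: {size}")

    ratio = params.get("aspect_ratio")
    allowed = ASPECT_RATIOS | (GEMINI_EXTRA_RATIOS if provider == "gemini" else set())
    if ratio is not None and str(ratio) not in allowed:
        problems.append(f"unsupported aspect_ratio: {ratio}")

    return problems
```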

<Warning> **Higher quality and larger size cost more and take longer.** For everyday conversations, use the defaults (`auto`) or `quality=low` + `size=1K`, which takes about 20 seconds per image. For posters, or when high resolution is explicitly requested, use `quality=high` + `size=2K/4K`, which may take 1–5 minutes. </Warning>

Common Use Cases

Text-to-image: generate illustrations, posters, icons, avatars, storyboards, etc. from a description
Image-to-image: change styles, swap elements, add decorations or text on an existing image
Multi-image fusion: combine multiple reference images into one (outfit swaps, character group photos, etc.)
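
In terms of the parameters described earlier, the three use cases differ mainly in `image_url`. The parameter sets below are illustrative only: the prompts, file paths, and URLs are placeholders, and `use_case` is a hypothetical helper showing one way the cases could be told apart.

```python
# Illustrative parameter sets; prompts, paths, and URLs are placeholders.
text_to_image = {
    "prompt": "A flat-style app icon of a paper plane",
    "size": "1K",
    "aspect_ratio": "1:1",
}
image_to_image = {
    "prompt": "Turn this photo into a watercolor painting",
    "image_url": "./photos/cat.jpg",
}
multi_image_fusion = {
    "prompt": "Dress the person from the first image in the jacket from the second",
    "image_url": ["https://example.com/person.png", "./outfits/jacket.png"],
}


def use_case(params: dict) -> str:
    """Classify a parameter set by the shape of its image_url value."""
    image_url = params.get("image_url")
    if not image_url:
        return "text-to-image"
    if isinstance(image_url, list) and len(image_url) > 1:
        return "multi-image fusion"
    return "image-to-image"
```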

<Note>
- Bash timeout should be set to 600 seconds: each provider has a 300-second HTTP timeout, and the script may try multiple providers sequentially
- Input images are automatically compressed to ≤ 4 MB with the longest edge ≤ 4096 px
- Gemini / Seedream / Qwen / MiniMax do not support the `quality` parameter
- Seedream defaults to 2K; `seedream-5.0-lite` supports up to 3K; `seedream-4.5` supports up to 4K
</Note>
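
One way to picture the provider-specific notes above is as a normalization step applied before each request. The sketch below is an assumption about how such a step could look, not a description of the skill's internals: it drops `quality` for providers that do not support it, fills in Seedream's 2K default, and clamps preset sizes to the per-model maximum. `adapt_params` and the lowercase provider identifiers are hypothetical.

```python
# Providers that do not support the quality parameter (from the notes above).
NO_QUALITY = {"gemini", "seedream", "qwen", "minimax"}
# Maximum preset size per Seedream model (from the notes above).
SEEDREAM_MAX_SIZE = {"seedream-5.0-lite": "3K", "seedream-4.5": "4K"}
SIZE_RANK = ["512", "1K", "2K", "3K", "4K"]  # smallest to largest


def adapt_params(provider: str, model: str, params: dict) -> dict:
    """Return a copy of params adjusted to what the provider and model support."""
    adapted = dict(params)
    if provider in NO_QUALITY:
        adapted.pop("quality", None)
    if provider == "seedream":
        size = adapted.get("size", "auto")
        if size == "auto":
            # Assumption: the 2K default is applied by setting size explicitly.
            adapted["size"] = "2K"
        elif size in SIZE_RANK and model in SEEDREAM_MAX_SIZE:
            cap = SEEDREAM_MAX_SIZE[model]
            if SIZE_RANK.index(size) > SIZE_RANK.index(cap):
                # Assumption: oversized requests are clamped rather than rejected.
                adapted["size"] = cap
    return adapted
```

For example, a request with `quality=high` and `size=4K` routed to `seedream-5.0-lite` would go out without `quality` and with `size=3K` under these assumptions.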