MiniMax - Chatgpt On Wechat

MiniMax supports text chat, image understanding, image generation, and text-to-speech. A single `minimax_api_key` enables all four capabilities.

<Tip> All capabilities below can be configured in one place via the "Model Management" page in the Web Console, with no need to manually edit the configuration file. </Tip>

Text Chat

```json
{
  "model": "MiniMax-M2.7",
  "minimax_api_key": "YOUR_API_KEY"
}
```

| Parameter | Description |
| --- | --- |
| `model` | Can be `MiniMax-M2.7`, `MiniMax-M2.7-highspeed`, `MiniMax-M2.5`, `MiniMax-M2.1`, `MiniMax-M2.1-lightning`, `MiniMax-M2`, etc. |
| `minimax_api_key` | Create one in the MiniMax Console |
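
As a rough illustration of the two text-chat settings, the Python sketch below builds them as a dict and checks a model name against the documented list. The helpers `build_chat_config` and `is_listed_model` and the key `sk-example` are hypothetical, made up for this example and not part of the project. The documented list ends in "etc.", so a name missing from it is not necessarily invalid.

```python
import json

# Model names copied from the parameter table; the table ends in "etc.",
# so this set is not exhaustive.
LISTED_MODELS = {
    "MiniMax-M2.7",
    "MiniMax-M2.7-highspeed",
    "MiniMax-M2.5",
    "MiniMax-M2.1",
    "MiniMax-M2.1-lightning",
    "MiniMax-M2",
}


def build_chat_config(api_key: str, model: str = "MiniMax-M2.7") -> dict:
    """Return the text-chat settings as a dict (hypothetical helper)."""
    # Reject an empty key or the placeholder left over from the sample config.
    if not api_key or api_key == "YOUR_API_KEY":
        raise ValueError("minimax_api_key must be set to a real key")
    return {"model": model, "minimax_api_key": api_key}


def is_listed_model(model: str) -> bool:
    """True if the model appears in the documented (possibly incomplete) list."""
    return model in LISTED_MODELS


if __name__ == "__main__":
    # "sk-example" is a made-up placeholder, not a real key format.
    cfg = build_chat_config("sk-example", "MiniMax-M2.7-highspeed")
    print(json.dumps(cfg, indent=2))
    print("listed:", is_listed_model(cfg["model"]))
```

Treating an unlisted model as a warning rather than an error is probably the safer choice here, since newer models may exist that the table does not name.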

Image Understanding

MiniMax's M2.x chat models do not support vision natively, so all vision calls are routed to `MiniMax-Text-01`. Once `minimax_api_key` is configured, the Agent's Vision tool uses this model automatically; there is no need to specify it in the configuration file.
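
The routing rule described above can be sketched in a few lines of Python. This is only an illustration of the behavior as documented, not the project's actual code; the function `pick_vision_model` is hypothetical.

```python
from typing import Optional

# Vision model named in the documentation for all MiniMax vision calls.
VISION_MODEL = "MiniMax-Text-01"


def pick_vision_model(config: dict) -> Optional[str]:
    """Return the model for vision calls, or None if no MiniMax key is set.

    Hypothetical helper: the configured chat "model" is ignored on purpose,
    because the M2.x chat models do not handle images natively.
    """
    if config.get("minimax_api_key"):
        return VISION_MODEL
    return None


if __name__ == "__main__":
    cfg = {"model": "MiniMax-M2.7", "minimax_api_key": "sk-example"}
    print(pick_vision_model(cfg))
```

Note that the chat `model` value plays no part in the decision, which matches the statement that nothing vision-specific needs to appear in the configuration file.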

Image Generation

```json
{
  "skills": {
    "image-generation": {
      "model": "image-01"
    }
  }
}
```

Available models: image-01.
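
Because the image-generation setting is nested under `skills`, adding it to an existing configuration by hand can overwrite sibling entries if the whole `skills` object is replaced. The sketch below shows one way to merge it in Python; `deep_merge` is a hypothetical helper written for this example, and `some-other-skill` is a made-up entry used only to demonstrate that existing skills survive.

```python
import copy
import json

# The image-generation fragment from the configuration example.
IMAGE_GENERATION = {"skills": {"image-generation": {"model": "image-01"}}}


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict with override merged into base, recursing into dicts."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge them instead of replacing.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


if __name__ == "__main__":
    # "some-other-skill" is a made-up entry, not a documented skill.
    existing = {
        "model": "MiniMax-M2.7",
        "skills": {"some-other-skill": {"enabled": True}},
    }
    print(json.dumps(deep_merge(existing, IMAGE_GENERATION), indent=2))
```

The merge returns a new dict and leaves the input untouched, so the original configuration can still be compared against the result before anything is written to disk.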

Text-to-Speech (TTS)

```json
{
  "text_to_voice": "minimax",
  "text_to_voice_model": "speech-2.8-hd",
  "tts_voice_id": "female-shaonv"
}
```

| Parameter | Description |
| --- | --- |
| `text_to_voice_model` | `speech-2.8-hd` (emotional rendering, natural sound), `speech-2.8-turbo` (ultra-fast), `speech-2.6-hd`, `speech-2.6-turbo` |
| `tts_voice_id` | Voice ID; supports Chinese, Cantonese, English, Japanese, and Korean (70+ voices in total) |

Common voice examples:

| Voice ID | Description |
| --- | --- |
| `female-shaonv` | Chinese · Young Girl (Female) |
| `female-yujie` | Chinese · Mature Lady (Female) |
| `female-tianmei` | Chinese · Sweet Female (Female) |
| `male-qn-jingying` | Chinese · Elite Youth (Male) |
| `male-qn-badao` | Chinese · Dominant Youth (Male) |
| `Cantonese_GentleLady` | Cantonese · Gentle Female Voice |
| `English_Graceful_Lady` | English · Graceful Lady |

For the full voice list (70+ voices across Chinese, Cantonese, English, Japanese, and Korean), see the system voice list, or choose a voice in the Web Console under "Model Management → Text-to-Speech".
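
To tie the TTS settings together, here is a minimal Python sketch that assembles the three keys and rejects a model name outside the four listed in the parameter table. The helper `build_tts_config` is hypothetical. Voice IDs are only checked for being non-empty, since the full list of 70+ voices is not reproduced on this page.

```python
import json

# The four TTS models named in the parameter table.
TTS_MODELS = (
    "speech-2.8-hd",
    "speech-2.8-turbo",
    "speech-2.6-hd",
    "speech-2.6-turbo",
)


def build_tts_config(voice_id: str = "female-shaonv",
                     model: str = "speech-2.8-hd") -> dict:
    """Return the three TTS settings as a dict (hypothetical helper)."""
    if model not in TTS_MODELS:
        raise ValueError(f"unknown TTS model: {model!r}")
    # The full voice list is not known here, so only reject blank IDs.
    if not voice_id.strip():
        raise ValueError("tts_voice_id must not be empty")
    return {
        "text_to_voice": "minimax",
        "text_to_voice_model": model,
        "tts_voice_id": voice_id,
    }


if __name__ == "__main__":
    cfg = build_tts_config("English_Graceful_Lady", "speech-2.8-turbo")
    print(json.dumps(cfg, indent=2))
```

Called with no arguments, the helper reproduces the sample configuration shown at the top of this section.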