Back to Chatgpt On Wechat

Gemini

docs/en/models/gemini.mdx

2.0.91.6 KB
Original Source

Google Gemini supports text chat, image understanding, and image generation (Nano Banana series). A single gemini_api_key enables all capabilities.

<Tip> All capabilities below can be configured in one place via the "Model Management" page in the Web Console, with no need to manually edit the configuration file. </Tip>

Text Chat

json
{
  "model": "gemini-3.5-flash",
  "gemini_api_key": "YOUR_API_KEY"
}
ParameterDescription
modelRecommended: gemini-3.5-flash; also supports gemini-3.1-pro-preview, gemini-3.1-flash-lite-preview, gemini-3-flash-preview, gemini-3-pro-preview, etc. See official docs
gemini_api_keyCreate one in Google AI Studio
gemini_api_baseOptional, defaults to https://generativelanguage.googleapis.com. Can be changed to a third-party proxy

Image Understanding

All Gemini models natively support vision. Once gemini_api_key is configured, the Agent's Vision tool automatically uses the main model to recognize images, with no extra setup required.

To manually specify a Vision model:

json
{
  "tools": {
    "vision": {
      "model": "gemini-3.1-flash-lite-preview"
    }
  }
}

Image Generation

json
{
  "skills": {
    "image-generation": {
      "model": "gemini-3.1-flash-image-preview"
    }
  }
}
Model IDAlias
gemini-3.1-flash-image-previewNano Banana 2
gemini-3-pro-image-previewNano Banana Pro
gemini-2.5-flash-imageNano Banana