Spring AI Alibaba Multimodal Examples

Image understanding, vision agent, creative generation, speech synthesis

1. runImageFromUrl - Image URL understanding

Given a public image URL, describe the image directly with ChatModel.

Submit

Upload a local image file, or use a classpath resource path (e.g. images/sample.png).

Upload image: Submit (uploaded file)

Or use a classpath resource:

Submit (classpath resource)
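The classpath option above can be sketched with the Java standard library alone. This is a minimal illustration, not the demo's actual implementation: the class name `ImageSource` and both helper methods are hypothetical, and the extension-to-MIME mapping is an assumption about which image types would be accepted.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper (not part of Spring AI Alibaba): resolves an image
// given as a classpath resource path such as "images/sample.png".
final class ImageSource {

    // Guess a MIME type from the file extension; unknown extensions fall back
    // to a generic binary type. The list of extensions is an assumption.
    static String guessImageMimeType(String path) {
        String lower = path.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".png")) return "image/png";
        if (lower.endsWith(".jpg") || lower.endsWith(".jpeg")) return "image/jpeg";
        if (lower.endsWith(".gif")) return "image/gif";
        if (lower.endsWith(".webp")) return "image/webp";
        return "application/octet-stream";
    }

    // Read the resource bytes from the classpath; returns an empty Optional
    // when no resource exists at the given path.
    static Optional<byte[]> readClasspathImage(String path) {
        // ClassLoader lookups expect paths without a leading slash.
        String normalized = path.startsWith("/") ? path.substring(1) : path;
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ImageSource.class.getClassLoader();
        }
        try (InputStream in = loader.getResourceAsStream(normalized)) {
            return in == null ? Optional.empty() : Optional.of(in.readAllBytes());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String path = "images/sample.png";
        System.out.println(guessImageMimeType(path));
        System.out.println(readClasspathImage(path).isPresent()
                ? "resource found"
                : "resource not found on the classpath");
    }
}
```

The bytes and MIME type returned here are what a multimodal message would typically need; how they are handed to the model depends on the Spring AI version in use and is not shown.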

Use ReactAgent to understand multimodal input (image + text).

Submit

Generate images or audio through tools (requires ImageModel and TTSModel to be configured).

Generate an image of a cute cat in a garden, watercolor style.

Submit

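Synthesize speech with DashScope (requires DashScopeAudioSpeechModel to be configured). With the base64 output format the audio can be played directly in the page; with url, the result is the address of a temporary file on the server.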

Hello, this is a text to speech demo.

Output format: base64 (play in page) / url (server file address)

Synthesize
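One common way to make base64 audio playable in a page is to wrap it in a data URI that an audio element can use as its source. The sketch below shows that idea with the standard library only. It is an assumption about how the base64 option could work, not the demo's confirmed behavior: the class `AudioDataUri`, the method `toDataUri`, and the `audio/mpeg` MIME type are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper (not part of the DashScope API): turns raw audio bytes
// into a data URI that a browser <audio> element can use as its src.
final class AudioDataUri {

    // mimeType is supplied by the caller; "audio/mpeg" below is an assumed
    // value, since the source does not state the synthesized audio format.
    static String toDataUri(byte[] audio, String mimeType) {
        String encoded = Base64.getEncoder().encodeToString(audio);
        return "data:" + mimeType + ";base64," + encoded;
    }

    public static void main(String[] args) {
        // Placeholder bytes stand in for real synthesized audio.
        byte[] fakeAudio = "hi".getBytes(StandardCharsets.US_ASCII);
        System.out.println(toDataUri(fakeAudio, "audio/mpeg"));
        // prints "data:audio/mpeg;base64,aGk="
    }
}
```

The url format avoids embedding the payload in the response, at the cost of a second request and a temporary file that the server has to clean up.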