dotnet/website/tutorial/Image-chat-with-agent.md
This tutorial shows how to chat with an agent about images, using @AutoGen.OpenAI.OpenAIChatAgent as an example.
[!NOTE] To chat with an agent about images, the model behind the agent must support image input. Here is a partial list of models that support image input:
- gpt-4o
- gemini-1.5
- llava
- claude-3
- ...
In this example, we are using the gpt-4o model as the backend model for the agent.
[!NOTE] The complete code example can be found in Image_Chat_With_Agent.cs
First, install the AutoGen package using the following command:
dotnet add package AutoGen
[!code-csharpUsing Statements]
[!code-csharpCreate an OpenAIChatAgent]
In AutoGen, you can create an image message using either @AutoGen.Core.ImageMessage or @AutoGen.Core.MultiModalMessage. @AutoGen.Core.ImageMessage takes a single image as input, whereas @AutoGen.Core.MultiModalMessage lets you combine multiple modalities, such as text and images, in one message.
Here is how to create an image message using @AutoGen.Core.ImageMessage:
[!code-csharpCreate Image Message]
Here is how to create a multimodal message using @AutoGen.Core.MultiModalMessage:
[!code-csharpCreate MultiModal Message]
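The two message types described above can be sketched as follows. This is a minimal illustration, not the code from Image_Chat_With_Agent.cs: the class and method names (`ImageMessageSketch`, `CreateImageMessage`, `CreateMultiModalMessage`) are hypothetical, and it assumes the `ImageMessage(Role, BinaryData)` and `MultiModalMessage(Role, IEnumerable<IMessage>)` constructors from AutoGen.Core, as well as a `BinaryData.FromBytes` overload that accepts a media type.

```csharp
using System;
using AutoGen.Core;

// Hypothetical helper class, for illustration only.
public static class ImageMessageSketch
{
    // Wraps raw image bytes in an ImageMessage sent with the user role.
    // Assumption: the media type (for example "image/png") must be set on the BinaryData.
    public static ImageMessage CreateImageMessage(byte[] imageBytes, string mediaType = "image/png")
    {
        return new ImageMessage(Role.User, BinaryData.FromBytes(imageBytes, mediaType));
    }

    // Combines a text question and an image into a single multimodal message.
    public static MultiModalMessage CreateMultiModalMessage(string question, ImageMessage image)
    {
        var text = new TextMessage(Role.User, question);
        return new MultiModalMessage(Role.User, new IMessage[] { text, image });
    }
}
```

In practice the bytes would typically come from a file, for example `File.ReadAllBytes(path)`, with the media type chosen to match the image format.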
To generate a response, use one of the overloads of @AutoGen.Core.AgentExtension.SendAsync*. The following code shows how to generate a response to an image message:
[!code-csharpGenerate Response]
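As a rough sketch of the step above, the helper below sends a question about an image to any `IAgent` and returns the reply text. The names `ImageChatSketch` and `AskAboutImageAsync` are hypothetical. It assumes the `SendAsync(string, chatHistory)` overload and the `GetContent()` extension from AutoGen.Core, and that the agent passed in already understands AutoGen message types (for an OpenAIChatAgent, this usually means a message connector has been registered).

```csharp
using System.Threading.Tasks;
using AutoGen.Core;

// Hypothetical helper class, for illustration only.
public static class ImageChatSketch
{
    // Sends the question as text and supplies the image through the chat history.
    public static async Task<string?> AskAboutImageAsync(IAgent agent, ImageMessage image, string question)
    {
        IMessage reply = await agent.SendAsync(question, chatHistory: new IMessage[] { image });

        // GetContent() returns the text of the reply, or null if the reply has no text content.
        return reply.GetContent();
    }
}
```

A multimodal message that already bundles the text and the image could instead be passed directly, as in `await agent.SendAsync(multiModalMessage)`.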