Schema for Declarative Agent Format

Context and Problem Statement

This ADR describes a schema which can be used to define an Agent which can be loaded and executed using the Semantic Kernel Agent Framework.

Currently the Agent Framework uses a code first approach to allow Agents to be defined and executed. Using the schema defined by this ADR developers will be able to declaratively define an Agent and have the Semantic Kernel instantiate and execute the Agent.

Here is some pseudo code to illustrate what we need to be able to do:

csharp

Kernel kernel = Kernel
    .CreateBuilder()
    .AddAzureAIClientProvider(...)
    .Build();
var text =
    """
    type: azureai_agent
    name: AzureAIAgent
    description: AzureAIAgent Description
    instructions: AzureAIAgent Instructions
    model:
      id: gpt-4o-mini
    tools:
        - name: tool1
          type: code_interpreter
    """;

AzureAIAgentFactory factory = new();
var agent = await KernelAgentYaml.FromAgentYamlAsync(kernel, text, factory);

The above code represents the simplest case would work as follows:

The Kernel instance has the appropriate services e.g. an instance of AzureAIClientProvider when creating AzureAI agents.
The KernelAgentYaml.FromAgentYamlAsync will create one of the built-in Agent instances i.e., one of ChatCompletionAgent, OpenAIAssistantsAgent, AzureAIAgent.
The new Agent instance is initialized with it's own Kernel instance configured the services and tools it requires and a default initial state.

Note: Consider creating just plain Agent instances and extending the Agent abstraction to contain a method which allows the Agent instance to be invoked with user input.

csharp

Kernel kernel = ...
string text = EmbeddedResource.Read("MyAgent.yaml");
AgentFactory agentFactory = new AggregatorAgentFactory(
    new ChatCompletionAgentFactory(),
    new OpenAIAssistantAgentFactory(),
    new AzureAIAgentFactory());
var agent = KernelAgentYaml.FromAgentYamlAsync(kernel, text, factory);;

The above example shows how different Agent types are supported.

Note:

Markdown with YAML front-matter (i.e. Prompty format) will be the primary serialization format used.
Providing Agent state is not supported in the Agent Framework at present.
We need to decide if the Agent Framework should define an abstraction to allow any Agent to be invoked.
We will support JSON also as an out-of-the-box option.

Currently Semantic Kernel supports three Agent types and these have the following properties:

ChatCompletionAgent:
- Arguments: Optional arguments for the agent. (Inherited from ChatHistoryKernelAgent)
- Description: The description of the agent (optional). (Inherited from Agent)
- HistoryReducer: (Inherited from ChatHistoryKernelAgent)
- Id: The identifier of the agent (optional). (Inherited from Agent)
- Instructions: The instructions of the agent (optional). (Inherited from KernelAgent)
- Kernel: The Kernel containing services, plugins, and filters for use throughout the agent lifetime. (Inherited from KernelAgent)
- Logger: The ILogger associated with this Agent. (Inherited from Agent)
- LoggerFactory: A ILoggerFactory for this Agent. (Inherited from Agent)
- Name: The name of the agent (optional). (Inherited from Agent)
OpenAIAssistantAgent:
- Arguments: Optional arguments for the agent.
- Definition: The assistant definition.
- Description: The description of the agent (optional). (Inherited from Agent)
- Id: The identifier of the agent (optional). (Inherited from Agent)
- Instructions: The instructions of the agent (optional). (Inherited from KernelAgent)
- IsDeleted: Set when the assistant has been deleted via DeleteAsync(CancellationToken). An assistant removed by other means will result in an exception when invoked.
- Kernel: The Kernel containing services, plugins, and filters for use throughout the agent lifetime. (Inherited from KernelAgent)
- Logger: The ILogger associated with this Agent. (Inherited from Agent)
- LoggerFactory: A ILoggerFactory for this Agent. (Inherited from Agent)
- Name: The name of the agent (optional). (Inherited from Agent)
- PollingOptions: Defines polling behavior
AzureAIAgent
- Definition: The assistant definition.
- PollingOptions: Defines polling behavior for run processing.
- Description: The description of the agent (optional). (Inherited from Agent)
- Id: The identifier of the agent (optional). (Inherited from Agent)
- Instructions: The instructions of the agent (optional). (Inherited from KernelAgent)
- IsDeleted: Set when the assistant has been deleted via DeleteAsync(CancellationToken). An assistant removed by other means will result in an exception when invoked.
- Kernel: The Kernel containing services, plugins, and filters for use throughout the agent lifetime. (Inherited from KernelAgent)
- Logger: The ILogger associated with this Agent. (Inherited from Agent)
- LoggerFactory: A ILoggerFactory for this Agent. (Inherited from Agent)
- Name: The name of the agent (optional). (Inherited from Agent)

When executing an Agent that was defined declaratively some of the properties will be determined by the runtime:

Kernel: The runtime will be responsible for create the Kernel instance to be used by the Agent. This Kernel instance must be configured with the models and tools that the Agent requires.
Logger or LoggerFactory: The runtime will be responsible for providing a correctly configured Logger or LoggerFactory.
Functions: The runtime must be able to resolve any functions required by the Agent. E.g. the VSCode extension will provide a very basic runtime to allow developers to test Agents and it should be able to resolve KernelFunctions defined in the current project. See later in the ADR for an example of this.

For Agent properties that define behaviors e.g. HistoryReducer the Semantic Kernel SHOULD:

Provide implementations that can be configured declaratively i.e., for the most common scenarios we expect developers to encounter.
Allow implementations to be resolved from the Kernel e.g., as required services or possibly KernelFunction's.

Decision Drivers

Schema MUST be Agent Service agnostic i.e., will work with Agents targeting Azure, Open AI, Mistral AI, ...
Schema MUST allow model settings to be assigned to an Agent.
Schema MUST allow tools (e.g. functions, code interpreter, file search, ...) to be assigned to an Agent.
Schema MUST allow new types of tools to be defined for an Agent to use.
Schema MUST allow a Semantic Kernel prompt (including Prompty format) to be used to define the Agent instructions.
Schema MUST be extensible so that support for new Agent types with their own settings and tools, can be added to Semantic Kernel.
Schema MUST allow third parties to contribute new Agent types to Semantic Kernel.
…

The document will describe the following use cases:

Metadata about the agent and the file.
Creating an Agent with access to function tools and a set of instructions to guide it's behavior.
Allow templating of Agent instructions (and other properties).
Configuring the model and providing multiple model configurations.
Configuring data sources (context/knowledge) for the Agent to use.
Configuring additional tools for the Agent to use e.g. code interpreter, OpenAPI endpoints, .
Enabling additional modalities for the Agent e.g. speech.
Error conditions e.g. models or function tools not being available.

Out of Scope

This ADR does not cover the multi-agent declarative format or the process declarative format

Considered Options

Use the Declarative agent schema 1.2 for Microsoft 365 Copilot
Extend the Declarative agent schema 1.2 for Microsoft 365 Copilot
Extend the Semantic Kernel prompt schema

Pros and Cons of the Options

Use the Declarative agent schema 1.2 for Microsoft 365 Copilot

Semantic Kernel already has support this, see the declarative Agent concept sample.

Good, this is an existing standard adopted by the Microsoft 365 Copilot.
Neutral, the schema splits tools into two properties i.e. capabilities which includes code interpreter and actions which specifies an API plugin manifest.
Bad, because it does support different types of Agents.
Bad, because it doesn't provide a way to specific and configure the AI Model to associate with the Agent.
Bad, because it doesn't provide a way to use a Prompt Template for the Agent instructions.
Bad, because actions property is focussed on calling REST API's and cater for native and semantic functions.

Extend the Declarative agent schema 1.2 for Microsoft 365 Copilot

Some of the possible extensions include:

Agent instructions can be created using a Prompt Template.
Agent Model settings can be specified including fallbacks based on the available models.
Better definition of functions e.g. support for native and semantic.

Good, because {argument a}
Good, because {argument b}
Neutral, because {argument c}
Bad, because {argument d}
…

Extend the Semantic Kernel Prompt Schema

Good, because {argument a}
Good, because {argument b}
Neutral, because {argument c}
Bad, because {argument d}
…

Decision Outcome

Chosen option: "{title of option 1}", because {justification. e.g., only option, which meets k.o. criterion decision driver | which resolves force {force} | … | comes out best (see below)}.

Consequences

Good, because {positive consequence, e.g., improvement of one or more desired qualities, …}
Bad, because {negative consequence, e.g., compromising one or more desired qualities, …}
…

Validation

{describe how the implementation of/compliance with the ADR is validated. E.g., by a review or an ArchUnit test}

More Information

Code First versus Declarative Format

Below are examples showing the code first and equivalent declarative syntax for creating different types of Agents.

Consider the following use cases:

ChatCompletionAgent
ChatCompletionAgent using Prompt Template
ChatCompletionAgent with Function Calling
OpenAIAssistantAgent with Function Calling
OpenAIAssistantAgent with Tools

`ChatCompletionAgent`

Code first approach:

csharp

ChatCompletionAgent agent =
    new()
    {
        Name = "Parrot",
        Instructions = "Repeat the user message in the voice of a pirate and then end with a parrot sound.",
        Kernel = kernel,
    };

Declarative Semantic Kernel schema:

yml

type: chat_completion_agent
name: Parrot
instructions: Repeat the user message in the voice of a pirate and then end with a parrot sound.

Note:

ChatCompletionAgent could be the default agent type hence no explicit type property is required.

`ChatCompletionAgent` using Prompt Template

Code first approach:

csharp

string generateStoryYaml = EmbeddedResource.Read("GenerateStory.yaml");
PromptTemplateConfig templateConfig = KernelFunctionYaml.ToPromptTemplateConfig(generateStoryYaml);

ChatCompletionAgent agent =
    new(templateConfig, new KernelPromptTemplateFactory())
    {
        Kernel = this.CreateKernelWithChatCompletion(),
        Arguments = new KernelArguments()
        {
            { "topic", "Dog" },
            { "length", "3" },
        }
    };

Agent YAML points to another file, the Declarative Agent implementation in Semantic Kernel already uses this technique to load a separate instructions file.

Prompt template which is used to define the instructions.

yml

---
name: GenerateStory
description: A function that generates a story about a topic.  
template:
  format: semantic-kernel
  parser: semantic-kernel
inputs:
  - name: topic
    description: The topic of the story.
    is_required: true
    default: dog
  - name: length
    description: The number of sentences in the story.
    is_required: true
    default: 3
---
Tell a story about {{$topic}} that is {{$length}} sentences long.

Note: Semantic Kernel could load this file directly.

`ChatCompletionAgent` with Function Calling

Code first approach:

csharp

ChatCompletionAgent agent =
    new()
    {
        Instructions = "Answer questions about the menu.",
        Name = "RestaurantHost",
        Description = "This agent answers questions about the menu.",
        Kernel = kernel,
        Arguments = new KernelArguments(new OpenAIPromptExecutionSettings() { Temperature = 0.4, FunctionChoiceBehavior = FunctionChoiceBehavior.Auto() }),
    };

KernelPlugin plugin = KernelPluginFactory.CreateFromType<MenuPlugin>();
agent.Kernel.Plugins.Add(plugin);

Declarative using Semantic Kernel schema:

yml

---
name: RestaurantHost
name: RestaurantHost
description: This agent answers questions about the menu.
model:
  id: gpt-4o-mini
  options:
    temperature: 0.4
    function_choice_behavior:
      type: auto
      functions:
        - MenuPlugin.GetSpecials
        - MenuPlugin.GetItemPrice
---
Answer questions about the menu.

`OpenAIAssistantAgent` with Function Calling

Code first approach:

csharp

OpenAIAssistantAgent agent =
    await OpenAIAssistantAgent.CreateAsync(
        clientProvider: this.GetClientProvider(),
        definition: new OpenAIAssistantDefinition("gpt_4o")
        {
            Instructions = "Answer questions about the menu.",
            Name = "RestaurantHost",
            Metadata = new Dictionary<string, string> { { AssistantSampleMetadataKey, bool.TrueString } },
        },
        kernel: new Kernel());

KernelPlugin plugin = KernelPluginFactory.CreateFromType<MenuPlugin>();
agent.Kernel.Plugins.Add(plugin);

Declarative using Semantic Kernel schema:

Using the syntax below the assistant does not have the functions included in it's definition. The functions must be added to the Kernel instance associated with the Agent and will be passed when the Agent is invoked.

yml

---
name: RestaurantHost
type: openai_assistant
description: This agent answers questions about the menu.
model:
  id: gpt-4o-mini
  options:
    temperature: 0.4
    function_choice_behavior:
      type: auto
      functions:
        - MenuPlugin.GetSpecials
        - MenuPlugin.GetItemPrice
    metadata:
      sksample: true
---
Answer questions about the menu.
``

or

```yml
---
name: RestaurantHost
type: openai_assistant
description: This agent answers questions about the menu.
execution_settings:
  default:
    temperature: 0.4
tools:
  - type: function
    name: MenuPlugin-GetSpecials
    description: Provides a list of specials from the menu.
  - type: function
    name: MenuPlugin-GetItemPrice
    description: Provides the price of the requested menu item.
    parameters: '{"type":"object","properties":{"menuItem":{"type":"string","description":"The name of the menu item."}},"required":["menuItem"]}'
---
Answer questions about the menu.

Note: The Kernel instance used to create the Agent must have an instance of OpenAIClientProvider registered as a service.

`OpenAIAssistantAgent` with Tools

Code first approach:

csharp

OpenAIAssistantAgent agent =
    await OpenAIAssistantAgent.CreateAsync(
        clientProvider: this.GetClientProvider(),
        definition: new(this.Model)
        {
            Instructions = "You are an Agent that can write and execute code to answer questions.",
            Name = "Coder",
            EnableCodeInterpreter = true,
            EnableFileSearch = true,
            Metadata = new Dictionary<string, string> { { AssistantSampleMetadataKey, bool.TrueString } },
        },
        kernel: new Kernel());

Declarative using Semantic Kernel:

yml

---
name: Coder
type: openai_assistant
tools:
    - type: code_interpreter
    - type: file_search
---
You are an Agent that can write and execute code to answer questions.

Declarative Format Use Cases

Metadata about the agent and the file

yaml

name: RestaurantHost
type: azureai_agent
description: This agent answers questions about the menu.
version: 0.0.1

These are optional elements. Feel free to remove any of them.

Schema for Declarative Agent Format

Context and Problem Statement

Decision Drivers

Out of Scope

Considered Options

Pros and Cons of the Options

Use the Declarative agent schema 1.2 for Microsoft 365 Copilot

Extend the Declarative agent schema 1.2 for Microsoft 365 Copilot

Extend the Semantic Kernel Prompt Schema

Decision Outcome

Consequences

Validation

More Information

Code First versus Declarative Format

ChatCompletionAgent

ChatCompletionAgent using Prompt Template

ChatCompletionAgent with Function Calling

OpenAIAssistantAgent with Function Calling

OpenAIAssistantAgent with Tools

Declarative Format Use Cases

Metadata about the agent and the file

Creating an Agent with access to function tools and a set of instructions to guide it's behavior

Allow templating of Agent instructions (and other properties)

Configuring the model and providing multiple model configurations

Configuring data sources (context/knowledge) for the Agent to use

Configuring additional tools for the Agent to use e.g. code interpreter, OpenAPI endpoints

Enabling additional modalities for the Agent e.g. speech

Error conditions e.g. models or function tools not being available

`ChatCompletionAgent`

`ChatCompletionAgent` using Prompt Template

`ChatCompletionAgent` with Function Calling

`OpenAIAssistantAgent` with Function Calling

`OpenAIAssistantAgent` with Tools