docs/decisions/0041-function-call-content.md
Today, in SK, LLM function calling is supported exclusively by the OpenAI connector, and the function calling model is specific to that connector. At the time of writing this ADR, two new connectors that support function calling are being added, each with its own specific model for function calling. A design in which each new connector introduces its own model classes for function calling does not scale well from the connector development perspective and does not allow SK consumer code to use connectors polymorphically.
Another scenario in which LLM/service-agnostic function calling model classes would be beneficial is enabling agents to pass function calls to one another. In this situation, an agent using the OpenAI Assistant API connector/LLM may pass the function call content/request/model for execution to another agent that is built on top of the OpenAI chat completion API.
This ADR describes the high-level details of the service-agnostic function-calling model classes, while leaving the low-level details to the implementation phase. Additionally, this ADR outlines the identified options for various aspects of the design.
Requirements - https://github.com/microsoft/semantic-kernel/issues/5153
Today, SK relies on connector-specific content classes to communicate the LLM's intent to call function(s) to the SK connector caller:
IChatCompletionService chatCompletionService = kernel.GetRequiredService<IChatCompletionService>();
ChatHistory chatHistory = new ChatHistory();
chatHistory.AddUserMessage("Given the current time of day and weather, what is the likely color of the sky in Boston?");
// The OpenAIChatMessageContent class is specific to OpenAI connectors - OpenAIChatCompletionService, AzureOpenAIChatCompletionService.
OpenAIChatMessageContent result = (OpenAIChatMessageContent)await chatCompletionService.GetChatMessageContentAsync(chatHistory, settings, kernel);
// The ChatCompletionsFunctionToolCall class belongs to the Azure.AI.OpenAI package, which is OpenAI-specific.
List<ChatCompletionsFunctionToolCall> toolCalls = result.ToolCalls.OfType<ChatCompletionsFunctionToolCall>().ToList();
chatHistory.Add(result);
foreach (ChatCompletionsFunctionToolCall toolCall in toolCalls)
{
string content = kernel.Plugins.TryGetFunctionAndArguments(toolCall, out KernelFunction? function, out KernelArguments? arguments) ?
JsonSerializer.Serialize((await function.InvokeAsync(kernel, arguments)).GetValue<object>()) :
"Unable to find function. Please try again!";
chatHistory.Add(new ChatMessageContent(
AuthorRole.Tool,
content,
metadata: new Dictionary<string, object?>(1) { { OpenAIChatMessageContent.ToolIdProperty, toolCall.Id } }));
}
Both OpenAIChatMessageContent and ChatCompletionsFunctionToolCall classes are OpenAI-specific and cannot be used by non-OpenAI connectors. Moreover, using the LLM vendor-specific classes complicates the connector's caller code and makes it impossible to work with connectors polymorphically - referencing a connector through the IChatCompletionService interface while being able to swap its implementations.
To address these issues, we need a mechanism that allows communicating the LLM's intent to call functions to the caller and returning function call results back to the LLM in a service-agnostic manner. Additionally, this mechanism should be extensible enough to support potential multi-modal cases in which the LLM requests function calls and returns other content types in a single response.
Considering that the SK chat completion model classes already support multi-modal scenarios through the ChatMessageContent.Items collection, this collection can also be leveraged for function calling scenarios. Connectors would need to map LLM function calls to service-agnostic function content model classes and add them to the items collection. Meanwhile, connector callers would execute the functions and communicate the execution results back through the items collection as well.
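The mapping step a connector would perform can be sketched as follows. This is a minimal illustration under stated assumptions, not the SK implementation: `ContentItem`, `FunctionCallItem`, `VendorToolCall` and `ConnectorMapping` are hypothetical stand-ins for the real content classes and for a vendor SDK type, and the "-" separator between plugin and function names is assumed from the `GetFullyQualifiedName` default used later in this ADR.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical stand-ins for KernelContent and FunctionCallContent; not the SK types.
public abstract class ContentItem { }

public sealed class FunctionCallItem : ContentItem
{
    public FunctionCallItem(string id, string? pluginName, string functionName, Dictionary<string, object?> arguments)
    {
        Id = id;
        PluginName = pluginName;
        FunctionName = functionName;
        Arguments = arguments;
    }

    public string Id { get; }
    public string? PluginName { get; }
    public string FunctionName { get; }
    public Dictionary<string, object?> Arguments { get; }
}

// Hypothetical vendor-specific tool call, roughly what a connector receives from an LLM.
public sealed record VendorToolCall(string Id, string Name, string ArgumentsJson);

public static class ConnectorMapping
{
    // Maps a vendor tool call to the service-agnostic item a connector would add to Items.
    public static FunctionCallItem ToFunctionCall(VendorToolCall call, string separator = "-")
    {
        // "Plugin-Function" splits into plugin and function; a bare name has no plugin.
        int index = call.Name.IndexOf(separator, StringComparison.Ordinal);
        string? plugin = index > 0 ? call.Name.Substring(0, index) : null;
        string function = index > 0 ? call.Name.Substring(index + separator.Length) : call.Name;

        // Argument values stay as JsonElement here; converting them is out of scope for this sketch.
        Dictionary<string, object?> arguments =
            JsonSerializer.Deserialize<Dictionary<string, object?>>(call.ArgumentsJson)
            ?? new Dictionary<string, object?>();

        return new FunctionCallItem(call.Id, plugin, function, arguments);
    }
}
```

A caller would then find such items in the message's items collection without needing to know which vendor produced them.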
A few options for the service-agnostic function content model classes are being considered below.
This option assumes having one service-agnostic model class - FunctionCallContent to communicate both function call and function result:
class FunctionCallContent : KernelContent
{
public string? Id {get; private set;}
public string? PluginName {get; private set;}
public string FunctionName {get; private set;}
public KernelArguments? Arguments {get; private set; }
public object?/FunctionResult/string? Result {get; private set;} // The type of the property is being described below.
public string GetFullyQualifiedName(string functionNameSeparator = "-") {...}
public Task<FunctionResult> InvokeAsync(Kernel kernel, CancellationToken cancellationToken = default)
{
// 1. Search for the plugin/function in kernel.Plugins collection.
// 2. Create KernelArguments by deserializing Arguments.
// 3. Invoke the function.
}
}
Pros:
Cons:
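Connectors must determine whether the content represents a function call or a function result by analyzing the role of the parent ChatMessageContent in the chat history, as the type itself does not convey its purpose.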
This option proposes having two model classes - FunctionCallContent for communicating function calls to connector callers:
class FunctionCallContent : KernelContent
{
public string? Id {get;}
public string? PluginName {get;}
public string FunctionName {get;}
public KernelArguments? Arguments {get;}
public Exception? Exception {get; init;}
public Task<FunctionResultContent> InvokeAsync(Kernel kernel, CancellationToken cancellationToken = default)
{
// 1. Search for the plugin/function in kernel.Plugins collection.
// 2. Create KernelArguments by deserializing Arguments.
// 3. Invoke the function.
}
public static IEnumerable<FunctionCallContent> GetFunctionCalls(ChatMessageContent messageContent)
{
// Returns list of function calls provided via <see cref="ChatMessageContent.Items"/> collection.
}
}
and FunctionResultContent for communicating function results back to connectors:
class FunctionResultContent : KernelContent
{
public string? Id {get; private set;}
public string? PluginName {get; private set;}
public string? FunctionName {get; private set;}
public object?/FunctionResult/string? Result {get; set;}
public ChatMessageContent ToChatMessage()
{
// Creates <see cref="ChatMessageContent"/> and adds the current instance of the class to the <see cref="ChatMessageContent.Items"/> collection.
}
}
Pros:
ChatMessageContent message.
Cons:
//The GetChatMessageContentAsync method returns only one choice. However, there is a GetChatMessageContentsAsync method that can return multiple choices.
ChatMessageContent messageContent = await completionService.GetChatMessageContentAsync(chatHistory, settings, kernel);
chatHistory.Add(messageContent); // Adding original chat message content containing function call(s) to the chat history
IEnumerable<FunctionCallContent> functionCalls = FunctionCallContent.GetFunctionCalls(messageContent); // Getting list of function calls.
// Alternatively: IEnumerable<FunctionCallContent> functionCalls = messageContent.Items.OfType<FunctionCallContent>();
// Iterating over the requested function calls and invoking them.
foreach (FunctionCallContent functionCall in functionCalls)
{
FunctionResultContent? result = null;
try
{
result = await functionCall.InvokeAsync(kernel); // Resolving the function call in the `Kernel.Plugins` collection and invoking it.
}
catch(Exception ex)
{
chatHistory.Add(new FunctionResultContent(functionCall, ex).ToChatMessage());
// or
//string message = "Error details that LLM can reason about.";
//chatHistory.Add(new FunctionResultContent(functionCall, message).ToChatMessage());
continue;
}
chatHistory.Add(result.ToChatMessage());
// or chatHistory.Add(new ChatMessageContent(AuthorRole.Tool, new ChatMessageContentItemCollection() { result }));
}
// Sending chat history containing function calls and function results to the LLM to get the final response
messageContent = await completionService.GetChatMessageContentAsync(chatHistory, settings, kernel);
The design does not require callers to create an instance of chat message for each function result content. Instead, it allows multiple instances of the function result content to be sent to the connector through a single instance of chat message:
ChatMessageContent messageContent = await completionService.GetChatMessageContentAsync(chatHistory, settings, kernel);
chatHistory.Add(messageContent); // Adding original chat message content containing function call(s) to the chat history.
IEnumerable<FunctionCallContent> functionCalls = FunctionCallContent.GetFunctionCalls(messageContent); // Getting list of function calls.
ChatMessageContentItemCollection items = new ChatMessageContentItemCollection();
// Iterating over the requested function calls and invoking them
foreach (FunctionCallContent functionCall in functionCalls)
{
FunctionResultContent result = await functionCall.InvokeAsync(kernel);
items.Add(result);
}
chatHistory.Add(new ChatMessageContent(AuthorRole.Tool, items));
// Sending chat history containing function calls and function results to the LLM to get the final response
messageContent = await completionService.GetChatMessageContentAsync(chatHistory, settings, kernel);
Option 1.2 was chosen due to its explicit nature.
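The three steps listed in the `InvokeAsync` comments of the chosen option can be sketched roughly as below. All names here are hypothetical stand-ins: `PluginRegistry` plays the role of the `kernel.Plugins` collection, `FunctionStub` plays the role of `KernelFunction`, and the record types only mirror the shape of `FunctionCallContent` and `FunctionResultContent`. The sketch assumes the arguments have already been deserialized and that a missing function surfaces as an exception for the caller to handle.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical delegate standing in for KernelFunction.
public delegate Task<object?> FunctionStub(IReadOnlyDictionary<string, object?> arguments);

// Hypothetical stand-in for the kernel.Plugins collection.
public sealed class PluginRegistry
{
    private readonly Dictionary<string, FunctionStub> _functions = new(StringComparer.OrdinalIgnoreCase);

    public void Register(string pluginName, string functionName, FunctionStub function)
        => _functions[pluginName + "-" + functionName] = function;

    public bool TryGet(string pluginName, string functionName, out FunctionStub? function)
        => _functions.TryGetValue(pluginName + "-" + functionName, out function);
}

// Mirrors the shape of FunctionResultContent; the Id correlates the result with its call.
public sealed record FunctionResultItem(string Id, string PluginName, string FunctionName, object? Result);

// Mirrors the shape of FunctionCallContent.
public sealed record FunctionCallItem(string Id, string PluginName, string FunctionName, IReadOnlyDictionary<string, object?> Arguments)
{
    public async Task<FunctionResultItem> InvokeAsync(PluginRegistry plugins)
    {
        // 1. Search for the plugin/function.
        if (!plugins.TryGet(PluginName, FunctionName, out FunctionStub? function) || function is null)
        {
            throw new KeyNotFoundException($"Function '{PluginName}-{FunctionName}' was not found.");
        }

        // 2. Arguments are assumed to be deserialized already.
        // 3. Invoke the function and wrap the value, keeping the call id for correlation.
        object? value = await function(Arguments);
        return new FunctionResultItem(Id, PluginName, FunctionName, value);
    }
}
```

Throwing on a missing function keeps the error path in the caller's `try`/`catch`, which matches the calling pattern shown above.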
Different chat completion connectors may communicate function calls to the caller and expect function results to be sent back via messages with a connector-specific role. For example, the {Azure}OpenAIChatCompletionService connectors use messages with an Assistant role to communicate function calls to the connector caller and expect the caller to return function results via messages with a Tool role.
The role of a function call message returned by a connector is not important to the caller, as the list of functions can easily be obtained by calling the GetFunctionCalls method, regardless of the role of the response message.
ChatMessageContent messageContent = await completionService.GetChatMessageContentAsync(chatHistory, settings, kernel);
IEnumerable<FunctionCallContent> functionCalls = FunctionCallContent.GetFunctionCalls(messageContent); // Returns the function calls contained in messageContent, if any, regardless of its role.
However, having only one connector-agnostic role for messages to send the function result back to the connector is important for polymorphic usage of connectors. This would allow callers to write code like this:
...
IEnumerable<FunctionCallContent> functionCalls = FunctionCallContent.GetFunctionCalls(messageContent);
foreach (FunctionCallContent functionCall in functionCalls)
{
FunctionResultContent result = await functionCall.InvokeAsync(kernel);
chatHistory.Add(result.ToChatMessage());
}
...
and avoid code like this:
IChatCompletionService chatCompletionService = kernel.GetRequiredService<IChatCompletionService>();
...
IEnumerable<FunctionCallContent> functionCalls = FunctionCallContent.GetFunctionCalls(messageContent);
foreach (FunctionCallContent functionCall in functionCalls)
{
FunctionResultContent result = await functionCall.InvokeAsync(kernel);
// Using connector-specific roles instead of a single connector-agnostic one to send results back to the connector would prevent the polymorphic usage of connectors and force callers to write if/else blocks.
if(chatCompletionService is OpenAIChatCompletionService || chatCompletionService is AzureOpenAIChatCompletionService)
{
chatHistory.Add(new ChatMessageContent(AuthorRole.Tool, new ChatMessageContentItemCollection() { result }));
}
else if(chatCompletionService is AnotherCompletionService)
{
chatHistory.Add(new ChatMessageContent(AuthorRole.Function, new ChatMessageContentItemCollection() { result }));
}
else if(chatCompletionService is SomeOtherCompletionService)
{
chatHistory.Add(new ChatMessageContent(AuthorRole.ServiceSpecificRole, new ChatMessageContentItemCollection() { result }));
}
}
...
It was decided to go with the AuthorRole.Tool role because it is well-known, and conceptually, it can represent function results as well as any other tools that SK will need to support in the future.
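The two halves of this protocol, reading function calls regardless of the message role and always returning results under a single role, can be illustrated with a small sketch. The types below (`MessageRole`, `Item`, `CallItem`, `ResultItem`, `Message`) are hypothetical stand-ins for `AuthorRole`, `KernelContent`, `FunctionCallContent`, `FunctionResultContent` and `ChatMessageContent`; they are not the SK classes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for AuthorRole.
public enum MessageRole { User, Assistant, Tool }

// Hypothetical stand-ins for the content item hierarchy.
public abstract record Item { }
public sealed record TextItem(string Text) : Item;
public sealed record CallItem(string Id, string FunctionName) : Item;

public sealed record ResultItem(string Id, string FunctionName, object? Result) : Item
{
    // Always wraps the result in a Tool-role message, so callers never choose a role per connector.
    public Message ToChatMessage() => new Message(MessageRole.Tool, new List<Item> { this });
}

public sealed record Message(MessageRole Role, IReadOnlyList<Item> Items)
{
    // Filters by item type only; the role of the message is deliberately ignored.
    public IEnumerable<CallItem> GetFunctionCalls() => Items.OfType<CallItem>();
}
```

With this shape, mapping the single agnostic role to whatever role a given service expects becomes the connector's job, and caller code stays free of per-connector branches.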
There are a few data types that can be used for the FunctionResultContent.Result property. The data type in question should allow the following scenarios:
So far, three potential data types have been identified: object, string, and FunctionResult.
class FunctionResultContent : KernelContent
{
// Other members are omitted
public object? Result {get; set;}
}
This option may require the use of JSON converters/resolvers for the {de}serialization of chat history, which contains function results represented by types not supported by JsonSerializer by default.
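The serialization concern can be seen in a small round trip with System.Text.Json. By default, a property declared as `object` is written using its runtime type but read back as a `JsonElement`, so the original type has to be restored explicitly. The sketch below assumes .NET 6 or later (for the `JsonElement` overload of `JsonSerializer.Deserialize`); `ResultEnvelope`, `WeatherAlert` and `ObjectResultRoundTrip` are hypothetical names used only for illustration.

```csharp
using System;
using System.Text.Json;

// Hypothetical stand-in for FunctionResultContent with an object-typed result.
public sealed class ResultEnvelope
{
    public string? Id { get; set; }
    public object? Result { get; set; }
}

// Hypothetical function result type.
public sealed class WeatherAlert
{
    public string? Id { get; set; }
    public string? Text { get; set; }
}

public static class ObjectResultRoundTrip
{
    // Serializes and deserializes the envelope, as persisting a chat history would.
    public static ResultEnvelope RoundTrip(ResultEnvelope envelope)
    {
        string json = JsonSerializer.Serialize(envelope);
        return JsonSerializer.Deserialize<ResultEnvelope>(json)!;
    }

    // After a round trip the result is a JsonElement, so it is converted on demand;
    // before a round trip it is still the original instance.
    public static T? ReadResult<T>(ResultEnvelope envelope) where T : class =>
        envelope.Result is JsonElement element ? element.Deserialize<T>() : envelope.Result as T;
}
```

A custom converter or type resolver would be one way to avoid the explicit conversion step, at the cost of extra configuration.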
Pros:
Cons:
class FunctionResultContent : KernelContent
{
// Other members are omitted
public string? Result {get; set;}
}
Pros:
Cons:
class FunctionResultContent : KernelContent
{
// Other members are omitted
public FunctionResult? Result {get;set;}
public Exception? Exception {get;set;}
// or
public object? Error { get; set; } // Can contain either an instance of an Exception class or a string describing the problem.
}
Pros:
Cons:
FunctionResult is not {de}serializable today:
The FunctionResult.ValueType property is of type Type, which JsonSerializer does not serialize by default because it is considered dangerous.
The same applies to the KernelReturnParameterMetadata.ParameterType and KernelParameterMetadata.ParameterType properties, which are also of type Type.
The FunctionResult.Function property is not deserializable and should be marked with the [JsonIgnore] attribute.
The FunctionResult.Function property has to be nullable. This may be a breaking change for function filter users, because the filters use the FunctionFilterContext class, which exposes an instance of the kernel function via the Function property.
Note: The option below was suggested during a second round of review of this ADR.
This option suggests making the FunctionResult class a derivative of the KernelContent class:
public class FunctionResult : KernelContent
{
....
}
So, instead of having a separate FunctionResultContent class to represent the function result content, the FunctionResult class will inherit from the KernelContent class, becoming the content itself. As a result, the function result returned by the KernelFunction.InvokeAsync method can be directly added to the ChatMessageContent.Items collection:
foreach (FunctionCallContent functionCall in functionCalls)
{
FunctionResult result = await functionCall.InvokeAsync(kernel);
chatHistory.Add(new ChatMessageContent(AuthorRole.Tool, new ChatMessageContentItemCollection { result }));
// instead of
chatHistory.Add(new ChatMessageContent(AuthorRole.Tool, new ChatMessageContentItemCollection { new FunctionResultContent(functionCall, result) }));
// of course, the syntax can be simplified by having additional instance/extension methods
chatHistory.AddFunctionResultMessage(result); // Using the new AddFunctionResultMessage extension method of ChatHistory class
}
Questions:
How should the original FunctionCallContent be passed to connectors along with the function result? It is not yet clear whether this is needed. The current rationale is that some models might expect properties of the original function call, such as arguments, to be passed back to the LLM along with the function result. An argument can be made that the connector can find the original function call in the chat history if needed. A counterargument is that this may not always be possible, because the chat history might be truncated to save tokens, reduce hallucination, etc.
An Exception property could be added to the FunctionResult class and always be assigned by the KernelFunction.InvokeAsync method. However, this change would break C# function calling semantics, where the function should be executed if the contract is satisfied, or an exception should be thrown if the contract is not fulfilled.
If FunctionResult becomes non-streaming content by inheriting the KernelContent class, how can FunctionResult represent the streaming content capabilities of the StreamingKernelContent class when/if that is needed later? C# does not support multiple inheritance.
Pros:
The FunctionResult class becomes content (non-streaming) itself and can be passed to all the places where content is expected.
There is no need for a separate FunctionResultContent class.
Cons:
Coupling between the FunctionResult and KernelContent classes might be a limiting factor preventing each one from evolving independently as they otherwise could.
The FunctionResult.Function property needs to be changed to nullable in order to be serializable, or custom serialization must be applied to {de}serialize the function schema without the function instance itself.
An Id property should be added to the FunctionResult class to represent the function ID required by LLMs.
Originally, it was decided to go with Option 3.1 because it is the most flexible compared to the other two. If a connector needs the function schema, it can easily obtain it from the kernel.Plugins collection available to the connector. The function result metadata can be passed to the connector through the KernelContent.Metadata property.
However, during the second round of review for this ADR, Option 3.4 was suggested for exploration. Finally, after prototyping Option 3.4, it was decided to return to Option 3.1 due to the cons of Option 3.4.
There are cases where an LLM ignores data provided in the prompt because of the model's training. However, the model can work with the same data if it is provided via a function result.
There are a few ways the simulated function can be modeled:
...
ChatMessageContent messageContent = await completionService.GetChatMessageContentAsync(chatHistory, settings, kernel);
// Simulated function call
FunctionCallContent simulatedFunctionCall = new FunctionCallContent(name: "weather-alert", id: "call_123");
messageContent.Items.Add(simulatedFunctionCall); // Adding a simulated function call to the connector response message
chatHistory.Add(messageContent);
// Creating SK function and invoking it
KernelFunction simulatedFunction = KernelFunctionFactory.CreateFromMethod(() => "A Tornado Watch has been issued, with potential for severe ..... Stay informed and follow safety instructions from authorities.");
FunctionResult simulatedFunctionResult = await simulatedFunction.InvokeAsync(kernel);
chatHistory.Add(new ChatMessageContent(AuthorRole.Tool, new ChatMessageContentItemCollection() { new FunctionResultContent(simulatedFunctionCall, simulatedFunctionResult) }));
messageContent = await completionService.GetChatMessageContentAsync(chatHistory, settings, kernel);
...
Pros:
Cons:
...
ChatMessageContent messageContent = await completionService.GetChatMessageContentAsync(chatHistory, settings, kernel);
// Simulated function
FunctionCallContent simulatedFunctionCall = new FunctionCallContent(name: "weather-alert", id: "call_123");
messageContent.Items.Add(simulatedFunctionCall);
chatHistory.Add(messageContent);
// Creating simulated result
string simulatedFunctionResult = "A Tornado Watch has been issued, with potential for severe ..... Stay informed and follow safety instructions from authorities.";
//or
WeatherAlert simulatedFunctionResult = new WeatherAlert { Id = "34SD7RTYE4", Text = "A Tornado Watch has been issued, with potential for severe ..... Stay informed and follow safety instructions from authorities." };
chatHistory.Add(new ChatMessageContent(AuthorRole.Tool, new ChatMessageContentItemCollection() { new FunctionResultContent(simulatedFunctionCall, simulatedFunctionResult) }));
messageContent = await completionService.GetChatMessageContentAsync(chatHistory, settings, kernel);
...
Pros:
Cons:
The provided options are not mutually exclusive; each can be used depending on the scenario.
The design of a service-agnostic function calling model for connectors' streaming API should be similar to the non-streaming one described above.
The streaming API differs from a non-streaming one in that the content is returned in chunks rather than all at once. For instance, OpenAI connectors currently return function calls in two chunks: the function id and name come in the first chunk, while the function arguments are sent in subsequent chunks. Furthermore, the LLM may stream function calls for more than one function in the same response. For example, the first chunk streamed by a connector may have the id and name of the first function, and the following chunk will have the id and name of the second function.
This will require slight deviations in the design of the function-calling model for the streaming API to more naturally accommodate the streaming specifics. In the case of a significant deviation, a separate ADR will be created to outline the details.
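One way a caller or connector might reassemble streamed function calls is sketched below. It is only an illustration of the chunking behavior described above, not a proposed design: `FunctionCallChunk`, `AssembledFunctionCall` and `FunctionCallAssembler` are hypothetical, and the sketch assumes each chunk carries an index identifying which function call it belongs to.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical shape of a streamed function-call update; Index is an assumed correlation key.
public sealed record FunctionCallChunk(int Index, string? Id, string? Name, string? ArgumentsFragment);

// Hypothetical fully assembled function call.
public sealed record AssembledFunctionCall(string? Id, string? Name, string ArgumentsJson);

public sealed class FunctionCallAssembler
{
    // One entry per function call index, kept in index order.
    private readonly SortedDictionary<int, (string? Id, string? Name, StringBuilder Arguments)> _calls = new();

    public void Append(FunctionCallChunk chunk)
    {
        if (!_calls.TryGetValue(chunk.Index, out var entry))
        {
            entry = (null, null, new StringBuilder());
        }

        // Id and name usually arrive once; argument fragments are concatenated in arrival order.
        entry.Arguments.Append(chunk.ArgumentsFragment);
        _calls[chunk.Index] = (chunk.Id ?? entry.Id, chunk.Name ?? entry.Name, entry.Arguments);
    }

    public IReadOnlyList<AssembledFunctionCall> Build() =>
        _calls.Values.Select(c => new AssembledFunctionCall(c.Id, c.Name, c.Arguments.ToString())).ToList();
}
```

Keying by index lets interleaved chunks for several function calls be accumulated independently before any of them is invoked.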