Agent Framework - Assistant V2 Migration

docs/decisions/0049-agents-assistantsV2.md

Context and Problem Statement

OpenAI has released the Assistants V2 API. It builds on the V1 assistant concept but also invalidates certain V1 features. In addition, the dotnet API that supports Assistant V2 features diverges entirely from the Azure.AI.OpenAI.Assistants SDK that is currently in use.

Open Issues

  • Streaming: To be addressed as a discrete feature

Design

Migrating to Assistant V2 API is a breaking change to the existing package due to:

  • Underlying capability differences (e.g. file-search vs retrieval)
  • Underlying V2 SDK is version incompatible with V1 (OpenAI and Azure.AI.OpenAI)

Agent Implementation

The OpenAIAssistantAgent is roughly equivalent to its V1 form, save for:

  • Supports options for assistant, thread, and run
  • Agent definition shifts to Definition property
  • Convenience methods for producing an OpenAI client

Previously, the agent definition was exposed via direct properties such as:

  • FileIds
  • Metadata

This has all been shifted and expanded upon via the Definition property, which is of the same type (OpenAIAssistantDefinition) used to create and query an assistant.

The following table describes the purpose of the diagrammed methods on the OpenAIAssistantAgent.

| Method Name | Description |
|---|---|
| Create | Create a new assistant agent |
| ListDefinitions | List existing assistant definitions |
| Retrieve | Retrieve an existing assistant |
| CreateThread | Create an assistant thread |
| DeleteThread | Delete an assistant thread |
| AddChatMessage | Add a message to an assistant thread |
| GetThreadMessages | Retrieve all messages from an assistant thread |
| Delete | Delete the assistant agent's definition (puts agent into a terminal state) |
| Invoke | Invoke the assistant agent (no chat) |
| GetChannelKeys | Inherited from Agent |
| CreateChannel | Inherited from Agent |

Class Inventory

This section provides an overview / inventory of all the public surface area described in this ADR.

| Class Name | Description |
|---|---|
| OpenAIAssistantAgent | An Agent based on the OpenAI Assistant API |
| OpenAIAssistantChannel | An 'AgentChannel' for OpenAIAssistantAgent (associated with a thread-id). |
| OpenAIAssistantDefinition | All of the metadata / definition for an OpenAI Assistant. Unable to use the OpenAI API model due to implementation constraints (constructor not public). |
| OpenAIAssistantExecutionOptions | Options that affect the run, but defined globally for the agent/assistant. |
| OpenAIAssistantInvocationOptions | Options bound to a discrete run, used for direct (no chat) invocation. |
| OpenAIThreadCreationOptions | Options for creating a thread that take precedence over assistant definition, when specified. |
| OpenAIServiceConfiguration | Describes the service connection and is used to create the OpenAIClient |

Run Processing

The heart of supporting an assistant agent is creating and processing a Run.

A Run is effectively a discrete assistant interaction on a Thread (or conversation).

Run processing is implemented as internal logic within the OpenAI Agent Framework and is outlined here:

Initiate processing using:

  • agent -> OpenAIAssistantAgent
  • client -> AssistantClient
  • threadid -> string
  • options -> OpenAIAssistantInvocationOptions (optional)

Perform processing:

  • Verify agent not deleted

  • Define RunCreationOptions

  • Create the run (based on threadid and agent.Id)

  • Process the run:

    do

    • Poll run status until it is not queued, in-progress, or cancelling

    • Throw if run status is expired, failed, or cancelled

    • Query steps for run

    • if run status is requires-action

      • process function steps

      • post function results

    • foreach (step is completed)

      • if (step is tool-call) generate and yield tool content

      • else if (step is message) generate and yield message content

    while (run status is not completed)
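
The processing loop outlined above can be sketched roughly as follows. This is a language-neutral illustration written in Python, not the framework's .NET implementation. The `Step` type, the `ScriptedClient` test double, its three methods, and the `seen` set used to avoid yielding a step twice are all assumptions introduced for the example; status strings use underscores rather than the hyphenated names in the outline. Verifying the agent and creating the run are left out so the sketch covers only the do/while portion.

```python
import time
from dataclasses import dataclass, replace

# Status groups taken from the outline above.
POLLING = {"queued", "in_progress", "cancelling"}
TERMINAL_ERRORS = {"expired", "failed", "cancelled"}


@dataclass
class Step:
    """Hypothetical run step; not an OpenAI SDK type."""
    id: str
    type: str      # "tool_calls" or "message_creation"
    status: str    # e.g. "in_progress" or "completed"
    content: str = ""


def process_run(client, invoke_function, poll_interval=0.0):
    """Yield (kind, content) for each completed step, once per step."""
    seen = set()  # assumption: track yielded steps to avoid duplicates
    while True:
        # Poll until the run leaves the queued / in-progress / cancelling states.
        status = client.get_status()
        while status in POLLING:
            time.sleep(poll_interval)
            status = client.get_status()
        if status in TERMINAL_ERRORS:
            raise RuntimeError(f"Run ended with status: {status}")

        steps = client.get_steps()
        if status == "requires_action":
            # Process pending function steps and post their results.
            outputs = {
                s.id: invoke_function(s)
                for s in steps
                if s.type == "tool_calls" and s.status != "completed"
            }
            client.submit_tool_outputs(outputs)

        for step in steps:
            if step.status == "completed" and step.id not in seen:
                seen.add(step.id)
                kind = "tool" if step.type == "tool_calls" else "message"
                yield kind, step.content

        if status == "completed":
            break


class ScriptedClient:
    """Test double that replays a fixed sequence of run statuses."""

    def __init__(self):
        self._statuses = ["queued", "in_progress", "requires_action",
                          "in_progress", "completed"]
        self._tool = Step("step_1", "tool_calls", "in_progress")
        self._message = Step("step_2", "message_creation", "in_progress")
        self.submitted = {}

    def get_status(self):
        status = self._statuses.pop(0)
        if status == "completed":
            self._message.status = "completed"
            self._message.content = "It is sunny."
        return status

    def get_steps(self):
        # Return snapshots, as a remote service would.
        return [replace(self._tool), replace(self._message)]

    def submit_tool_outputs(self, outputs):
        self.submitted.update(outputs)
        self._tool.status = "completed"
        self._tool.content = outputs[self._tool.id]
```

Calling `list(process_run(ScriptedClient(), lambda step: "sunny"))` walks the scripted statuses, posts one function result, and yields the tool content followed by the message content.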

Vector Store Support

Vector Store support is required in order to enable usage of the file-search tool.

In alignment with V2 streaming of the FileClient, the caller may also directly target VectorStoreClient from the OpenAI SDK.

Definition / Options Classes

Specific configuration/options classes are introduced to support the ability to define assistant behavior at each of the supported articulation points (i.e. assistant, thread, & run).

| Class | Purpose |
|---|---|
| OpenAIAssistantDefinition | Definition of the assistant. Used when creating a new assistant, inspecting an assistant-agent instance, or querying assistant definitions. |
| OpenAIAssistantExecutionOptions | Options that affect run execution, defined within assistant scope. |
| OpenAIAssistantInvocationOptions | Run level options that take precedence over assistant definition, when specified. |
| OpenAIAssistantToolCallBehavior | Informs tool-call behavior for the associated scope: assistant or run. |
| OpenAIThreadCreationOptions | Thread scoped options that take precedence over assistant definition, when specified. |
| OpenAIServiceConfiguration | Informs which service to target, and how. |

Assistant Definition

The OpenAIAssistantDefinition was previously used only when enumerating a list of stored agents. It has evolved to also serve as input for creating an agent and is exposed as a discrete property on the OpenAIAssistantAgent instance.

This includes optional ExecutionOptions which define default run behavior. Since these execution options are not part of the remote assistant definition, they are persisted in the assistant metadata for when an existing agent is retrieved. OpenAIAssistantToolCallBehavior is included as part of the execution options and modeled in alignment with the ToolCallBehavior associated with AI Connectors.
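
One way to picture persisting execution options in assistant metadata is a simple serialize / restore round trip. The sketch below is an illustration only: the metadata key, the `ExecutionOptions` fields, and the JSON encoding are assumptions made for the example and are not taken from the framework.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass
from typing import Optional

# Hypothetical metadata key; the key the framework actually uses is not stated here.
EXECUTION_OPTIONS_KEY = "execution_options"


@dataclass
class ExecutionOptions:
    """Illustrative stand-in for OpenAIAssistantExecutionOptions (fields assumed)."""
    max_completion_tokens: Optional[int] = None
    parallel_tool_calls_enabled: Optional[bool] = None


def to_metadata(metadata: dict[str, str],
                options: Optional[ExecutionOptions]) -> dict[str, str]:
    """Return a copy of the metadata with the options stored as one string value."""
    merged = dict(metadata)
    if options is not None:
        merged[EXECUTION_OPTIONS_KEY] = json.dumps(asdict(options))
    return merged


def from_metadata(metadata: dict[str, str]) -> Optional[ExecutionOptions]:
    """Restore the options when an existing assistant is retrieved."""
    raw = metadata.get(EXECUTION_OPTIONS_KEY)
    return ExecutionOptions(**json.loads(raw)) if raw else None
```

Because the options travel inside the stored metadata, retrieving the assistant later yields the same default run behavior without any local state.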

Note: Manual function calling isn't currently supported for OpenAIAssistantAgent or AgentChat and is planned to be addressed as an enhancement. When this support is introduced, OpenAIAssistantToolCallBehavior will determine the function calling behavior (also in alignment with the ToolCallBehavior associated with AI Connectors).

Alternative (Future?)

A pending change has been authored that introduces FunctionChoiceBehavior as a property of the base / abstract PromptExecutionSettings. Once realized, it may make sense to evaluate integrating this pattern for OpenAIAssistantAgent. This may also imply an inheritance relationship with PromptExecutionSettings for both OpenAIAssistantExecutionOptions and OpenAIAssistantInvocationOptions (next section).

DECISION: Do not support tool_choice until the FunctionChoiceBehavior is realized.

Assistant Invocation Options

When invoking an OpenAIAssistantAgent directly (no-chat), definitions that apply only to a discrete run may be specified. These definitions are defined as OpenAIAssistantInvocationOptions and take precedence over any corresponding assistant or thread definition.

Note: These definitions are also impacted by the ToolCallBehavior / FunctionChoiceBehavior quandary.
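
The precedence rule (run over thread over assistant) amounts to taking the first scope that defines a value. The sketch below illustrates that idea with plain dictionaries; the option names `temperature` and `vector_store_id` and the `resolve_option` helper are hypothetical and chosen only for the example.

```python
from typing import Any, Mapping, Optional


def resolve_option(name: str, *scopes: Optional[Mapping[str, Any]]) -> Any:
    """Return the value from the first scope that defines `name`.

    Scopes are passed highest precedence first. The check is `is not None`
    rather than truthiness so that explicit values such as 0 or False win.
    """
    for scope in scopes:
        if scope is not None and scope.get(name) is not None:
            return scope[name]
    return None


# Hypothetical option values at each scope.
assistant = {"temperature": 0.7, "vector_store_id": "vs_assistant"}
thread = {"vector_store_id": "vs_thread"}
invocation = {"temperature": 0.0}

effective_temperature = resolve_option("temperature", invocation, thread, assistant)
effective_store = resolve_option("vector_store_id", invocation, thread, assistant)
```

Here the run-level temperature of 0.0 overrides the assistant's 0.7, while the vector store falls through to the thread scope because the invocation does not specify one.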

Thread Creation Options

When invoking an OpenAIAssistantAgent directly (no-chat), a thread must be explicitly managed. When doing so, thread specific options may be specified. These options are defined as OpenAIThreadCreationOptions and take precedence over any corresponding assistant definition.

Service Configuration

The OpenAIServiceConfiguration defines how to connect to a specific remote service, whether it be OpenAI, Azure, or a proxy. This eliminates the need to define multiple overloads for each call site that results in a connection to the remote API service (i.e. create a client).

Note: This was previously named OpenAIAssistantConfiguration, but is not necessarily assistant specific.
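
The benefit of a single configuration object can be illustrated with a small sketch: named factories capture how to reach each service, and every call site accepts the one type instead of a set of overloads. The class shape, factory names, and `connection_settings` helper below are assumptions for illustration and do not mirror the actual OpenAIServiceConfiguration members.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ServiceType(Enum):
    OPENAI = "openai"
    AZURE_OPENAI = "azure_openai"


@dataclass(frozen=True)
class ServiceConfiguration:
    """Illustrative stand-in for OpenAIServiceConfiguration (shape assumed)."""
    service_type: ServiceType
    api_key: str
    endpoint: Optional[str] = None  # optional for OpenAI (e.g. a proxy), required for Azure

    @classmethod
    def for_openai(cls, api_key: str, endpoint: Optional[str] = None) -> ServiceConfiguration:
        return cls(ServiceType.OPENAI, api_key, endpoint)

    @classmethod
    def for_azure_openai(cls, api_key: str, endpoint: str) -> ServiceConfiguration:
        if not endpoint:
            raise ValueError("An endpoint is required for Azure OpenAI.")
        return cls(ServiceType.AZURE_OPENAI, api_key, endpoint)


def connection_settings(config: ServiceConfiguration) -> dict[str, str]:
    """Single call site that handles every service type; no per-service overloads."""
    settings = {"service": config.service_type.value, "api_key": config.api_key}
    if config.endpoint is not None:
        settings["endpoint"] = config.endpoint
    return settings
```

Any method that needs a client can take one `ServiceConfiguration` parameter, so adding a new service type changes the configuration class rather than every call site.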