# @tarko/agent-interface

Standard protocol, types, event stream, and other specifications for `@tarko/agent`.

## Installation

```bash
npm install @tarko/agent-interface
```

## Overview

The `@tarko/agent-interface` package provides the core types, interfaces, and event stream specifications for building intelligent agents in the `@tarko/agent` framework. It serves as the foundation for agent communication, tool integration, and real-time event processing.
## IAgent

The core interface that all agent implementations must implement:
```typescript
import { IAgent, AgentOptions } from '@tarko/agent-interface';

class MyAgent implements IAgent {
  async initialize() {
    // Initialize your agent
  }

  async run(input: string) {
    // Execute agent logic
  }

  // ... other required methods
}
```
## AgentOptions

Comprehensive configuration options for agent behavior:
```typescript
import { AgentOptions, LogLevel } from '@tarko/agent-interface';
import { z } from 'zod';

const options: AgentOptions = {
  // Base configuration
  id: 'my-agent',
  name: 'My Custom Agent',
  instructions: 'You are a helpful assistant...',

  // Model configuration
  model: {
    provider: 'openai',
    id: 'gpt-4',
  },
  maxTokens: 4096,
  temperature: 0.7,

  // Tool configuration
  tools: [
    {
      name: 'calculator',
      description: 'Perform mathematical calculations',
      schema: z.object({
        expression: z.string(),
      }),
      function: async (args) => {
        // Tool implementation
      },
    },
  ],

  // Loop control
  maxIterations: 50,

  // Memory and context
  context: {
    maxImagesCount: 10,
  },
  eventStreamOptions: {
    maxEvents: 1000,
    autoTrim: true,
  },

  // Logging
  logLevel: LogLevel.INFO,
};
```
## Tool Definition

Define and register tools for agent capabilities:
```typescript
import { Tool } from '@tarko/agent-interface';
import { z } from 'zod';

const weatherTool: Tool = {
  name: 'get_weather',
  description: 'Get current weather for a location',
  schema: z.object({
    location: z.string().describe('The city and state/country'),
    unit: z.enum(['celsius', 'fahrenheit']).default('celsius'),
  }),
  function: async ({ location, unit }) => {
    // Fetch weather data
    return {
      location,
      temperature: 22,
      unit,
      condition: 'sunny',
    };
  },
};
```
## Event Stream

The event stream system provides real-time visibility into agent execution, conversation flow, and internal reasoning processes. It's designed for both monitoring and building reactive user interfaces.
The event stream supports various categories of events:
- `user_message` - User input to the agent
- `assistant_message` - Agent's response
- `assistant_thinking_message` - Agent's reasoning process
- `assistant_streaming_message` - Real-time content updates
- `assistant_streaming_thinking_message` - Real-time reasoning updates
- `final_answer_streaming` - Streaming final answers
- `tool_call` - Tool invocation
- `tool_result` - Tool execution result
- `plan_start` - Beginning of a planning session
- `plan_update` - Plan state changes
- `plan_finish` - Completion of a plan
- `system` - Logs, warnings, errors
- `agent_run_start` - Agent execution start
- `agent_run_end` - Agent execution completion
- `environment_input` - External context injection
- `final_answer` - Structured final response

### Using the Event Stream

```typescript
import { AgentEventStream } from '@tarko/agent-interface';

// Get the event stream from your agent
const eventStream = agent.getEventStream();

// Subscribe to all events
const unsubscribe = eventStream.subscribe((event) => {
  console.log('Event:', event.type, event);
});

// Subscribe to specific event types
const unsubscribeSpecific = eventStream.subscribeToTypes(
  ['assistant_message', 'tool_call'],
  (event) => {
    console.log('Specific event:', event);
  }
);

// Subscribe to streaming events only
const unsubscribeStreaming = eventStream.subscribeToStreamingEvents((event) => {
  if (event.type === 'assistant_streaming_message') {
    process.stdout.write(event.content);
  }
});

// Create custom events
const customEvent = eventStream.createEvent('user_message', {
  content: 'Hello, agent!',
});

// Send events manually
eventStream.sendEvent(customEvent);

// Query historical events
const recentEvents = eventStream.getEvents(['assistant_message'], 10);
const toolEvents = eventStream.getEventsByType(['tool_call', 'tool_result']);

// Get recent tool results
const toolResults = eventStream.getLatestToolResults();
```
### Streaming Mode

```typescript
// Run agent in streaming mode
const streamingEvents = await agent.run({
  input: 'Analyze the weather data and create a report',
  stream: true,
});

// Process streaming events
for await (const event of streamingEvents) {
  switch (event.type) {
    case 'assistant_streaming_message':
      // Update UI with incremental content
      updateMessageUI(event.messageId, event.content);
      break;
    case 'assistant_streaming_thinking_message':
      // Show reasoning process
      updateThinkingUI(event.content);
      break;
    case 'tool_call':
      // Show the tool being executed
      showToolExecution(event.name, event.arguments);
      break;
    case 'final_answer':
      // Display the final structured answer
      showFinalAnswer(event.content, event.format);
      break;
  }
}
```
### Custom Events

Extend the event system with custom event types:
```typescript
import { AgentEventStream } from '@tarko/agent-interface';

// Define a custom event interface
interface MyCustomEventInterface extends AgentEventStream.BaseEvent {
  type: 'custom_analysis';
  analysisType: 'sentiment' | 'classification';
  confidence: number;
  result: any;
}

// Extend the event mapping through module augmentation
declare module '@tarko/agent-interface' {
  namespace AgentEventStream {
    interface ExtendedEventMapping {
      custom_analysis: MyCustomEventInterface;
    }
  }
}

// Now you can use the custom event type
const customEvent = eventStream.createEvent('custom_analysis', {
  analysisType: 'sentiment',
  confidence: 0.95,
  result: { sentiment: 'positive', score: 0.85 },
});

// Type-safe subscription
eventStream.subscribeToTypes(['custom_analysis'], (event) => {
  // TypeScript knows this is MyCustomEventInterface
  console.log('Analysis result:', event.result);
});
```
### Event Stream Configuration

Configure event stream behavior:
```typescript
const eventStreamOptions: AgentEventStream.ProcessorOptions = {
  maxEvents: 1000, // Keep the last 1000 events in memory
  autoTrim: true, // Automatically remove old events
};

const agentOptions: AgentOptions = {
  eventStreamOptions,
  // ... other options
};
```
## Running the Agent

```typescript
// Simple string input
const response = await agent.run('What is the weather in New York?');
console.log(response.content);
```

```typescript
// Object-based options with configuration
const response = await agent.run({
  input: [
    { type: 'text', text: 'Analyze this image:' },
    { type: 'image_url', image_url: { url: 'data:image/jpeg;base64,...' } },
  ],
  model: 'gpt-4-vision-preview',
  provider: 'openai',
  sessionId: 'conversation-123',
  toolCallEngine: 'native',
});
```

```typescript
// Enable streaming for real-time updates
const events = await agent.run({
  input: 'Create a detailed analysis report',
  stream: true,
});

for await (const event of events) {
  // Handle streaming events
}
```
## Tool Call Engines

Configure how tools are executed.

### Native

Uses the LLM's built-in function calling capabilities:

```typescript
const options: AgentOptions = {
  toolCallEngine: 'native',
  tools: [weatherTool],
};
```
### Prompt Engineering

Uses prompt-based tool calling for models without native function calling:

```typescript
const options: AgentOptions = {
  toolCallEngine: 'prompt_engineering',
  tools: [weatherTool],
};
```
### Structured Outputs

Uses structured JSON outputs for tool calling:

```typescript
const options: AgentOptions = {
  toolCallEngine: 'structured_outputs',
  tools: [weatherTool],
};
```
## Agent Hooks

Implement hooks to customize agent behavior:
```typescript
import {
  IAgent,
  LLMRequestHookPayload,
  LLMResponseHookPayload,
} from '@tarko/agent-interface';

class CustomAgent implements IAgent {
  // Called before each LLM request
  async onLLMRequest(sessionId: string, payload: LLMRequestHookPayload) {
    console.log('Sending request to:', payload.provider);
  }

  // Called after each LLM response
  async onLLMResponse(sessionId: string, payload: LLMResponseHookPayload) {
    console.log('Received response from:', payload.provider);
  }

  // Called before tool execution
  async onBeforeToolCall(sessionId: string, toolCall: any, args: any) {
    console.log('Executing tool:', toolCall.name);
    return args; // Can modify arguments
  }

  // Called after tool execution
  async onAfterToolCall(sessionId: string, toolCall: any, result: any) {
    console.log('Tool result:', result);
    return result; // Can modify the result
  }

  // Called before loop termination
  async onBeforeLoopTermination(sessionId: string, finalEvent: any) {
    // Decide whether to continue or finish
    return { finished: true };
  }

  // Called at the start of each iteration
  async onEachAgentLoopStart(sessionId: string) {
    console.log('Starting new iteration');
  }

  // Called when the agent loop ends
  async onAgentLoopEnd(sessionId: string) {
    console.log('Agent execution completed');
  }
}
```
## Context Management

Configure how the agent manages conversation context:
```typescript
import { AgentContextAwarenessOptions, AgentOptions } from '@tarko/agent-interface';

const contextOptions: AgentContextAwarenessOptions = {
  maxImagesCount: 5, // Limit images in context to prevent token overflow
};

const agentOptions: AgentOptions = {
  context: contextOptions,
};
```
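To see what a `maxImagesCount` policy means in practice, here is a self-contained sketch (not the framework's implementation) that keeps only the most recent N images in a multimodal content list, replacing older ones with a text placeholder so the surrounding conversation stays intact:

```typescript
// Simplified multimodal content part (mirrors the input shape used with agent.run).
type ContentPart =
  | { type: 'text'; text: string }
  | { type: 'image_url'; image_url: { url: string } };

// Keep only the last `maxImagesCount` images; replace older ones with text stubs.
function trimImages(parts: ContentPart[], maxImagesCount: number): ContentPart[] {
  // Indexes of all image parts, in order of appearance
  const imageIndexes = parts
    .map((part, i) => (part.type === 'image_url' ? i : -1))
    .filter((i) => i >= 0);

  // Everything except the newest `maxImagesCount` images gets replaced
  const excess = Math.max(0, imageIndexes.length - maxImagesCount);
  const toReplace = new Set(imageIndexes.slice(0, excess));

  return parts.map((part, i) =>
    toReplace.has(i) ? { type: 'text', text: '[image omitted]' } : part
  );
}
```

The framework's own trimming may differ in details (e.g. how placeholders are worded), but the effect is the same: bounding the image count bounds the token cost of multimodal history.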
## Error Handling

Handle tool execution errors:
```typescript
class RobustAgent implements IAgent {
  async onToolCallError(sessionId: string, toolCall: any, error: any) {
    console.error('Tool execution failed:', error);

    // Return a recovery value or re-throw
    return {
      error: true,
      message: 'Tool execution failed, please try again',
    };
  }
}
```
## TypeScript Support

The package is fully typed, so event payloads, agent options, and hook signatures are all checked at compile time.
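In particular, events form a discriminated union on their `type` field, so TypeScript narrows each case in a `switch`. The self-contained sketch below uses simplified, hypothetical stand-ins for a few of the event shapes listed above to demonstrate the narrowing pattern; the real types live in `AgentEventStream`:

```typescript
// Illustrative stand-ins for a few event shapes (the real package types are richer).
interface UserMessageEvent {
  type: 'user_message';
  content: string;
}

interface ToolCallEvent {
  type: 'tool_call';
  name: string;
  arguments: Record<string, unknown>;
}

interface FinalAnswerEvent {
  type: 'final_answer';
  content: string;
  format: 'text' | 'json';
}

type AgentEvent = UserMessageEvent | ToolCallEvent | FinalAnswerEvent;

// The `type` discriminant narrows `event` inside each case, so
// accessing case-specific fields like `name` or `format` is type-checked.
function describeEvent(event: AgentEvent): string {
  switch (event.type) {
    case 'user_message':
      return `user said: ${event.content}`;
    case 'tool_call':
      return `calling tool: ${event.name}`;
    case 'final_answer':
      return `final (${event.format}): ${event.content}`;
  }
}
```

The same pattern applies to the real event union returned by `agent.run({ stream: true })`.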
## Best Practices

- Set `maxEvents` limits to prevent memory leaks
- Set `maxImagesCount` for multimodal conversations

## License

Apache-2.0