
@tarko/agent

Introduction

@tarko/agent is an event-stream driven meta agent framework designed for building efficient multimodal AI Agents. It provides complete Agent lifecycle management, tool integration, and multi-model support.

When to use?

This Agent SDK provides a low-level programmatic API, suitable for building AI agents from scratch:

  • MCP Agent: Connect to MCP servers and implement the standardized Model Context Protocol for tools
  • GUI Agent: Build graphical interface agents that handle user interactions
  • Custom Agents: Build specialized agents for specific domains like code generation, data analysis, etc.

Unlike high-level frameworks, @tarko/agent gives you complete control to customize Agent behavior.

Architecture Overview

mermaid
flowchart TD
    A[User Input] --> B[Agent.run]
    B --> C[Message History]
    C --> D[LLM Request]
    D --> E{Tool Calls?}
    E -->|Yes| F[Tool Execution]
    E -->|No| G[Generate Response]
    F --> H[Tool Results]
    H --> I{Max Iterations?}
    I -->|No| D
    I -->|Yes| G
    G --> J[Event Stream]
    
    J --> K[UI Updates]
    J --> L[Logging]
    J --> M[Monitoring]
    
    style A fill:#e1f5fe
    style G fill:#c8e6c9
    style F fill:#fff3e0
    style J fill:#f3e5f5
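The loop in the diagram above can be sketched in plain TypeScript. This is an illustrative model of the control flow only, not the framework's actual implementation: the agent alternates LLM requests and tool execution until the model stops requesting tools or `maxIterations` is reached (the `llm` and `tools` shapes here are simplified stand-ins).

```typescript
type ToolCall = { name: string; arguments: Record<string, unknown> };
type LLMReply = { content: string; toolCalls?: ToolCall[] };

// Simplified agent loop: ask the LLM, run any requested tools,
// feed results back into the history, repeat until a final answer.
async function runLoop(
  llm: (history: string[]) => Promise<LLMReply>,
  tools: Record<string, (args: Record<string, unknown>) => Promise<unknown>>,
  input: string,
  maxIterations = 10,
): Promise<string> {
  const history: string[] = [`user: ${input}`];
  for (let i = 0; i < maxIterations; i++) {
    const reply = await llm(history);
    // No tool calls requested: this is the final response.
    if (!reply.toolCalls?.length) return reply.content;
    for (const call of reply.toolCalls) {
      const result = await tools[call.name](call.arguments);
      history.push(`tool ${call.name}: ${JSON.stringify(result)}`);
    }
  }
  return '(max iterations reached)';
}
```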

Install

bash
npm install @tarko/agent

Core Features

  1. Tool Integration - Effortlessly create and call tools within agent responses, supporting complex multi-step workflows.
  2. Event-Stream Driven - Based on standard event stream protocols, real-time tracking of Agent state for efficient context and UI building.
  3. Native Streaming - Native streaming transmission lets you understand the Agent's thinking process and output results in real time.
  4. Multimodal Analysis - Automatically analyze multimodal tool results (images, text, files, etc.), letting you focus on business logic.
  5. Strong Extension Capabilities - Rich lifecycle hook design allows you to implement more advanced Agent behaviors.
  6. Multiple Model Providers - Supports OpenAI, Claude, Doubao and other models, with advanced configuration and runtime switching.
  7. Multiple Tool Call Engines - native (OpenAI-compatible function calling), prompt_engineering (prompt-based tool calling), and structured_outputs (JSON Schema structured outputs).

Quick Start

Create an index.ts file:

ts
import { Agent, Tool, z, LogLevel } from '@tarko/agent';

const locationTool = new Tool({
  id: 'getCurrentLocation',
  description: "Get user's current location",
  parameters: z.object({}),
  function: async () => {
    return { location: 'Boston' };
  },
});

const weatherTool = new Tool({
  id: 'getWeather',
  description: 'Get weather information for a specified location',
  parameters: z.object({
    location: z.string().describe('Location name, such as city name'),
  }),
  function: async (input) => {
    const { location } = input;
    return {
      location,
      temperature: '70°F (21°C)',
      condition: 'Sunny',
      precipitation: '10%',
      humidity: '45%',
      wind: '5 mph',
    };
  },
});

const agent = new Agent({
  model: {
    provider: 'openai',
    id: 'gpt-4o',
    apiKey: process.env.OPENAI_API_KEY!, // From environment variable
  },
  tools: [locationTool, weatherTool],
  instructions: 'You are a professional weather assistant capable of getting accurate location and weather information.',
  temperature: 0.7,
  maxIterations: 50,
});

async function main() {
  const response = await agent.run({
    input: "How's the weather today?",
  });
  console.log(response);
}

main();

Execute it:

bash
npx tsx index.ts

Output:

json
{
  "id": "5c38c0a1-ccbe-48f0-8b97-ae78a4d9407e",
  "type": "assistant_message",
  "timestamp": 1750188571248,
  "content": "The weather in Boston today is sunny with a temperature of 70°F (21°C). There's a 10% chance of precipitation, humidity is at 45%, and the wind is blowing at 5 mph.",
  "finishReason": "stop",
  "messageId": "msg_1750188570877_ics24k3x"
}

API

Agent

Define an Agent instance:

ts
const agent = new Agent({
  /* AgentOptions */
});

Agent Options

All options on the AgentOptions interface are optional:

Basic Configuration
  • id: Unique identifier for the agent instance (default: "@tarko/agent")
  • name: Agent name for tracking and logging (default: "Anonymous")
  • instructions: Agent system prompt, completely replaces default prompt (default: built-in intelligent assistant prompt)
Model Configuration
  • model: Model configuration object containing provider, id, apiKey, etc.
  • temperature: LLM temperature controlling output randomness (default: 0.7)
  • top_p: Nucleus sampling parameter controlling vocabulary selection diversity (default: model default)
  • maxTokens: Token limit per request (default: 1000)
  • thinking: Reasoning content control options
Tool Configuration
  • tools: Array of tools available to the agent
  • tool: Tool filtering options supporting include/exclude patterns
  • toolCallEngine: Tool call engine type (default: 'native')
    • 'native': OpenAI-compatible native function calling
    • 'prompt_engineering': Prompt-based tool calling
    • 'structured_outputs': JSON Schema structured outputs
Execution Control
  • maxIterations: Maximum number of iterations (default: 1000)
  • context: Context awareness options like maxImagesCount
Debug and Monitoring
  • logLevel: Log level (LogLevel.DEBUG, LogLevel.INFO, etc.)
  • metric: Performance metrics collection configuration
  • enableStreamingToolCallEvents: Enable streaming tool call events (default: false)
Advanced Options
  • workspace: Working directory for filesystem operations
  • sandboxUrl: Sandbox environment URL
  • eventStreamOptions: Event stream processor configuration
  • initialEvents: Array of events to restore during initialization
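Putting several of these together, here is a configuration sketch built from the options listed above. Field shapes follow the descriptions in this document; where the exact shape is unspecified (e.g. `context`), the values are illustrative.

```typescript
import { Agent, LogLevel } from '@tarko/agent';

const agent = new Agent({
  id: 'weather-agent',
  name: 'WeatherAgent',
  model: {
    provider: 'openai',
    id: 'gpt-4o',
    apiKey: process.env.OPENAI_API_KEY!,
  },
  temperature: 0.3,           // lower randomness for predictable output
  maxTokens: 2000,
  toolCallEngine: 'native',   // OpenAI-compatible function calling
  maxIterations: 20,          // bound the agent loop
  context: { maxImagesCount: 3 },
  logLevel: LogLevel.INFO,
  enableStreamingToolCallEvents: true,
});
```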

Tool

Define a Tool instance:

ts
import { Tool, z } from '@tarko/agent';

const locationTool = new Tool({
  id: 'getCurrentLocation',
  description: "Get user's current location",
  parameters: z.object({}),
  function: async () => {
    return { location: 'Boston' };
  },
});

Tool Options

  • id: Unique identifier for the tool
  • description: Description of what the tool does
  • parameters: Zod schema for tool parameters
  • function: Async function that implements the tool logic

Guide

Streaming Mode

Streaming mode allows you to monitor the Agent's execution process in real time, including thinking, tool calls, and response generation:

ts
async function main() {
  const stream = await agent.run({
    input: "How's the weather today?",
    stream: true,
  });

  for await (const event of stream) {
    switch (event.type) {
      case 'assistant_streaming_message':
        process.stdout.write(event.content); // Real-time output
        break;
      case 'tool_call':
        console.log(`Calling tool: ${event.name}`);
        break;
      case 'tool_result':
        console.log(`Tool result: ${event.elapsedMs}ms`);
        break;
      case 'assistant_message':
        console.log(`\nComplete response: ${event.content}`);
        break;
    }
  }
}

Main Event Types

  • user_message: User input message
  • agent_run_start: Agent starts execution
  • assistant_streaming_message: Real-time streaming message chunks
  • tool_call: Tool call starts
  • tool_result: Tool execution results
  • assistant_message: Complete assistant message
  • agent_run_end: Agent execution ends

Streaming mode is particularly suitable for building real-time UI interfaces, letting users see the Agent's "thinking process".

Event Types

AssistantMessage

ts
interface AssistantMessage {
  id: string;
  type: 'assistant_message';
  timestamp: number;
  content: string;
  toolCalls?: ChatCompletionMessageToolCall[];
  finishReason: 'stop' | 'tool_calls' | 'length';
  messageId: string;
}

ToolCall

ts
interface ToolCallEvent {
  id: string;
  type: 'tool_call';
  timestamp: number;
  toolCallId: string;
  name: string;
  arguments: Record<string, any>;
  startTime: number;
  tool: {
    name: string;
    description: string;
    schema: any;
  };
}

ToolResult

ts
interface ToolResult {
  id: string;
  type: 'tool_result';
  timestamp: number;
  toolCallId: string;
  name: string;
  content: any;
  elapsedMs: number;
}

StreamingMessage

ts
interface StreamingMessage {
  id: string;
  type: 'assistant_streaming_message';
  timestamp: number;
  content: string;
  isComplete: boolean;
  messageId: string;
}
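Streaming chunks share a `messageId`, so a consumer can assemble them into complete messages by keying on that field. A minimal sketch using the interface above (the accumulator function is an illustration, not a framework API):

```typescript
interface StreamingMessage {
  id: string;
  type: 'assistant_streaming_message';
  timestamp: number;
  content: string;
  isComplete: boolean;
  messageId: string;
}

// Concatenate chunk content per messageId; return only finished messages.
function assembleMessages(events: StreamingMessage[]): Map<string, string> {
  const buffers = new Map<string, string>();
  const complete = new Map<string, string>();
  for (const e of events) {
    const text = (buffers.get(e.messageId) ?? '') + e.content;
    buffers.set(e.messageId, text);
    if (e.isComplete) complete.set(e.messageId, text);
  }
  return complete;
}
```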

Utility Methods

Direct LLM Calls

Besides the complete Agent workflow, you can also directly call the currently configured LLM:

ts
// Non-streaming call
const response = await agent.callLLM({
  messages: [
    { role: 'user', content: 'Hello' }
  ],
  temperature: 0.5,
});

// Streaming call
const stream = await agent.callLLM({
  messages: [
    { role: 'user', content: 'Write a poem' }
  ],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}

Get Available Tools

ts
// Get all registered tools
const allTools = agent.getTools();

// Get tools processed through hooks
const availableTools = await agent.getAvailableTools();

console.log(`${availableTools.length} tools available`);

Generate Conversation Summary

ts
const summary = await agent.generateSummary({
  messages: [
    { role: 'user', content: "How's the weather today?" },
    { role: 'assistant', content: 'Today is sunny with 22°C temperature.' },
  ],
});

console.log(summary.summary); // "Weather Query"

Execution Control

ts
// Check Agent status
console.log(agent.status()); // 'idle' | 'running' | 'error'

// Get current iteration count
console.log(agent.getCurrentLoopIteration());

// Abort execution
if (agent.status() === 'running') {
  agent.abort();
}

// Resource cleanup
await agent.dispose();

Lifecycle Hooks

@tarko/agent provides rich hooks for customizing Agent behavior:

ts
class CustomAgent extends Agent {
  async onBeforeToolCall(sessionId, toolCall, args) {
    console.log(`Preparing to call tool: ${toolCall.name}`);
    // Can modify arguments
    return { ...args, timestamp: Date.now() };
  }
  
  async onAfterToolCall(sessionId, toolCall, result) {
    console.log(`Tool call completed: ${toolCall.name}`);
    // Can modify result
    return result;
  }
  
  async onLLMRequest(sessionId, payload) {
    console.log(`Sending LLM request: ${payload.messages.length} messages`);
  }
}

Best Practices

Choose the Right Tool Call Engine

ts
// For models supporting function calling (recommended)
const nativeAgent = new Agent({
  toolCallEngine: 'native', // OpenAI, Claude, etc.
});

// For models not supporting function calling
const promptAgent = new Agent({
  toolCallEngine: 'prompt_engineering', // Open source models
});

// For scenarios requiring strict structured output
const structuredAgent = new Agent({
  toolCallEngine: 'structured_outputs',
});

Tool Design Principles

ts
// ✅ Good tool design
const goodTool = new Tool({
  id: 'searchWeb',
  description: 'Search for information on the web and return relevant results',
  parameters: z.object({
    query: z.string().describe('Search keywords'),
    limit: z.number().default(5).describe('Number of results to return'),
  }),
  function: async ({ query, limit }) => {
    // Implement search logic
    return { results: [], total: 0 };
  },
});

// ❌ Avoid this tool design
const badTool = new Tool({
  id: 'doEverything', // Too broad functionality
  description: 'Do anything', // Unclear description
  parameters: z.object({
    input: z.any(), // Unclear parameter type
  }),
  function: async (input) => {
    // Logic too complex
  },
});

Performance Optimization

ts
const optimizedAgent = new Agent({
  // Limit context size
  context: {
    maxImagesCount: 3, // Avoid oversized context
  },
  
  // Reasonable iteration count
  maxIterations: 20, // Avoid infinite loops
  
  // Enable performance monitoring
  metric: {
    enable: true,
  },
  
  // Appropriate temperature setting
  temperature: 0.3, // Lower temperature recommended for production
});

Security Considerations

ts
// Tool filtering
const safeAgent = new Agent({
  tools: allTools, // pass the tool array directly, not wrapped in another array
  tool: {
    exclude: ['fileDelete', 'systemCommand'], // Exclude dangerous tools
  },
});

// Input validation
class SecureAgent extends Agent {
  async onBeforeToolCall(sessionId, toolCall, args) {
    // Validate tool parameters
    if (toolCall.name === 'fileRead' && args.path.includes('..')) {
      throw new Error('Path traversal attack detected');
    }
    return args;
  }
}

For more advanced usage patterns, see the Agent Hooks documentation.