Agent Node

The Agent node is the most fundamental node type in the DevAll platform, used to invoke Large Language Models (LLMs) for text generation, conversation, reasoning, and other tasks. It supports multiple model providers (OpenAI, Gemini, etc.) and can be configured with advanced features like tool calling, chain-of-thought, and memory.

Configuration

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `provider` | string | Yes | `openai` | Model provider name, e.g., `openai`, `gemini` |
| `name` | string | Yes | - | Model name, e.g., `gpt-4o`, `gemini-2.0-flash-001` |
| `role` | text | No | - | System prompt |
| `base_url` | string | No | Provider default | API endpoint URL; supports `${VAR}` placeholders |
| `api_key` | string | No | - | API key; using an environment variable placeholder such as `${API_KEY}` is recommended |
| `params` | dict | No | `{}` | Model call parameters (`temperature`, `top_p`, etc.) |
| `tooling` | object | No | - | Tool calling configuration, see Tooling Module |
| `thinking` | object | No | - | Thinking configuration, e.g., chain-of-thought, reflection |
| `memories` | list | No | `[]` | Memory binding configuration, see Memory Module |
| `skills` | object | No | - | Agent Skills discovery and built-in skill activation/file-read tools |
| `retry` | object | No | - | Automatic retry strategy configuration |
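
The `${VAR}` placeholders accepted by `base_url` and `api_key` can be pictured with a short sketch. This page does not spell out the resolution rules, so the code below is only one plausible reading: it assumes values are looked up in a mapping such as the process environment and that an undefined name is an error. `expand_placeholders` is a hypothetical helper, not a DevAll API.

```python
import re

# Matches ${NAME}, where NAME looks like an environment-variable identifier.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_placeholders(value: str, variables: dict) -> str:
    """Replace each ${NAME} in `value` with variables[NAME].

    Hypothetical helper for illustration. Raising on a missing name is an
    assumption, not documented DevAll behavior.
    """
    def _substitute(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"undefined placeholder: {name}")
        return variables[name]

    return _PLACEHOLDER.sub(_substitute, value)


# Example values are made up; in practice the mapping could be os.environ.
env = {"BASE_URL": "https://api.example.com/v1", "API_KEY": "sk-demo"}
print(expand_placeholders("${BASE_URL}/chat", env))
```

Keeping secrets in variables this way means the workflow file itself can be committed without exposing the key.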

Retry Strategy Configuration (retry)

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `enabled` | bool | `true` | Whether to enable automatic retry |
| `max_attempts` | int | `5` | Maximum number of attempts (including the first attempt) |
| `min_wait_seconds` | float | `1.0` | Minimum backoff wait time, in seconds |
| `max_wait_seconds` | float | `6.0` | Maximum backoff wait time, in seconds |
| `retry_on_status_codes` | list[int] | `[408, 409, 425, 429, 500, 502, 503, 504]` | HTTP status codes that trigger a retry |
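
To make the interaction of these fields concrete, here is a minimal sketch of a retry decision and wait schedule. The table gives only the bounds, not the backoff curve, so the doubling schedule below is an assumption, as are the names `RetryConfig`, `should_retry`, and `wait_after_attempt`; none of them are taken from the DevAll code base.

```python
from dataclasses import dataclass, field


@dataclass
class RetryConfig:
    """Mirrors the documented `retry` fields and their defaults."""
    enabled: bool = True
    max_attempts: int = 5
    min_wait_seconds: float = 1.0
    max_wait_seconds: float = 6.0
    retry_on_status_codes: list = field(
        default_factory=lambda: [408, 409, 425, 429, 500, 502, 503, 504]
    )


def should_retry(cfg: RetryConfig, attempt: int, status_code: int) -> bool:
    """Return True if failed attempt number `attempt` (1-based) may be retried."""
    if not cfg.enabled:
        return False
    # max_attempts includes the first attempt, so the last attempt is final.
    if attempt >= cfg.max_attempts:
        return False
    return status_code in cfg.retry_on_status_codes


def wait_after_attempt(cfg: RetryConfig, attempt: int) -> float:
    """Assumed schedule: min_wait doubles per attempt, capped at max_wait."""
    return min(cfg.max_wait_seconds, cfg.min_wait_seconds * 2 ** (attempt - 1))


cfg = RetryConfig()
print([wait_after_attempt(cfg, a) for a in range(1, cfg.max_attempts)])
```

Under this assumed schedule the defaults would wait 1, 2, 4, and 6 seconds between the five attempts; the real runtime may add jitter or use a different curve.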

Agent Skills Configuration (skills)

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `enabled` | bool | `false` | Enable Agent Skills discovery for this node |
| `allow` | list[object] | `[]` | Optional allowlist of skills from the project-level `.agents/skills/` directory; each entry uses `name` |

Agent Skills Notes

  • Skills are discovered from the fixed project-level .agents/skills/ directory.
  • The runtime exposes two built-in skill tools: activate_skill and read_skill_file.
  • read_skill_file only works after the relevant skill has been activated.
  • A skill's SKILL.md frontmatter may include an optional allowed-tools field in the Agent Skills spec format, for example allowed-tools: execute_code.
  • If a selected skill requires tools that are not bound on the node, that skill is skipped at runtime.
  • If no compatible skills remain, the agent is explicitly instructed not to claim skill usage.
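
The selection rules in these notes can be sketched as a small filter. This is an illustration only: `Skill` and `select_skills` are hypothetical names, the sketch treats a skill's allowed-tools as the tools it requires, and it assumes that an empty `allow` list places no restriction on skill names, which this page does not state.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    """Hypothetical record for one skill found under .agents/skills/."""
    name: str
    # Tools named in the skill's SKILL.md frontmatter (allowed-tools).
    allowed_tools: list = field(default_factory=list)


def select_skills(discovered, allow, bound_tools):
    """Return the skills that pass the allowlist and tool-compatibility checks.

    Assumption: an empty `allow` list does not restrict skill names.
    """
    allowed_names = set(allow)
    bound = set(bound_tools)
    selected = []
    for skill in discovered:
        if allowed_names and skill.name not in allowed_names:
            continue  # not on the node's allowlist
        if not set(skill.allowed_tools) <= bound:
            continue  # needs a tool that is not bound on the node: skipped
        selected.append(skill)
    return selected


discovered = [
    Skill("python-scratchpad", ["execute_code"]),
    Skill("rest-api-caller"),
]
# No tools are bound here, so the skill that needs execute_code is skipped.
print([s.name for s in select_skills(discovered, ["python-scratchpad", "rest-api-caller"], [])])
```

If the returned list is empty, the notes above say the agent is told not to claim skill usage.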

When to Use

  • Text generation: Writing, translation, summarization, Q&A, etc.
  • Intelligent conversation: Multi-turn dialogue, customer service bots
  • Tool calling: Enable the model to call external APIs or execute functions
  • Complex reasoning: Combine with the thinking configuration for multi-step reasoning
  • Knowledge retrieval: Combine with memories to implement retrieval-augmented generation (RAG) patterns

Examples

Basic Configuration

```yaml
nodes:
  - id: Writer
    type: agent
    config:
      provider: openai
      base_url: ${BASE_URL}
      api_key: ${API_KEY}
      name: gpt-4o
      role: |
        You are a professional technical documentation writer. Please answer questions in clear and concise language.
      params:
        temperature: 0.7
        max_tokens: 2000
```

Configuring Tool Calling

```yaml
nodes:
  - id: Assistant
    type: agent
    config:
      provider: openai
      name: gpt-4o
      api_key: ${API_KEY}
      tooling:
        type: function  # Tool type: function, mcp_remote, mcp_local
        config:
          tools:  # List of function tools from functions/function_calling/ directory
            - name: describe_available_files
            - name: load_file
          timeout: 20  # Optional: execution timeout (seconds)
```

Configuring MCP Tools (Remote HTTP)

```yaml
nodes:
  - id: MCP Agent
    type: agent
    config:
      provider: openai
      name: gpt-4o
      api_key: ${API_KEY}
      tooling:
        type: mcp_remote
        config:
          server: http://localhost:8080/mcp  # MCP server endpoint
          headers:  # Optional: custom request headers
            Authorization: Bearer ${MCP_TOKEN}
          timeout: 30  # Optional: request timeout (seconds)
```

Configuring MCP Tools (Local stdio)

```yaml
nodes:
  - id: Local MCP Agent
    type: agent
    config:
      provider: openai
      name: gpt-4o
      api_key: ${API_KEY}
      tooling:
        type: mcp_local
        config:
          command: uvx  # Launch command
          args: ["mcp-server-sqlite", "--db-path", "data.db"]
          cwd: ${WORKSPACE}  # Optional, usually not needed
          env:  # Optional, usually not needed
            DEBUG: "true"
          startup_timeout: 10  # Optional: startup timeout (seconds)
```

Gemini Multimodal Configuration

```yaml
nodes:
  - id: Vision Agent
    type: agent
    config:
      provider: gemini
      base_url: https://generativelanguage.googleapis.com
      api_key: ${GEMINI_API_KEY}
      name: gemini-2.5-flash-image
      role: You need to generate corresponding image content based on user input.
```

Configuring Retry Strategy

```yaml
nodes:
  - id: Robust Agent
    type: agent
    config:
      provider: openai
      name: gpt-4o
      api_key: ${API_KEY}
      retry:  # Retry is enabled by default; override only the fields you need
        enabled: true
        max_attempts: 3  # Includes the first attempt
        min_wait_seconds: 2.0
        max_wait_seconds: 10.0
```

Configuring Agent Skills

```yaml
nodes:
  - id: Skilled Agent
    type: agent
    config:
      provider: openai
      name: gpt-4o
      api_key: ${API_KEY}
      skills:
        enabled: true
        allow:
          - name: python-scratchpad
          - name: rest-api-caller
```