AIRI Minecraft Service

This workspace runs AIRI's dedicated Minecraft bot. It connects a Mineflayer runtime to a Minecraft server, loads the cognitive stack in src/cognitive, and bridges status, context, and command traffic back to AIRI so the Stage settings shell can observe the service.

Deprecation Notice

This service is on a deprecation path. The current Mineflayer-based bot is expected to be replaced by a runtime based on a Fabric mod, which will become the primary Minecraft integration surface going forward.

Use this service for current local development and maintenance, but avoid building new long-term features around the Mineflayer runtime unless they are also part of the migration plan.

Safety Notice

Do not connect this bot to public servers you do not trust.

The runtime can execute JavaScript-generated action plans to control the bot. Those scripts run in an isolated environment, but they still drive a real local process with access to your Minecraft session, local network reachability, and other machine-side resources. A malicious or hostile server can still trigger unwanted actions or damage your system.

Treat this service as a local-development and trusted-server tool only.

Setup

  1. Install workspace dependencies from the repo root:

    ```bash
    pnpm i
    ```

  2. Copy the template:

    ```bash
    cp services/minecraft/.env services/minecraft/.env.local
    ```

  3. Edit services/minecraft/.env.local.

  4. Start the service:

    ```bash
    pnpm -F @proj-airi/minecraft-bot dev
    ```

    Or, from services/minecraft/:

    ```bash
    pnpm dev
    ```

  5. The bot should automatically connect to both AIRI and the Minecraft server.

Cognitive Architecture

AIRI's Minecraft agent is built on a four-layered cognitive architecture inspired by cognitive science, enabling reactive, conscious, and physically grounded behaviors.

Architecture Overview

```mermaid
graph TB
    subgraph "Layer A: Perception"
        Events[Raw Events]
        EM[Event Manager]
        Events --> EM
    end

    subgraph "Layer B: Reflex (Subconscious)"
        RM[Reflex Manager]
        FSM[State Machine]
        RM --> FSM
    end

    subgraph "Layer C: Conscious (Reasoning)"
        ORC[Orchestrator]
        Planner["Planning Agent (LLM)"]
        Chat["Chat Agent (LLM)"]
        ORC --> Planner
        ORC --> Chat
    end

    subgraph "Layer D: Action (Execution)"
        TE[Task Executor]
        AA[Action Agent]
        Planner -->|Plan| TE
        TE -->|Action Steps| AA
    end

    EM -->|High Priority| RM
    EM -->|All Events| ORC
    RM -.->|Inhibition Signal| ORC
    ORC -->|Execution Request| TE

    style EM fill:#e1f5ff
    style RM fill:#fff4e1
    style ORC fill:#ffe1f5
    style TE fill:#dcedc8
```

Layer A: Perception

Location: src/cognitive/perception/

The perception layer acts as the sensory input hub, collecting raw Mineflayer signals and translating them into typed events/signals through an event registry + rule engine pipeline.

Pipeline:

  • Event definitions in events/definitions/* bind Mineflayer events to normalized raw events.
  • EventRegistry emits raw:<modality>:<kind> events to the cognitive event bus.
  • RuleEngine evaluates YAML rules and emits derived signal:* events consumed by Reflex/Conscious layers.
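The pipeline above can be pictured with a minimal, self-contained sketch. Everything in it is an assumption made for illustration: `TinyEventBus`, `rawEventName`, `RuleSketch`, and the `heard`/`chat` names are hypothetical and are not taken from the service's actual `EventRegistry` or `RuleEngine`.

```typescript
// Hypothetical sketch: none of these names come from the real service code.
type BusHandler = (payload: Record<string, unknown>) => void

class TinyEventBus {
  private handlers = new Map<string, BusHandler[]>()

  on(name: string, handler: BusHandler): void {
    const list = this.handlers.get(name) ?? []
    list.push(handler)
    this.handlers.set(name, list)
  }

  emit(name: string, payload: Record<string, unknown>): void {
    for (const handler of this.handlers.get(name) ?? []) handler(payload)
  }
}

// Builds a name in the `raw:<modality>:<kind>` form described above.
function rawEventName(modality: string, kind: string): string {
  return `raw:${modality}:${kind}`
}

// A rule as it might look after being loaded from YAML (shape is assumed).
interface RuleSketch {
  when: string // raw event name to listen for
  where: Record<string, unknown> // payload fields that must match exactly
  emit: string // derived `signal:*` event name
}

// Subscribes each rule to its raw event and re-emits a derived signal on match.
function attachRules(bus: TinyEventBus, rules: RuleSketch[]): void {
  for (const rule of rules) {
    bus.on(rule.when, (payload) => {
      const matches = Object.entries(rule.where).every(([key, value]) => payload[key] === value)
      if (matches) bus.emit(rule.emit, payload)
    })
  }
}

const perceptionBus = new TinyEventBus()
const greetedPlayers: string[] = []
perceptionBus.on('signal:player_greeted', (payload) => {
  greetedPlayers.push(String(payload.username))
})
attachRules(perceptionBus, [
  { when: rawEventName('heard', 'chat'), where: { message: 'hello' }, emit: 'signal:player_greeted' },
])

perceptionBus.emit(rawEventName('heard', 'chat'), { username: 'alex', message: 'hello' })
perceptionBus.emit(rawEventName('heard', 'chat'), { username: 'sam', message: 'bye' })
```

Only the first chat message satisfies the rule, so only one derived signal reaches the subscriber. Downstream layers would subscribe to `signal:*` names rather than to raw Mineflayer events.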

Key files:

  • events/index.ts
  • events/definitions/*
  • rules/engine.ts
  • rules/*.yaml
  • pipeline.ts

Layer B: Reflex

Location: src/cognitive/reflex/

The reflex layer handles immediate, instinctive reactions. It follows a finite state machine (FSM) pattern so that responses stay fast and predictable.

Components:

  • Reflex Manager (reflex-manager.ts): Coordinates reflex behaviors
  • Inhibition: Reflexes can inhibit Conscious layer processing to prevent redundant responses.
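As a rough illustration of the FSM and inhibition ideas, here is a minimal sketch. The states, signal names, and the `ReflexSketch` class are hypothetical; the real `reflex-manager.ts` may be structured quite differently.

```typescript
// Hypothetical sketch: states, signals, and method names are illustrative only.
type ReflexState = 'idle' | 'alert' | 'fleeing'
type ReflexSignal = 'signal:threat_spotted' | 'signal:threat_close' | 'signal:threat_gone'

// Transition table: current state -> incoming signal -> next state.
const reflexTransitions: Record<ReflexState, Partial<Record<ReflexSignal, ReflexState>>> = {
  idle: { 'signal:threat_spotted': 'alert' },
  alert: { 'signal:threat_close': 'fleeing', 'signal:threat_gone': 'idle' },
  fleeing: { 'signal:threat_gone': 'idle' },
}

class ReflexSketch {
  state: ReflexState = 'idle'

  // Returns true when a transition fired, false when the signal was ignored.
  handle(signal: ReflexSignal): boolean {
    const next = reflexTransitions[this.state][signal]
    if (next === undefined) return false
    this.state = next
    return true
  }

  // While a reflex is active, the conscious layer should skip its LLM turn.
  inhibitsConscious(): boolean {
    return this.state !== 'idle'
  }
}

const reflex = new ReflexSketch()
const incomingSignals: ReflexSignal[] = ['signal:threat_spotted', 'signal:threat_close', 'signal:threat_gone']
const forwardedToConscious: string[] = []

for (const signal of incomingSignals) {
  reflex.handle(signal)
  // Forward to the conscious layer only when no reflex is in control.
  if (!reflex.inhibitsConscious()) forwardedToConscious.push(signal)
}
```

In this run the first two signals are absorbed by the reflex, and only the final one is forwarded once the machine returns to `idle`. A table-driven FSM keeps each reaction a constant-time lookup.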

Layer C: Conscious

Location: src/cognitive/conscious/

The conscious layer handles complex reasoning, planning, and high-level decision-making. No physical execution happens in this layer; acting on the world is left to the Action layer.

Components:

  • Brain (brain.ts): Event queue orchestration, LLM turn lifecycle, safety/budget guards, debug REPL integration.
  • JavaScript Planner (js-planner.ts): Sandboxed planning/runtime execution against exposed tools/globals.
  • Query Runtime (query-dsl.ts): Read-only world/inventory/entity query helpers for planner scripts.
  • Task State (task-state.ts): Cancellation token and task lifecycle primitives used by action execution.
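The interplay of an event queue, a turn budget, and a cancellation token can be sketched as follows. This is an assumption-laden illustration: `CancelTokenSketch`, `TurnBudgetSketch`, and `drainQueue` are hypothetical names and do not mirror the real `brain.ts` or `task-state.ts`.

```typescript
// Hypothetical sketch: class and function names are illustrative only.
class CancelTokenSketch {
  private cancelled = false

  cancel(): void {
    this.cancelled = true
  }

  isCancelled(): boolean {
    return this.cancelled
  }
}

// A simple budget guard that caps how many LLM turns may run.
class TurnBudgetSketch {
  private used = 0
  private maxTurns: number

  constructor(maxTurns: number) {
    this.maxTurns = maxTurns
  }

  tryConsume(): boolean {
    if (this.used >= this.maxTurns) return false
    this.used += 1
    return true
  }
}

interface QueuedEventSketch {
  name: string
}

// Processes one queued event per turn; stops on cancellation or when the budget is spent.
function drainQueue(
  queue: QueuedEventSketch[],
  budget: TurnBudgetSketch,
  token: CancelTokenSketch,
  onTurn: (event: QueuedEventSketch) => void,
): number {
  let processed = 0
  while (queue.length > 0) {
    if (token.isCancelled() || !budget.tryConsume()) break
    const event = queue.shift()
    if (event === undefined) break
    onTurn(event)
    processed += 1
  }
  return processed
}

const pendingEvents: QueuedEventSketch[] = [
  { name: 'signal:player_greeted' },
  { name: 'signal:inventory_changed' },
  { name: 'signal:night_falls' },
]
const handledTurns: string[] = []
const processedCount = drainQueue(
  pendingEvents,
  new TurnBudgetSketch(2),
  new CancelTokenSketch(),
  event => handledTurns.push(event.name),
)
```

With a budget of two turns, the third event stays in the queue for a later pass. Checking the token before each turn lets a cancelled task stop between turns rather than mid-turn.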

Layer D: Action

Location: src/cognitive/action/

The action layer is responsible for the actual execution of tasks in the world. It isolates "Doing" from "Thinking".

Components:

  • Task Executor (task-executor.ts): Runs normalized action instructions and emits action lifecycle events.
  • Action Registry (action-registry.ts): Validates params and dispatches tool calls.
  • Tool Catalog (llm-actions.ts): Action/tool definitions and schemas bound to mineflayer skills.
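A registry that validates parameters before dispatching a tool call might look roughly like this. The sketch uses hand-written validators to stay dependency-free; `ActionRegistrySketch`, `ToolSketch`, and the `collectBlock` tool are hypothetical and not the actual contents of `action-registry.ts` or `llm-actions.ts`.

```typescript
// Hypothetical sketch: tool names and the validator shape are assumptions.
type ValidationSketch<T> = { ok: true, value: T } | { ok: false, error: string }

interface ToolSketch<T> {
  name: string
  validate: (params: unknown) => ValidationSketch<T>
  run: (params: T) => string
}

class ActionRegistrySketch {
  // `any` lets the map hold tools with different parameter types.
  private tools = new Map<string, ToolSketch<any>>()

  register<T>(tool: ToolSketch<T>): void {
    this.tools.set(tool.name, tool)
  }

  // Validates untrusted params first, then runs the tool with the typed value.
  dispatch(name: string, params: unknown): string {
    const tool = this.tools.get(name)
    if (!tool) throw new Error(`Unknown tool: ${name}`)
    const result = tool.validate(params)
    if (!result.ok) throw new Error(`Invalid params for ${name}: ${result.error}`)
    return tool.run(result.value)
  }
}

const collectBlockTool: ToolSketch<{ block: string, count: number }> = {
  name: 'collectBlock',
  validate: (params) => {
    if (typeof params !== 'object' || params === null)
      return { ok: false, error: 'params must be an object' }
    const { block, count } = params as Record<string, unknown>
    if (typeof block !== 'string')
      return { ok: false, error: 'block must be a string' }
    if (typeof count !== 'number' || !Number.isInteger(count) || count <= 0)
      return { ok: false, error: 'count must be a positive integer' }
    return { ok: true, value: { block, count } }
  },
  // A real tool would call a Mineflayer skill here; this one only reports.
  run: ({ block, count }) => `collected ${count} ${block}`,
}

const registrySketch = new ActionRegistrySketch()
registrySketch.register(collectBlockTool)
const dispatchResult = registrySketch.dispatch('collectBlock', { block: 'oak_log', count: 3 })
```

Validating at the registry boundary means planner-generated calls with malformed parameters fail before any bot action starts.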

Event Flow Example

Scenario: "Build a house"

```txt
Player: "build a house"
  ↓
[Perception] Event detected
  ↓
[Conscious] Architect plans the structure
  ↓
[Action] Executor takes the plan and manages the construction loop:
    - Step 1: Collect wood (calls ActionRegistry tool)
    - Step 2: Craft planks
    - Step 3: Build walls
  ↓
[Conscious] Brain confirms completion: "House is ready!"
```
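
The construction loop in this scenario can be approximated by a small executor that runs plan steps in order and emits lifecycle events around each one. The step shapes, tool names, and `action:*` event names below are assumptions for illustration, and the sketch is synchronous for brevity where real tool calls would be asynchronous.

```typescript
// Hypothetical sketch: step shapes and lifecycle event names are assumptions.
interface PlanStepSketch {
  tool: string
  params: Record<string, unknown>
}

type LifecycleEventSketch =
  | { type: 'action:started', tool: string }
  | { type: 'action:completed', tool: string }
  | { type: 'action:failed', tool: string, error: string }

// Runs steps in order and reports progress through `emit`.
// Returns true when every step completed, false at the first failure.
function executePlanSketch(
  steps: PlanStepSketch[],
  runTool: (step: PlanStepSketch) => void,
  emit: (event: LifecycleEventSketch) => void,
): boolean {
  for (const step of steps) {
    emit({ type: 'action:started', tool: step.tool })
    try {
      runTool(step)
    }
    catch (error) {
      const message = error instanceof Error ? error.message : String(error)
      emit({ type: 'action:failed', tool: step.tool, error: message })
      return false // stop at the first failing step
    }
    emit({ type: 'action:completed', tool: step.tool })
  }
  return true
}

const housePlan: PlanStepSketch[] = [
  { tool: 'collectBlock', params: { block: 'oak_log', count: 16 } },
  { tool: 'craftRecipe', params: { item: 'oak_planks', count: 64 } },
  { tool: 'buildWalls', params: { height: 3 } },
]

const lifecycleLog: string[] = []
const planSucceeded = executePlanSketch(
  housePlan,
  () => {}, // stand-in for a real tool dispatch
  (event) => {
    lifecycleLog.push(`${event.type}:${event.tool}`)
  },
)
```

Each step produces a started event followed by either a completed or a failed event, which gives the conscious layer enough information to confirm completion or replan.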

Project Structure

```txt
src/
├── airi/                      # AIRI bridge, module shell, status publishing
├── cognitive/                 # 🧠 Perception → Reflex → Conscious → Action
│   ├── perception/            # Event definitions + rule evaluation
│   │   ├── events/
│   │   │   ├── index.ts
│   │   │   └── definitions/*
│   │   ├── rules/
│   │   │   ├── *.yaml
│   │   │   ├── engine.ts
│   │   │   ├── loader.ts
│   │   │   └── matcher.ts
│   │   └── pipeline.ts
│   ├── reflex/                # Fast, rule-based reactions
│   │   ├── reflex-manager.ts
│   │   ├── runtime.ts
│   │   ├── context.ts
│   │   └── behaviors/idle-gaze.ts
│   ├── conscious/             # LLM-powered reasoning
│   │   ├── brain.ts           # Core reasoning loop/orchestration
│   │   ├── js-planner.ts      # JS planning sandbox
│   │   ├── query-dsl.ts       # Read-only query runtime
│   │   ├── llm-log.ts         # Turn/log query helpers
│   │   ├── task-state.ts      # Task lifecycle enums/helpers
│   │   └── prompts/           # Prompt definitions (e.g., brain-prompt.ts)
│   ├── action/                # Task execution layer
│   │   ├── task-executor.ts   # Executes actions and emits lifecycle events
│   │   ├── action-registry.ts # Tool dispatch + schema validation
│   │   ├── llm-actions.ts     # Tool catalog
│   │   └── types.ts
│   ├── event-bus.ts           # Event bus core
│   ├── container.ts           # Dependency injection wiring
│   ├── index.ts               # Cognitive system entrypoint
│   └── types.ts               # Shared cognitive types
├── composables/
│   ├── config.ts              # Environment schema + defaults
│   ├── runtime-config.ts      # Persisted local runtime config
│   └── bot.ts
├── debug/                     # Debug dashboard, MCP REPL, viewer integration
├── libs/
│   └── mineflayer/            # Mineflayer bot wrapper/adapters
├── skills/                    # Atomic bot capabilities
├── plugins/                   # Mineflayer/bot plugins
├── utils/                     # Helpers
├── minecraft-bot-runtime.ts   # Bot lifecycle wrapper for reconnect/reconfigure
└── main.ts                    # Bot entrypoint
```

Design Principles

  1. Separation of Concerns: Each layer has a distinct responsibility
  2. Event-Driven: Loose coupling via centralized event system
  3. Inhibition Control: Reflexes prevent unnecessary LLM calls
  4. Extensibility: Easy to add new reflexes or conscious behaviors
  5. Cognitive Realism: Mimics human-like perception → reaction → deliberation

Future Enhancements

  • Perception Layer:

    • ⏱️ Temporal context window (remember recent events)
    • 🎯 Salience detection (filter noise, prioritize important events)
  • Reflex Layer:

    • 🏃 Dodge hostile mobs
    • 🛡️ Emergency combat responses
  • Conscious Layer:

    • 💭 Emotional state management
    • 🧠 Long-term memory integration
    • 🎭 Personality-driven responses

🛠️ Development

Commands

  • pnpm dev - Start the bot in development mode
  • pnpm lint - Run ESLint
  • pnpm typecheck - Run TypeScript type checking
  • pnpm test - Run tests

🙏 Acknowledgements

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.