Eliza roadmap

Direction and rationale for Eliza on elizaOS. Not exhaustive; see the Changelog for shipped changes with WHYs.

Principles: energy and experience (desktop)

Goal: Beat the felt cost of always-on dev shells on laptop battery while staying more visually distinctive than a flat editor surface—better UX and DX, not a spec-sheet stunt.

  • Honest comparison: Cursor (and similar) ships a large persistent surface: Chromium/Electron-style shell, editor, extensions, LSP, indexing, often multiple web contexts. Eliza’s desktop UI is narrower and task-shaped: companion, chat, settings, and bridges—on Electrobun / WKWebView plus optional 3D. Total J/s is workload-dependent; apples-to-apples needs the same scenario and tools (Activity Monitor, powermetrics, Instruments). Our bar is excellent experience per watt for a local AI companion, not claiming victory in every head-to-head against a full IDE.
  • What we optimize: Wasted work—GPU and timers for hidden documents, off-screen canvases, and redundant HTTP polling. Battery-aware quality when unplugged (DPR cap, tighter Spark splats, no directional shadows, fewer background API ticks). Rich by default when the user is looking at the app and on AC.
  • Shipped levers (see changelog + desktop docs): VrmViewer visibility pause; desktop:getPowerState → VrmEngine.setLowPowerRenderMode; visibility-gated intervals (dashboard, stream, game logs, fine-tuning, cloud credits); vector 3D graph rAF pause when hidden; dev hooks opt-out so DX tooling does not accidentally burn watts (screenshot proxy, aggregated console).
  • Next UX/DX directions: User-visible Efficiency / Performance profile (single toggle), prefers-reduced-motion, optional idle frame cap for the avatar when motion fidelity matters less than battery, and clearer in-app copy when battery savings are active (so users trust the tradeoff).
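The battery-aware tradeoff in the principles above can be sketched as a pure decision function. This is a minimal illustration, not the shipped logic: `PowerState`, `RenderProfile`, `chooseRenderProfile`, and every numeric value (the DPR cap of 1, the 5 s and 30 s polling intervals) are hypothetical names and placeholders chosen for the example.

```typescript
// Hypothetical inputs: power source and document visibility.
interface PowerState {
  onBattery: boolean;
  documentVisible: boolean;
}

// Hypothetical output consumed by a renderer and by polling code.
interface RenderProfile {
  renderLoopActive: boolean;
  maxDevicePixelRatio: number;
  directionalShadows: boolean;
  pollIntervalMs: number | null; // null means polling is paused
}

// Placeholder values, assumed for illustration only.
const BATTERY_DPR_CAP = 1;
const AC_POLL_MS = 5000;
const BATTERY_POLL_MS = 30000;

function chooseRenderProfile(state: PowerState, deviceDpr: number): RenderProfile {
  if (!state.documentVisible) {
    // Hidden document: stop rendering and polling regardless of power source.
    return {
      renderLoopActive: false,
      maxDevicePixelRatio: Math.min(deviceDpr, BATTERY_DPR_CAP),
      directionalShadows: false,
      pollIntervalMs: null,
    };
  }
  if (state.onBattery) {
    // Visible but unplugged: cap DPR, drop shadows, poll less often.
    return {
      renderLoopActive: true,
      maxDevicePixelRatio: Math.min(deviceDpr, BATTERY_DPR_CAP),
      directionalShadows: false,
      pollIntervalMs: BATTERY_POLL_MS,
    };
  }
  // Visible and on AC: rich by default.
  return {
    renderLoopActive: true,
    maxDevicePixelRatio: deviceDpr,
    directionalShadows: true,
    pollIntervalMs: AC_POLL_MS,
  };
}
```

Keeping the decision in one pure function would make the tradeoff easy to unit test and easy to surface in UI copy when battery savings are active.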

Done (this cycle)

  • Dashboard SSE: action callbacks replace in place — In generateChatResponse, LLM onStreamChunk still appends token deltas; HandlerCallback text from actions uses replaceCallbackText: first callback freezes preCallbackText (streamed model output), each subsequent callback replaces only the suffix after that baseline via emitSnapshot / onSnapshot. Why: Matches Discord/Telegram progressive message UX (edit one bubble) without changing the elizaOS callback contract or adding a parallel WebSocket protocol. Docs: docs/runtime/action-callback-streaming.md, docs/changelog.mdx (2026-04-05). Code: packages/agent/src/api/chat-routes.ts.
  • Plugin load provenance + stagehand discovery: collectPluginNames() records the first reason each plugin entered the load set (plugins.allow, env auto-enable, features, etc.); resolvePlugins() includes (added by: …) when optional plugins fail to install so operators fix config/env instead of chasing phantom runtime bugs. Stagehand: findPluginBrowserStagehandDir() walks parents from the runtime file to find plugins/plugin-browser/stagehand-server. Why: fixed ../ depth failed for eliza/ submodule layouts. Docs: docs/plugin-resolution-and-node-path.md (optional plugins section), docs/guides/developer-diagnostics-and-workspace.md.
  • Life-ops PGlite migrations — Core CREATE INDEX statements run after ownership ALTER TABLE / column backfills so legacy DBs without domain / subject_* do not fail upgrades; runMigrationWithSavepoint uses explicit BEGIN/COMMIT so SAVEPOINT is valid under PGlite. Why: real databases hit migration errors during life-ops schema evolution. Tests: packages/agent/test/lifeops-pglite-schema.test.ts.
  • Workspace dependency scripts: fix-workspace-deps.mjs, replace-workspace-versions.mjs, restore-workspace-refs.mjs, workspace-prepare.mjs, and workspace-discovery.mjs reduce manual workspace surgery; root package.json exposes workspace:* / fix-deps aliases. Why: local ./eliza and plugins/* checkouts frequently cause workspace:* and semver edges to drift. Docs: docs/guides/developer-diagnostics-and-workspace.md.
  • Terminal dev banners (TTY) — Framed settings tables + optional figlet headings + ANSI when stdout is a TTY (NO_COLOR / FORCE_COLOR respected). Why: four-process desktop dev needs scannable effective env for humans/agents — not product UI. Docs: docs/apps/desktop-local-development.md, docs/guides/developer-diagnostics-and-workspace.md.
  • Gitignore: cache/audio/, scripts/bin/* — Keeps large local media caches and optional binaries (e.g. yt-dlp) out of git; scripts/bin/.gitkeep preserves directory for PATH. Why: clones should not inherit multi-hundred-MB artifacts.
  • Electrobun / Vite: single three for Spark + VRM. In apps/app/vite.config.ts, sparkPatchPlugin (resolveId + splatDefines hoist) and optimizeDeps.include for three + three/examples/jsm/* are set up so @sparkjsdev/spark and the avatar stack share one THREE.ShaderChunk. Why: nested three (e.g. under Electrobun) caused splatDefines resolution failures and “multiple Three.js instances” warnings; resolve.alias alone broke Rollup prod. Docs: docs/apps/desktop-vrm-three-and-spark.md, docs/changelog.mdx.
  • VRM resilience — Lazy default VRM / DRACO paths, eliza-1 fallback instead of missing default assets, Spark/world failures isolated so VRM still loads. Why: bundled module-init timing and optional splat backgrounds must not brick the companion avatar. Code: VrmViewer.tsx, VrmEngine.ts, state/vrm.ts.
  • Cloud login persist: cloud-routes.ts uses cloudDisconnectEpoch (increment on disconnect, snapshot before poll) instead of cloud.enabled === false to skip persist. Why: the old guard blocked first login when cloud had never been enabled. Docs: docs/apps/desktop-vrm-three-and-spark.md (API section).
  • OpenRouter plugin: pin broken npm alpha.12 — Root package.json pins @elizaos/plugin-openrouter to an exact known-good version (currently 2.0.0-alpha.13). Why: 2.0.0-alpha.12 published truncated dist/node and dist/browser ESM files: only utils/config is bundled, but exports still reference openrouterPlugin / default — Bun fails at load (symbol not declared). Why not patch dist in postinstall: the plugin implementation chunk is absent, not a one-line export typo; pinning is the correct mitigation until upstream republishes. Docs: docs/plugin-resolution-and-node-path.md (section Pinned: @elizaos/plugin-openrouter), docs/plugin-registry/llm/openrouter.md, docs/changelog.mdx, README.md. Code note: scripts/patch-deps.mjs (comment block next to other upstream workarounds).
  • Port collisions (dev + embedded desktop): dev:desktop / dev:desktop:watch pre-allocate free loopback ports for ELIZA_API_PORT and ELIZA_PORT (Vite) before spawning API, Vite, and Electrobun so env, proxy, and renderer URL stay aligned. Embedded agent: Electrobun picks the next free port from the preferred ELIZA_PORT instead of default lsof + SIGKILL; optional ELIZA_AGENT_RECLAIM_STALE_PORT=1 restores reclaim. Runtime: eliza.ts / dev-server.ts sync process.env to the API’s actual bind port where safe. UI: injectApiBase on agent status for main + all surface windows. Why: two Eliza stacks or stray processes should not require manual port hunting or killing unrelated processes; dynamic binds must propagate to renderer and dev tooling. Docs: docs/apps/desktop-local-development.md, docs/apps/desktop.md (port sections). Code: scripts/lib/allocate-loopback-port.mjs, apps/app/electrobun/src/native/loopback-port.ts, agent.ts, index.ts, surface-windows.ts, vite.config.ts, dev-server.ts, eliza.ts.
  • Desktop dev observability (IDEs / agents): GET /api/dev/stack, desktop:stack-status, default-on screenshot proxy (/api/dev/cursor-screenshot, loopback + token), default-on aggregated console (.eliza/desktop-dev-console.log + /api/dev/console-log tail with basename allow-list). Why: multi-process dev is opaque to tools that cannot see the native window; explicit HTTP + file hooks avoid guessing ports and keep loopback/tokens bounded. Opt-out env vars documented. Docs: docs/apps/desktop-local-development.md (section IDE and agent observability). Rules: .cursor/rules/eliza-desktop-dev-observability.mdc.
  • Electrobun Darwin → macOS mapping (WebGPU): getMacOSMajorVersion() uses Darwin − 9 for 20–24 (macOS 11–15) and Darwin + 1 for ≥ 25 (macOS 26+ Tahoe). Why: os.release() reports the Darwin kernel version, not the macOS version; Tahoe is macOS 26 on Darwin 25, so the old single formula reported 16 and broke WKWebView WebGPU messaging and gating. Docs: docs/apps/electrobun-darwin-macos-webgpu-version.md. Tests: webgpu-browser-support.test.ts.
  • Desktop menu reset (main process) — Confirm + API reset + restart + status poll run in Electrobun main; renderer syncs via menu-reset-eliza-applied and shared completeResetLocalStateAfterServerWipe. Why: WKWebView deferred renderer networking after native dialogs; users saw “nothing happens” after confirm. Reachable-base probe uses res.ok only. Docs: docs/apps/desktop-main-process-reset.md. Tests: menu-reset-from-main.test.ts, reset-main-process.test.ts.
  • Edge TTS disclosure — Document and surface ELIZA_DISABLE_EDGE_TTS (registry + docs/cli/environment.md + TTS doc). Why: orchestrator auto-loads Edge TTS → node-edge-tts → Microsoft; “no API key” is not “offline.”
  • Vitest app-core coverage — Root config globs packages/app-core/test/**/*.test.ts(x) and src/**/*.test.tsx; excludes app-core e2e under test/ from the default unit job. Why: new tests under test/state and test/runtime were skipped; a single hard-coded TSX path was brittle.
  • Node.js CI timeouts: Use actions/setup-node@v4 with check-latest: false everywhere; add Bun global cache and timeout-minutes to test, release, nightly, benchmark-tests, publish-npm. Why: avoids nodejs.org downloads and bounds job durations. See docs/build-and-release.md "Node.js and Bun in CI: WHYs".
  • Release workflow hardening — Strict shell (bash -euo pipefail) for fail-fast steps; retry loops for bun install with a final run so the step fails if all retries failed; crash dump uses the maintained ASAR CLI; find -print0 / while IFS= read -r -d '' for safe paths; DMG path via find+stat; node-gyp artifact removal before pack; size report includes eliza-dist; single Capacitor build step; packaged DMG E2E uses 240s CDP timeout in CI and dumps stdout/stderr on timeout. Why: Reproducible builds, clear failures, and debuggable CI; see docs/build-and-release.md "Release workflow: design and WHYs".
  • Plugin resolution (NODE_PATH) — Set NODE_PATH in three places so dynamic import("@elizaos/plugin-*") resolves from CLI (run-node.mjs child), direct eliza load (eliza.ts on load), and Electrobun (dev: walk up to find node_modules; packaged: ASAR node_modules). Why: Node does not search repo root when the entry is under dist/ or cwd is a subdir; without this, "Cannot find module" broke coding-agent and others. See docs/plugin-resolution-and-node-path.md.
  • Electrobun startup resilience — Keep API server up when runtime fails to load so the UI can show an error instead of "Failed to fetch". Why: A single missing native module (e.g. onnxruntime on Intel Mac) used to make the whole window dead with no explanation.
  • Intel Mac x64 DMG — Release workflow runs install and desktop build under arch -x86_64 for the macos-x64 artifact so native .node binaries are x64. Why: CI runs on arm64; without Rosetta we shipped arm64 binaries and Intel users got "Cannot find module .../darwin/x64/...".
  • Auto-derived plugin deps: copy-electrobun-plugins-and-deps.mjs walks each @elizaos package's package.json dependencies instead of a curated list. Why: Curated lists missed new plugin deps and caused silent failures in packaged app; auto-walk stays correct as plugins change.
  • Regression tests for startup — E2E tests assert keep-server-alive and eliza.js load-failure behavior. Why: A failing test prevents removal of the exception-handling guards better than docs alone.
  • Plugin resolution fix: NODE_PATH set to repo root node_modules in eliza.ts, run-node.mjs, and agent.ts (Electrobun dev). Why: Dynamic import("@elizaos/plugin-*") from bundled eliza.js couldn't resolve packages at root; NODE_PATH tells Node where to look. No-op in packaged app (existsSync guard). See docs/plugin-resolution-and-node-path.md.
  • Bun exports patch: Postinstall in patch-deps.mjs rewrites affected @elizaos plugins (and any similar package) so exports["."] no longer has "bun": "./src/index.ts" when that file doesn't exist. Why: The published tarball only ships dist/; Bun picks the "bun" condition first and fails. Removing the dead condition lets Bun use "import" → ./dist/index.js. See "Bun and published package exports" in docs/plugin-resolution-and-node-path.md.
  • Release size-report: SIGPIPE 141. The du | sort | head pipelines in the "Report packaged app size" step run in a subshell with || r=$? and allow exit 141; sort stderr is silenced. Why: Under -euo pipefail, 141 would exit the step before we could allow it; the subshell captures it. See docs/build-and-release.md.
  • NFA routes: optional plugin. /api/nfa/status and /api/nfa/learnings lazy-load @elizaos/plugin-bnb-identity and fall back when missing. Why: Core and tests work without the plugin; ambient type declaration keeps typecheck happy.
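The Darwin → macOS mapping from the list above can be written as a small pure function. This is a sketch under stated assumptions: `macOSMajorFromDarwinRelease` is a hypothetical helper name (the roadmap's getMacOSMajorVersion() is described as reading os.release(), which this version takes as an argument), and returning null for Darwin versions below 20 is an assumption, since the roadmap only specifies the 20–24 and ≥ 25 ranges.

```typescript
// Hypothetical helper: maps a Darwin release string (as os.release() would
// return on macOS, e.g. "24.1.0") to a macOS major version.
function macOSMajorFromDarwinRelease(release: string): number | null {
  const darwinMajor = Number.parseInt(release.split(".")[0] ?? "", 10);
  if (Number.isNaN(darwinMajor)) return null;

  // Darwin 25 and later: macOS 26+ (Tahoe onward), offset +1.
  if (darwinMajor >= 25) return darwinMajor + 1;

  // Darwin 20..24: macOS 11..15, offset -9.
  if (darwinMajor >= 20) return darwinMajor - 9;

  // Assumption: older Darwin (macOS 10.x) is out of scope for this sketch.
  return null;
}
```

With the single Darwin − 9 formula, `"25.0.0"` would map to 16; the two-range version maps it to 26.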

Short-term / follow-ups

  • Action callbacks: If a plugin truly needs multiple independent assistant segments from one action turn (not progressive replace), we could add an optional callback flag or separate API — none required today. Why defer: Default matches Discord/Telegram; YAGNI until a concrete plugin asks.
  • OpenRouter: unpin when upstream fixes — When @elizaos/plugin-openrouter publishes a release after alpha.12 with verified full dist/node/index.node.js (and browser) bundles, relax the exact pin. Currently pinned to alpha.13. Why: Staying on a hard pin forever misses real fixes; we only avoid broken tarballs until npm has a good artifact.
  • Upstream plugin hygiene — Some plugins (e.g. @elizaos/plugin-discord) list typescript in dependencies instead of devDependencies; we skip it via DEP_SKIP to avoid bundle bloat. Why: Fixing upstream would reduce our skip list and keep plugin package.json correct.
  • Optional: filter bundled deps — We intentionally copy all transitive deps (including ones tsdown may have inlined) because plugins can dynamic-require at runtime. Why: Excluding "likely bundled" deps would risk "Cannot find module" in packaged app. If we ever get static analysis of plugin dist/ to know what is never required at runtime, we could shrink the copy; not a priority.
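The dependency walk with a skip list, as discussed in the follow-ups above, can be sketched as a breadth-first traversal. This is illustrative only: `collectTransitiveDeps` and the `Manifests` type are hypothetical, and the sketch reads from an in-memory map rather than from package.json files on disk. Only the idea of a skip list (DEP_SKIP in the roadmap) comes from the source.

```typescript
// Hypothetical in-memory stand-in for each package's package.json.
type Manifests = Record<string, { dependencies?: Record<string, string> }>;

// Walks `dependencies` transitively from the root packages, never visiting
// anything in `skip`. Returns the visited package names, sorted.
function collectTransitiveDeps(
  roots: string[],
  manifests: Manifests,
  skip: ReadonlySet<string>,
): string[] {
  const seen = new Set<string>();
  const queue: string[] = roots.slice();

  while (queue.length > 0) {
    const name = queue.shift() as string;
    if (seen.has(name) || skip.has(name)) continue;
    seen.add(name);

    // Packages without a known manifest contribute no further edges.
    const deps = manifests[name]?.dependencies ?? {};
    for (const dep of Object.keys(deps)) {
      queue.push(dep);
    }
  }

  return Array.from(seen).sort();
}
```

Because everything reachable is copied unless explicitly skipped, a package that a plugin requires dynamically at runtime is still included, which is the tradeoff the roadmap accepts in exchange for a larger copy.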

Longer-term

  • Desktop: Universal/fat macOS binary (single .app with arm64+x64) is possible via lipo or desktop packaging targets but adds build time and complexity; separate DMGs are acceptable for now.
  • CI: Consider caching desktop native rebuilds per arch to speed up release matrix.