OpenClaw Pre-Release Plugin Testing

.agents/skills/openclaw-pre-release-plugin-testing/SKILL.md

2026.5.5 · 8.0 KB

Use this skill when the user asks for plugin release confidence, plugin lifecycle sweeps, package-artifact plugin proof, or "what else should we test before release?" It complements openclaw-testing; use that skill too when choosing the cheapest safe runner or debugging a failing lane.

Goal

Prove the plugin system as a product surface, not just as source tests:

  • bundled plugin lifecycle: install, inspect, enable, disable, uninstall
  • package artifact behavior from a clean HOME
  • doctor/fix/config validation and idempotence
  • config discovery and config round-trip
  • status/log visibility and diagnostics
  • gateway startup/bootstrap with plugin metadata snapshots
  • public SDK compatibility for real external plugins
  • live-ish provider/channel probes only when safe credentials exist

First Checks

From the OpenClaw repo root:

```bash
pnpm docs:list
git status --short --branch
readlink node_modules
pnpm changed:lanes --json
```

In Codex worktrees under .codex/worktrees, node_modules must be a symlink to the main OpenClaw checkout. Do not run pnpm install there. For broad or package-heavy proof, use Blacksmith Testbox or GitHub Actions.
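
The worktree rule above can be checked mechanically before any pnpm command runs. The following is a minimal sketch, not something shipped in the repo: the function name `check_worktree_node_modules` is hypothetical, and it assumes only the `.codex/worktrees` path convention described above.

```bash
# Hypothetical guard: fail fast when a Codex worktree has a real node_modules
# directory (or none at all) instead of a symlink to the main checkout.
check_worktree_node_modules() {
  local root="${1:-$PWD}"
  case "$root" in
    */.codex/worktrees/*)
      if [ -L "$root/node_modules" ]; then
        echo "ok: node_modules -> $(readlink "$root/node_modules")"
      else
        echo "error: node_modules is not a symlink; do not run pnpm install here" >&2
        return 1
      fi
      ;;
    *)
      # Outside .codex/worktrees the symlink rule does not apply.
      echo "ok: not a Codex worktree"
      ;;
  esac
}
```

Calling `check_worktree_node_modules "$PWD" || exit 1` at the top of a local script would stop before anything touches the dependency tree.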

Runner Choice

Prefer this order:

  1. GitHub Package Acceptance for installable-package product proof.
  2. ci-build-artifacts-testbox.yml Testbox when Docker/package lanes need seeded dist, dist-runtime, and package caches.
  3. ci-check-testbox.yml Testbox for source checks, targeted Vitest, package-boundary checks, or focused Docker lanes.
  4. Local targeted commands only for small format/static/unit probes.

Avoid long package Docker runs from a stale sparse worktree. If Testbox sync reports hundreds of changed files or starts deleting package inputs, stop and warm a fresh box from current main, or switch to Package Acceptance.

Existing Baseline

Run or verify these before inventing new coverage:

```bash
OPENCLAW_TESTBOX=1 pnpm check:changed
pnpm run test:extensions:package-boundary:canary
pnpm run test:extensions:package-boundary:compile
pnpm test:docker:plugins
OPENCLAW_PLUGINS_E2E_CLAWHUB=0 pnpm test:docker:plugins
pnpm test:docker:plugin-update
pnpm test:docker:bundled-channel-deps:fast
```

For full bundled install/uninstall proof, shard the packaged sweep:

```bash
OPENCLAW_BUNDLED_PLUGIN_SWEEP_TOTAL=8 \
OPENCLAW_BUNDLED_PLUGIN_SWEEP_INDEX=<0-7> \
pnpm test:docker:bundled-plugin-install-uninstall
```

Expected current packaged scope: 116 public bundled plugins over shards 0-7. Private QA plugins are source-mode only unless a package explicitly includes them.
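
When the shards run sequentially on one box, a small loop keeps track of which indexes failed. This is a sketch under that assumption: `run_bundled_sweep` and the `DRY_RUN` switch are hypothetical names, while the environment variables and the pnpm script are the ones shown above.

```bash
# Hypothetical wrapper: run every shard of the bundled install/uninstall sweep
# and report which shard indexes failed. DRY_RUN=1 prints the commands only.
run_bundled_sweep() {
  local total="${1:-8}" index failed=""
  for index in $(seq 0 $((total - 1))); do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "OPENCLAW_BUNDLED_PLUGIN_SWEEP_TOTAL=$total OPENCLAW_BUNDLED_PLUGIN_SWEEP_INDEX=$index pnpm test:docker:bundled-plugin-install-uninstall"
      continue
    fi
    # Keep going after a failure so one bad shard does not hide the others.
    OPENCLAW_BUNDLED_PLUGIN_SWEEP_TOTAL="$total" \
    OPENCLAW_BUNDLED_PLUGIN_SWEEP_INDEX="$index" \
      pnpm test:docker:bundled-plugin-install-uninstall || failed="$failed $index"
  done
  if [ -n "$failed" ]; then
    echo "failed shards:$failed" >&2
    return 1
  fi
}
```

`DRY_RUN=1 run_bundled_sweep 8` lists the eight invocations without starting Docker, which is useful when splitting shards across several Testbox sessions instead.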

Confidence Matrix

Use this matrix for pre-release signoff. Record pass/fail, run URL/Testbox ID, package SHA/version, and skipped-live reason.

| Surface | Proof | Preferred runner |
| --- | --- | --- |
| Package artifact | Package Acceptance suite_profile=package or custom lanes | GitHub Actions |
| Bundled lifecycle | 8-shard test:docker:bundled-plugin-install-uninstall | Testbox or release Docker |
| External plugins | test:docker:plugins and plugins-offline | Testbox/package acceptance |
| Update no-op | test:docker:plugin-update | Testbox/package acceptance |
| Channel runtime deps | test:docker:bundled-channel-deps:fast plus key channels | Testbox/package acceptance |
| Doctor/fix | seeded bad configs + doctor --fix --non-interactive | new Docker/Testbox harness |
| Config round-trip | config set/get, inspect, doctor, reload, diff hash | new Docker/Testbox harness |
| Gateway bootstrap | clean HOME, plugin groups enabled/disabled, status JSON | new Docker/Testbox harness |
| SDK compatibility | directory, tgz, and file: external plugins using SDK subpaths | test:docker:plugins plus new smoke |
| Live-ish | redacted provider/channel probes only for present env | Testbox live lanes |

Package Acceptance Plan

Use this when validating a release branch, beta, or candidate package:

```bash
gh workflow run package-acceptance.yml \
  --repo openclaw/openclaw \
  --ref main \
  -f workflow_ref=main \
  -f source=ref \
  -f package_ref=<branch-or-sha> \
  -f suite_profile=custom \
  -f docker_lanes='plugins-offline plugin-update bundled-channel-deps-compat doctor-switch update-channel-switch config-reload mcp-channels npm-onboard-channel-agent' \
  -f telegram_mode=mock-openai
```

For published beta proof, pass -f source=npm -f package_spec=openclaw@beta instead of source=ref. Keep workflow_ref pointed at trusted, current harness code unless the release process says otherwise.

New Testbox Harness Plan

If more certainty is needed, add or run a plugin-lifecycle-matrix Docker lane that uses one package tarball and sharded plugin lists. Per plugin:

  1. Start with a clean HOME.
  2. Capture plugins list --json.
  3. plugins install <id>.
  4. plugins inspect <id> --json.
  5. plugins disable <id>, then assert disabled visibility.
  6. plugins enable <id>, except config-required plugins without config.
  7. plugins registry --refresh.
  8. doctor --non-interactive.
  9. plugins uninstall <id> --force.
  10. Assert no config entry, allow/deny residue, install record, managed dir, or bundled dist/extensions/... load path remains.
  11. Assert diagnostics contain no level: "error" and output redacts secret-looking values.
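
Steps 1 through 9 and part of step 11 can be sketched as one shell function per plugin. This is an illustration, not the harness itself: `plugin_lifecycle`, `OPENCLAW_BIN`, the scratch-HOME layout, and the idea that diagnostics show up in the inspect JSON with a `"level"` field are all assumptions. Only the subcommands come from the list above. The residue checks in step 10 depend on the config layout and are left to the real harness.

```bash
# Hypothetical per-plugin lifecycle driver. Pass 1 as the second argument to
# skip the enable step for config-required plugins that have no config yet.
plugin_lifecycle() {
  local id="$1" skip_enable="${2:-0}" bin="${OPENCLAW_BIN:-openclaw}" home rc
  home="$(mktemp -d)" || return 1
  (
    # Steps are chained with && so the run stops at the first failing command.
    export HOME="$home" &&                                          # 1. clean HOME
    "$bin" plugins list --json > "$home/before.json" &&             # 2
    "$bin" plugins install "$id" &&                                 # 3
    "$bin" plugins inspect "$id" --json > "$home/inspect.json" &&   # 4
    "$bin" plugins disable "$id" &&                                 # 5
    { [ "$skip_enable" = "1" ] || "$bin" plugins enable "$id"; } && # 6
    "$bin" plugins registry --refresh &&                            # 7
    "$bin" doctor --non-interactive &&                              # 8
    "$bin" plugins uninstall "$id" --force &&                       # 9
    "$bin" plugins list --json > "$home/after.json" &&
    # 11 (partial, assumed JSON shape): no error-level diagnostics captured.
    ! grep -Eq '"level"[[:space:]]*:[[:space:]]*"error"' "$home/inspect.json"
  )
  rc=$?
  echo "$id: exit=$rc home=$home"
  return "$rc"
}
```

The scratch HOME is kept after the run so a failing plugin's state can be inspected; a sharded lane would call `plugin_lifecycle <id>` once per plugin in its list.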

Treat memory-lancedb as a special case because it is config-required: first assert that install does not enable it without embedding config, then run a second case with that config supplied.

Doctor/Fix Matrix

Seed bad states and require doctor --fix --non-interactive to repair them, then run doctor again and require idempotence:

  • stale plugins.allow
  • stale plugins.entries
  • stale channel config for missing channel plugin
  • invalid plugins.entries.<id>.config
  • packaged bundled path in plugins.load.paths
  • legacy plugins.installs
  • disabled channel/plugin config that must not stage runtime deps
  • root-owned global package tree that must remain unmodified
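
The repair-then-idempotence requirement can be expressed as a two-pass check. The sketch below assumes the harness knows which config file it seeded and passes that path in. `doctor_fix_idempotent` and `OPENCLAW_BIN` are hypothetical names, and `cksum` is used only as a portable change detector.

```bash
# Hypothetical check: the first `doctor --fix` pass may rewrite the seeded
# config, but a second pass must leave it byte-identical.
doctor_fix_idempotent() {
  local config="$1" bin="${OPENCLAW_BIN:-openclaw}" first second
  "$bin" doctor --fix --non-interactive || return 1
  first="$(cksum < "$config")"
  "$bin" doctor --fix --non-interactive || return 1
  second="$(cksum < "$config")"
  if [ "$first" != "$second" ]; then
    echo "not idempotent: config changed on second doctor --fix" >&2
    return 1
  fi
  echo "idempotent"
}
```

Each seeded state in the list above would get its own clean HOME and one call to this function, with the specific repair (for example, the stale entry being gone) asserted separately.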

Gateway Bootstrap Matrix

Start packaged OpenClaw in Docker with clean state:

  • provider plugins enabled, no credentials: ready with warnings, no crash
  • channel plugins configured disabled: no runtime deps staged
  • startup-activation plugins enabled: ready and reflected in status
  • invalid single plugin config: bad plugin skipped/quarantined, others remain

Assert:

  • gateway reaches ready
  • openclaw status --json includes plugin diagnostics
  • openclaw plugins inspect --all --json is parseable
  • package tree is not mutated
  • logs contain no raw tokens
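
For the last assertion, a coarse pattern scan over the captured logs is a reasonable first filter. The patterns below are illustrative assumptions, not OpenClaw's actual redaction rules, and `scan_logs_for_tokens` is a hypothetical name. The function reports file names only so the scan itself never prints a secret.

```bash
# Hypothetical log scan: return non-zero if any given file contains a string
# shaped like a raw credential. Prints matching file names, never the match.
scan_logs_for_tokens() {
  local pattern hits
  pattern='(sk-[A-Za-z0-9_-]{20,}|xox[abp]-[A-Za-z0-9-]{10,}|Bearer[[:space:]]+[A-Za-z0-9._-]{20,})'
  # grep -l lists only the names of files with at least one match.
  hits="$(grep -El "$pattern" "$@" | tr '\n' ' ')"
  if [ -n "$hits" ]; then
    echo "possible raw token in: $hits" >&2
    return 1
  fi
  echo "no token-like strings found"
}
```

A clean result from a scan like this is weak evidence on its own; it is best paired with seeding a known fake token into config and confirming it never appears in output.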

Config Round-Trip Representatives

Use representative plugin families instead of every plugin for deep config round-trip:

  • providers: openai, anthropic, mistral, openrouter
  • channels: telegram, discord, slack, whatsapp
  • memory: memory-lancedb
  • feature/runtime: browser, acpx, tokenjuice

For each representative:

  1. Write config through CLI when possible.
  2. Read it back through config get or JSON.
  3. Run plugins inspect.
  4. Run doctor --non-interactive.
  5. Trigger gateway config reload if applicable.
  6. Compare config hash before/after no-op commands.
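
Step 6 can be wrapped in a generic helper that runs any supposedly read-only command and fails if the config file changes. This is a sketch: `assert_config_unchanged` is a hypothetical name, and the config path is whatever the harness wrote in step 1.

```bash
# Hypothetical helper: run a command that should be a no-op and fail if the
# config file's checksum differs afterwards.
assert_config_unchanged() {
  local config="$1" before after
  shift
  before="$(cksum < "$config")"
  "$@" >/dev/null || { echo "command failed: $*" >&2; return 1; }
  after="$(cksum < "$config")"
  [ "$before" = "$after" ] || { echo "config mutated by: $*" >&2; return 1; }
}
```

Typical calls would wrap steps 3 and 4, for example `assert_config_unchanged "$config" openclaw doctor --non-interactive`, where `$config` is the harness's own variable.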

External SDK Smoke

In a package Docker lane, create tiny external plugins and install them from:

  • local directory
  • .tgz
  • file: npm spec

Cover CJS and ESM shapes, plus at least one plugin importing focused openclaw/plugin-sdk/* subpaths. Assert plugins inspect sees its tool, gateway method, CLI command, or service.

Live-Ish Probe Rules

Before live-ish work, source allowed env in Testbox and generate a redacted availability matrix: present/missing only, never values.

Only run probes for credentials that exist. Prefer auth/catalog/status probes over sending user-visible messages. If a probe might contact an external user, channel, or workspace, stop and ask the user.
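
A present/missing matrix of this kind needs only the variable names. The sketch below uses a hypothetical function name, `credential_matrix`; the names a caller passes in are examples and not a list of what OpenClaw reads.

```bash
# Hypothetical availability matrix: prints "present" or "missing" per exported
# variable name and never echoes a value.
credential_matrix() {
  local name
  for name in "$@"; do
    # printenv sees exported variables only; empty values count as missing.
    if [ -n "$(printenv "$name")" ]; then
      echo "$name: present"
    else
      echo "$name: missing"
    fi
  done
}
```

For example, `credential_matrix OPENAI_API_KEY TELEGRAM_BOT_TOKEN` (illustrative names) yields one line per variable that can be pasted into the report's skips section.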

Reporting

Report in this shape:

```text
package/ref:
tbx ids / run urls:
matrix:
  bundled lifecycle:
  package acceptance:
  doctor/fix:
  gateway bootstrap:
  config round-trip:
  sdk external:
  live-ish:
failures:
skips:
next highest-value gap:
```

Say clearly when a failure is Testbox sync/env damage rather than product behavior, and prove that with a clean rerun or current-main comparison.