CodeceptJS MCP Server


Model Context Protocol (MCP) server for CodeceptJS. Lets AI agents drive a CodeceptJS browser session — list tests, run arbitrary I.* code, pause-and-poke through a scenario, capture artifacts, and read aiTrace markdown — all in-process, sharing one browser and one container.

Overview

The MCP server exposes the following tools:

  • list_tests / list_actions — enumerate tests and I.* methods
  • start_browser / stop_browser — open / close the session (only place plugin overrides go)
  • run_code — run arbitrary JS with I and the full CodeceptJS scope; captures steps, console, return value, and a settled-state snapshot
  • snapshot — capture URL/HTML/ARIA/screenshot/console/storage at any moment
  • run_test — run a specific scenario; supports pauseAt for programmatic breakpoints
  • run_step_by_step — pause after every step
  • continue — release a paused test (runs to the next step pause, the next pause() call, or completion)
  • cancel — abort the in-progress / paused run without closing the browser

Invocation

Two ways to launch the server:

  • npx codeceptjs-mcp — the published bin
  • node node_modules/codeceptjs/bin/mcp-server.js — direct path, useful for editor / agent configs

⚠️ Run from the project's local codeceptjs, never a global install. The MCP server resolves helpers, plugins, page objects, and custom support from the project's node_modules. A globally installed codeceptjs won't see project-local helpers (@codeceptjs/helper, @codeceptjs/configure, custom plugins) or your include: support objects, and per-project versions can drift from the global one. Always invoke via npx codeceptjs-mcp from inside the project directory, or point your MCP client config at <project>/node_modules/codeceptjs/bin/mcp-server.js directly.
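
As a rough illustration of the "direct path" option, the sketch below builds a client entry that launches the project-local server script. The helper name `localServerEntry` and the example project path are hypothetical; the script path and the `CODECEPTJS_PROJECT_DIR` variable are the ones named in this document.

```javascript
const path = require('node:path');

// Hypothetical helper (not part of CodeceptJS): builds an MCP client entry
// that runs the server from the project's own node_modules.
function localServerEntry(projectDir) {
  return {
    command: 'node',
    args: [path.join(projectDir, 'node_modules', 'codeceptjs', 'bin', 'mcp-server.js')],
    // Passing the project root explicitly, since the client may start the
    // process from a different working directory.
    env: { CODECEPTJS_PROJECT_DIR: projectDir },
  };
}

// Example project path is a placeholder.
const config = { mcpServers: { codeceptjs: localServerEntry('/absolute/path/to/project') } };
console.log(JSON.stringify(config, null, 2));
```

The printed object can be pasted into a client config file in place of the npx variant.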

Configuration

Set up the MCP server in your client (Claude Desktop, Cursor, Continue, etc.):

Basic

json
{
  "mcpServers": {
    "codeceptjs": {
      "command": "npx",
      "args": ["codeceptjs-mcp"]
    }
  }
}

The server looks for codecept.conf.js, then codecept.conf.cjs, in the current working directory.

With env vars

json
{
  "mcpServers": {
    "codeceptjs": {
      "command": "npx",
      "args": ["codeceptjs-mcp"],
      "env": {
        "CODECEPTJS_CONFIG": "/absolute/path/to/codecept.conf.js",
        "CODECEPTJS_PROJECT_DIR": "/absolute/path/to/project"
      }
    }
  }
}

| Variable | Description |
| --- | --- |
| CODECEPTJS_CONFIG | Absolute path to codecept.conf.js. Overrides cwd lookup. |
| CODECEPTJS_PROJECT_DIR | Absolute path to the project root. Used as the resolution base for the config file. |
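
The lookup order described here can be sketched as follows. This is an illustration of the documented behaviour, not the server's actual code: the function name `resolveConfigPath` and its injectable `exists` parameter are assumptions made so the logic can be exercised without touching the file system.

```javascript
const path = require('node:path');
const fs = require('node:fs');

// Sketch of the documented lookup order (assumed, not copied from the server):
// 1. CODECEPTJS_CONFIG, resolved against the project dir (absolute paths win as-is)
// 2. codecept.conf.js, then codecept.conf.cjs, in CODECEPTJS_PROJECT_DIR or cwd
function resolveConfigPath(env = process.env, cwd = process.cwd(), exists = fs.existsSync) {
  const base = env.CODECEPTJS_PROJECT_DIR || cwd;
  if (env.CODECEPTJS_CONFIG) return path.resolve(base, env.CODECEPTJS_CONFIG);
  for (const name of ['codecept.conf.js', 'codecept.conf.cjs']) {
    const candidate = path.join(base, name);
    if (exists(candidate)) return candidate;
  }
  return null; // nothing found: the caller should report "config not found"
}
```

Calling `resolveConfigPath()` with no arguments uses the real environment and file system.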

Session Defaults

When the session starts, the MCP server applies two plugin defaults so the agent gets useful telemetry out of the box:

  • aiTrace: { enabled: true, on: 'step' } — every step persists DOM/ARIA/screenshot/console artifacts to output/trace_<TestName>_<hash>/. Each scenario's traceFile is returned in run results so the agent can Read the markdown directly.
  • browser: { enabled: true, show: false } — headless. Switch to headed via start_browser plugins arg.

Both can be overridden (or disabled) via start_browser's plugins argument. Plugin settings in codecept.conf.js still merge in for any keys explicitly set there.
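
One way to picture the merge is the sketch below. The precedence it uses (MCP defaults, then codecept.conf.js values, then start_browser overrides) is an assumption, as is the function name `mergePlugins`; the default values themselves are the two listed above.

```javascript
// The two session defaults listed above.
const MCP_DEFAULTS = {
  aiTrace: { enabled: true, on: 'step' },
  browser: { enabled: true, show: false },
};

// Hypothetical merge: later sources win key by key. The ordering is assumed.
function mergePlugins(confPlugins = {}, overrides = {}) {
  const names = new Set([
    ...Object.keys(MCP_DEFAULTS),
    ...Object.keys(confPlugins),
    ...Object.keys(overrides),
  ]);
  const merged = {};
  for (const name of names) {
    merged[name] = {
      ...MCP_DEFAULTS[name],
      ...confPlugins[name],
      // Plugins passed to start_browser get enabled: true unless they say otherwise.
      ...(name in overrides ? { enabled: true } : {}),
      ...overrides[name],
    };
  }
  return merged;
}

console.log(mergePlugins({}, { browser: { show: true } }));
```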

Available Tools

start_browser

Initializes the session — loads config, builds the container, opens the browser, kicks off the synthetic test scope so run_code and snapshot work. This is the only tool that customizes initialization; every other tool either uses the active session or auto-inits with project defaults.

Parameters:

  • config (string, optional) — absolute path to codecept.conf.js. Defaults to $CODECEPTJS_CONFIG, then ./codecept.conf.js in $CODECEPTJS_PROJECT_DIR or cwd.
  • plugins (object, optional) — plugin configs keyed by name. Same shape as plugins in codecept.conf.js; enabled: true is added automatically. Most useful entries:
    • { browser: { show: true } } — visible browser
    • { browser: { browser: "firefox", windowSize: "1280x720" } } — switch browser + viewport
    • { aiTrace: { enabled: false } } — disable per-step trace overhead on a re-run
    • { pause: { on: "fail" } } / { screenshot: { on: "step" } } — any other plugin works the same way

Returns:

json
{
  "status": "Session started — run_code and snapshot are now available",
  "plugins": { "browser": { "show": false } }
}

stop_browser

Closes the browser handles and drops the synthetic test scope, but keeps the container, codecept, and Mocha alive. A subsequent start_browser reopens the browser without rebuilding everything. This matters because ESM-loaded test files don't re-execute their top-level Scenario(...) on reload, so a fresh Mocha would have no suites.

Parameters: none

Returns:

json
{ "status": "Browser stopped — Mocha and config preserved; call start_browser to reopen" }

cancel

Aborts the currently paused or in-progress test run without closing the browser. Use when you want to bail out of a paused test and start something else. Mocha + container stay alive; the next run_test / run_step_by_step works immediately.

Parameters: none

Returns:

json
{ "status": "Run cancelled — browser kept open" }

list_tests

Lists all tests resolved from the project's tests: glob.

Parameters: none

Returns:

json
{
  "count": 5,
  "tests": [
    { "file": "/abs/path/to/work_orders_test.js", "relativePath": "work_orders_test.js" }
  ]
}

list_actions

Lists every I.* method from enabled helpers and support objects.

Parameters: none

Returns:

json
{
  "count": 120,
  "actions": [
    { "helper": "Playwright", "action": "amOnPage", "signature": "I.amOnPage(url)" },
    { "helper": "SupportObject", "action": "loginAsAdmin", "signature": "I.loginAsAdmin()" }
  ]
}

run_code

Run arbitrary JavaScript inside the live test scope. Captures steps, console output, return value, and a final-state snapshot.

Parameters:

  • code (string, required) — JS source. Use await on I.* calls.
  • timeout (number, optional) — ms (default 60000).
  • saveArtifacts (boolean, optional) — capture final-state artifacts (default true).
  • settleMs (number, optional) — wait this many ms after the code finishes before capturing artifacts (default 300). Bump to 1000+ for slow re-renders, 0 to skip.

Scope (everything reachable as a bare identifier in code):

| Symbol | Source |
| --- | --- |
| I | The actor (with all helper methods) |
| Custom support objects | include: in codecept.conf.js (e.g. page objects, login from auth plugin) |
| locate, within, session, secret, inject, pause, share | from codeceptjs |
| tryTo, retryTo, hopeThat | from codeceptjs/effects |
| step | from codeceptjs/steps |
| element, eachElement, expectElement, expectAnyElement, expectAllElements | from codeceptjs/els |
| container | the DI container |
| helpers | live helpers map (e.g. helpers.Playwright.page for raw Playwright access) |

The full live list is returned in every response under availableObjects.

Return-value handling:

  • An explicit return X is JSON-stringified (with circular-ref handling). Capped at 20 KB.
  • If you forget return, the last grabbed step value is returned automatically (await I.grabTitle() on the last line works).
  • A returned WebElement (or array of them, from I.grabWebElement(s)) is auto-described to a plain object: { text, html, visible, enabled, attrs }.
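
The first rule (stringify with circular-reference handling, then cap the size) might look roughly like this. It is a sketch under assumptions: the name `safeStringify`, the `[Circular]` placeholder, and measuring the 20 KB cap in characters rather than bytes are all illustrative choices, not the server's confirmed behaviour.

```javascript
const MAX_CHARS = 20 * 1024; // assumption: the 20 KB cap approximated in characters

// Hypothetical helper: JSON-stringify a value without throwing on cycles.
function safeStringify(value, maxChars = MAX_CHARS) {
  const seen = new WeakSet();
  const json = JSON.stringify(
    value,
    (key, val) => {
      if (typeof val === 'object' && val !== null) {
        // Any object visited before is replaced. This covers true cycles and
        // also repeated (non-circular) references to the same object.
        if (seen.has(val)) return '[Circular]';
        seen.add(val);
      }
      return val;
    },
    2,
  );
  if (json === undefined) return 'undefined'; // e.g. a bare function or undefined
  return json.length > maxChars ? json.slice(0, maxChars) : json;
}

const node = { name: 'root' };
node.self = node; // circular reference
console.log(safeStringify(node));
```

A truncated result is no longer guaranteed to be valid JSON, so consumers should be ready to fall back to the raw string.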

Returns:

json
{
  "status": "success",
  "output": "Code executed successfully",
  "error": null,
  "commands": ["I am on page \"/\"", "I grab text from \"h1\""],
  "logs": [{ "level": "log", "message": "headline Welcome", "t": 47 }],
  "returnValue": "{\n  \"url\": \"http://localhost:8000/\",\n  \"text\": \"Welcome\"\n}",
  "availableObjects": ["I", "container", "eachElement", "element", "expectAllElements", "expectAnyElement", "expectElement", "helpers", "hopeThat", "inject", "locate", "login", "pause", "retryTo", "secret", "session", "share", "step", "tryTo", "within"],
  "artifacts": {
    "url": "http://localhost:8000/",
    "html": "file:///output/trace_run_code_.../mcp_page.html",
    "aria": "file:///output/trace_run_code_.../mcp_aria.txt",
    "screenshot": "file:///output/trace_run_code_.../mcp_screenshot.png",
    "console": "file:///output/trace_run_code_.../mcp_console.json",
    "storage": "file:///output/trace_run_code_.../mcp_storage.json",
    "cookieCount": 3,
    "localStorageCount": 5
  },
  "ariaDiff": "...",
  "dir": "/output/trace_run_code_...",
  "traceFile": "file:///output/trace_run_code_.../trace.md"
}

  • traceFile — markdown summary of this call. Read it for full context.
  • ariaDiff — present when the call mutated the page; diff between the previous aiTrace ARIA snapshot and the new one.
  • aiTraceHint — appears when aiTrace is disabled, suggesting how to re-enable it.
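
On the consuming side, an agent or client script could unpack a run_code response along these lines. The function name `readRunCodeResult` is hypothetical, and treating any status other than "success" as a failure is an assumption; the field names come from the response shown above.

```javascript
// Hypothetical client-side helper for a run_code response object.
function readRunCodeResult(response) {
  if (response.status !== 'success') {
    throw new Error(response.error || `run_code finished with status "${response.status}"`);
  }
  let value = response.returnValue;
  try {
    // returnValue is a JSON-stringified string for explicit returns.
    value = JSON.parse(response.returnValue);
  } catch {
    // Not parseable (bare string, missing, or cut off by the size cap): keep it raw.
  }
  return { value, steps: response.commands || [], traceFile: response.traceFile };
}

const sample = {
  status: 'success',
  error: null,
  commands: ['I am on page "/"', 'I grab text from "h1"'],
  returnValue: '{\n  "url": "http://localhost:8000/",\n  "text": "Welcome"\n}',
};
console.log(readRunCodeResult(sample).value.text);
```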

Example:

json
{
  "name": "run_code",
  "arguments": {
    "code": "await I.amOnPage('/'); const t = await I.grabTextFrom('h1'); return { url: await I.grabCurrentUrl(), text: t };"
  }
}

snapshot

Capture the current browser state without performing any action.

Parameters:

  • fullPage (boolean, optional) — full-page screenshot (default false).
  • settleMs (number, optional) — wait before capture (default 300).

Returns:

json
{
  "status": "success",
  "dir": "/output/snapshot_1700000000000_abcd1234",
  "traceFile": "file:///output/snapshot_.../trace.md",
  "artifacts": {
    "url": "http://localhost:8000/dashboard",
    "html": "file:///output/snapshot_.../snapshot_page.html",
    "aria": "file:///output/snapshot_.../snapshot_aria.txt",
    "screenshot": "file:///output/snapshot_.../snapshot_screenshot.png",
    "console": "file:///output/snapshot_.../snapshot_console.json",
    "storage": "file:///output/snapshot_.../snapshot_storage.json",
    "cookieCount": 3,
    "localStorageCount": 5
  }
}

run_test

Run a specific scenario. Returns reporter JSON with one entry per scenario; each entry has a traceFile (file:// URL) pointing to the per-scenario aiTrace markdown — Read it on failures to see the failing step's DOM/ARIA/screenshot.

If the test calls pause() — or if pauseAt matches a step — returns early with status: "paused" so the agent can inspect via run_code and release with continue (or abort with cancel).

Parameters:

  • test (string, required) — file path or partial test name; resolved to a single test file.
  • timeout (number, optional) — overall ms (default 60000).
  • grep (string, optional) — filter scenarios by title; passed to mocha.grep. Mirrors --grep on the CLI.
  • pauseAt (number | string, optional) — programmatic breakpoint. One of:
    • number — 1-based step index (test pauses after the Nth step completes)
    • string — case-insensitive substring match against step name
    • "/regex/i" — regex literal (the /.../i form is honored verbatim)
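
The three pauseAt forms can be read as the matching rules sketched below. This is an interpretation of the description above, not the server's code; `matchesPauseAt` and its arguments are hypothetical names.

```javascript
// Hypothetical matcher: decides whether a completed step triggers the pause.
// stepIndex is 1-based, stepName is the step's printed name.
function matchesPauseAt(pauseAt, stepIndex, stepName) {
  if (typeof pauseAt === 'number') return stepIndex === pauseAt;

  // "/pattern/flags" is treated as a regex literal and used as written.
  const literal = /^\/(.+)\/([a-z]*)$/.exec(pauseAt);
  if (literal) return new RegExp(literal[1], literal[2]).test(stepName);

  // Any other string: case-insensitive substring match.
  return stepName.toLowerCase().includes(pauseAt.toLowerCase());
}

console.log(matchesPauseAt('fill field', 2, 'I fill field "#email", "a@b.c"'));
```

Note that in this sketch an invalid pattern or flag makes `new RegExp` throw, so a caller may want to validate the string first.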

Returns (completed normally):

json
{
  "status": "completed",
  "file": "/path/to/test.js",
  "reporterJson": {
    "stats": { "tests": 1, "passes": 1, "failures": 0 },
    "tests": [
      {
        "title": "lists materials",
        "file": "/path/to/materials_test.js",
        "status": "passed",
        "duration": 4123,
        "traceFile": "file:///output/trace_materials__lists_materials_xxxx/trace.md"
      }
    ]
  },
  "error": null
}

Returns (paused):

json
{
  "status": "paused",
  "file": "/path/to/test.js",
  "pausedAfter": { "index": 7, "name": "I select option {\"css\":\"main select\"}, \"Flux\"", "status": "success" },
  "page": { "url": "https://app.example.com/materials", "title": "Materials", "contentSize": 18432 },
  "suggestions": [
    "Call snapshot to capture URL/HTML/ARIA/screenshot/console/storage at this point",
    "Call run_code to inspect or manipulate state (e.g. return await I.grabText(\"h1\"))",
    "Call continue to release the pause and let the test run the next step (or finish)"
  ]
}

Examples:

json
{ "name": "run_test", "arguments": { "test": "checkout_test", "pauseAt": 5 } }
{ "name": "run_test", "arguments": { "test": "checkout_test", "pauseAt": "fill field" } }
{ "name": "run_test", "arguments": { "test": "checkout_test", "pauseAt": "/grab.*url/i" } }

run_step_by_step

Run a test interactively, pausing after every step. The agent advances with continue or inspects with run_code / snapshot.

Parameters:

  • test (string, required)
  • timeout (number, optional)
  • grep (string, optional)
  • plugins (object, optional) — same as start_browser. Most useful is { browser: { show: true } } so you can watch the run between pauses.

Returns (after each step):

json
{
  "status": "paused",
  "file": "/path/to/test.js",
  "pausedAfter": { "index": 1, "name": "I am on page \"/\"", "status": "success" },
  "page": { "url": "http://localhost:8000/", "title": "Test App", "contentSize": 1832 },
  "suggestions": [...]
}

Returns (after the last step): same shape as run_test's completed response — every scenario carries its traceFile.

continue

Release a paused test. The test runs until the next pause (run_step_by_step), the next pause() call, or completion.

Parameters:

  • timeout (number, optional) — ms to wait for the next pause / completion (default 60000).

Returns (re-paused): same shape as run_test's paused response, with the new pausedAfter index.

Returns (completed): same shape as run_test's completed response.

Pause-and-poke flow

json
{ "name": "run_step_by_step", "arguments": { "test": "checkout_test" } }
// → { "status": "paused", "pausedAfter": { "index": 1, ... } }

{ "name": "snapshot", "arguments": {} }
// → full artifact bundle for step 1

{ "name": "run_code", "arguments": { "code": "return await I.grabCurrentUrl()" } }
// → { "status": "success", "returnValue": "http://...", "artifacts": { ... } }

{ "name": "run_code", "arguments": { "code": "await I.click('Save')" } }
// → { "status": "success", ... } — actually mutates the live page

{ "name": "continue", "arguments": {} }
// → { "status": "paused", "pausedAfter": { "index": 2, ... } }

// ... or bail out:
{ "name": "cancel", "arguments": {} }
// → { "status": "Run cancelled — browser kept open" }
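
The same flow can be expressed as a small driver loop. The sketch assumes a `callTool(name, args)` function supplied by whatever MCP client is in use; that function, `stepThrough`, and the `onPause` callback are all hypothetical names.

```javascript
// Hypothetical driver: steps through a test until it stops pausing.
// callTool(name, args) must return the parsed tool response.
async function stepThrough(callTool, test, onPause = async () => {}) {
  const visited = [];
  let result = await callTool('run_step_by_step', { test });
  while (result.status === 'paused') {
    visited.push(result.pausedAfter.index);
    await onPause(result); // e.g. call snapshot or run_code here
    result = await callTool('continue', {});
  }
  return { visited, result }; // result is the completed (or failed) response
}
```

A real agent would usually add a step limit, and call cancel instead of continue when it has seen enough.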

Notes:

  • Pause runs in-process: run_code and the test share the same I / browser. There's no subprocess, no IPC.
  • run_test / run_step_by_step / continue silence stdout/stderr while running so step output doesn't interleave with the MCP JSON-RPC stream.
  • TTY behaviour is unchanged — npx codeceptjs run --debug at a terminal still opens the readline REPL when process.stdin.isTTY is true. The MCP server only intercepts pause when its handler is registered.

Trace files (aiTrace)

When aiTrace is on (the default for MCP sessions), every step in a scenario produces:

output/
└── trace_Materials__lists_materials_<hash>/
    ├── 0001_<step>_screenshot.png
    ├── 0001_<step>_page.html       # minified → trash classes/scripts/styles stripped → beautified
    ├── 0001_<step>_aria.txt        # Playwright only
    ├── 0001_<step>_console.json
    ├── 0002_...
    └── trace.md                    # AI-friendly markdown index

run_test / run_step_by_step results expose the trace.md URL per scenario (reporterJson.tests[].traceFile) — Read it on failure to see exactly what the failing step saw.
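
For example, a client could pull the trace files of non-passing scenarios out of a run_test result like this. The helper name `failedTraceFiles` is hypothetical, and treating every status other than "passed" as a failure is an assumption.

```javascript
// Hypothetical helper: lists trace.md URLs for scenarios that did not pass.
function failedTraceFiles(result) {
  const tests = (result.reporterJson && result.reporterJson.tests) || [];
  return tests
    .filter((t) => t.status !== 'passed' && t.traceFile)
    .map((t) => ({ title: t.title, traceFile: t.traceFile }));
}

const runResult = {
  status: 'completed',
  reporterJson: {
    tests: [
      { title: 'lists materials', status: 'passed', traceFile: 'file:///output/trace_a/trace.md' },
      { title: 'creates material', status: 'failed', traceFile: 'file:///output/trace_b/trace.md' },
    ],
  },
};
console.log(failedTraceFiles(runResult));
```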

For ad-hoc run_code / snapshot runs, only a single set of artifacts is produced (mcp_* / snapshot_* prefix), packaged with their own trace.md.

trace.md shape

markdown
# Test: Login functionality

**Status**: failed
**File**: tests/login_test.js

## Steps

1. **I.amOnPage("/login")** — passed (150ms)
2. **I.fillField("#username", "user")** — passed (80ms)
3. **I.click("#login")** — passed (100ms)
4. **I.see("Welcome")** — failed (50ms)

## Error

Element "Welcome" not found

## Artifacts

- Screenshot: 0004_screenshot.png
- HTML: 0004_page.html
- ARIA: 0004_aria.txt

HTML formatting

Every HTML snapshot saved by the MCP server (and the aiTrace / pageInfo plugins, since they all funnel through captureSnapshot in lib/utils/trace.js) goes through:

  1. Minify (html-minifier-terser) — strip comments, collapse whitespace, drop redundant attributes.
  2. Clean — drop <style>, <noscript>, and inline <script> (no src); keep <script src="...">; strip trash class names (Tailwind utilities, framework hashes, xl:hidden-style scoped classes); drop style="..." attributes. Semantic attributes (id, aria-*, data-*, role, href, src, alt, title, name) are preserved.
  3. Beautify (js-beautify) — re-indent at 2 spaces; keep inline elements with their text.

Result: a multi-line, low-noise HTML doc that's far cheaper for an LLM to reason about than raw page source.
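
To make the Clean step concrete, here is a deliberately simplified, regex-based sketch of part of it (removing style, noscript, and inline script elements plus style attributes). It is not the code in lib/utils/trace.js, it skips class-name filtering entirely, and regexes are only adequate for well-formed markup; `cleanHtmlSketch` is a hypothetical name.

```javascript
// Simplified illustration of the "Clean" step. Assumes well-formed HTML.
function cleanHtmlSketch(html) {
  return html
    // Drop <style> and <noscript> blocks with their contents.
    .replace(/<style\b[^>]*>[\s\S]*?<\/style>/gi, '')
    .replace(/<noscript\b[^>]*>[\s\S]*?<\/noscript>/gi, '')
    // Drop inline scripts only: the lookahead skips tags that carry a src attribute.
    .replace(/<script\b(?![^>]*\bsrc\s*=)[^>]*>[\s\S]*?<\/script>/gi, '')
    // Drop quoted style="..." attributes; id, role, aria-*, data-* etc. are untouched.
    .replace(/\sstyle\s*=\s*("[^"]*"|'[^']*')/gi, '');
}

const raw =
  '<div style="color:red" id="a"><style>.x{}</style>' +
  '<script>alert(1)</script><script src="a.js"></script>Hi</div>';
console.log(cleanHtmlSketch(raw));
```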

Storage state

For Playwright, captureSnapshot calls helper.grabStorageState(). For Puppeteer / WebDriver, it falls back to helper.grabCookie() plus an executeScript walking window.localStorage. Both produce the same shape ({ cookies: [...], origins: [{ origin, localStorage: [...] }] }).
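
The common shape can be illustrated with a small normalizer for the fallback path. The function `toStorageState` is hypothetical, and the `{ name, value }` entry format is assumed to follow Playwright's storage-state convention.

```javascript
// Hypothetical normalizer: packs cookies plus a localStorage dump into the
// { cookies, origins } shape described above.
function toStorageState(cookies, origin, localStorageEntries) {
  return {
    cookies,
    origins: [
      {
        origin,
        // Assumed entry format: one { name, value } pair per localStorage key.
        localStorage: Object.entries(localStorageEntries).map(([name, value]) => ({ name, value })),
      },
    ],
  };
}

const state = toStorageState(
  [{ name: 'sid', value: 'abc' }],
  'http://localhost:8000',
  { theme: 'dark' },
);
console.log(JSON.stringify(state, null, 2));
```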

Storage capture is enabled for run_code, snapshot, run_step_by_step fallback, and pageInfo. Disabled per-step in aiTrace — cookies / localStorage rarely change between actions, and per-step files would just be noise.

Architecture

  • In-process. No subprocess, no IPC. The MCP tool calls and the running test share one container, one helper, one browser.
  • Synthetic test scope. On first init the server emits suite.before + test.before and calls each helper's _beforeSuite + _before, so run_code / snapshot have a live helper.page to act on.
  • Mocha is reused. cleanReferencesAfterRun is forced to false (Mocha 11's constructor ignores the option, so the setter is called explicitly). stop_browser closes the browser but keeps Mocha alive — re-running run_test after start_browser works without ESM cache invalidation tricks.
  • Locking. run_test / run_step_by_step use a single-call lock so concurrent runs can't trample each other.
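
A single-call lock of the kind mentioned in the last point could be as small as the sketch below. Whether the real server rejects a concurrent call or queues it is not stated here; this version rejects, and `createRunLock` is a hypothetical name.

```javascript
// Hypothetical single-call lock: a second call while one is running is rejected.
function createRunLock() {
  let busy = false;
  return async function withLock(fn) {
    if (busy) throw new Error('Another run is already in progress');
    busy = true;
    try {
      return await fn();
    } finally {
      busy = false; // always released, even when fn throws
    }
  };
}

const withLock = createRunLock();
withLock(async () => 'done').then(console.log);
```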

Troubleshooting

Server doesn't start

  • Node 18+ recommended.
  • Verify the path / npx resolution in your client config.

Config not found

  • Set CODECEPTJS_CONFIG to the absolute path of codecept.conf.js (or .cjs).
  • Set CODECEPTJS_PROJECT_DIR if your config lives outside cwd.

Tests not found

  • Confirm the project's tests: glob in codecept.conf.js matches your files.
  • list_tests runs from the same project. If it returns an empty tests array (count: 0), the config is the issue, not MCP.

Browser launch issues

  • Playwright requires its browsers installed (npx playwright install).
  • For visible runs use start_browser with plugins={ browser: { show: true } } — the default is headless.

Tests stuck or timing out

  • Bump timeout per call.
  • Check that the app under test is actually reachable.
  • For long re-renders that confuse snapshot / run_code's artifact capture, raise settleMs (default 300).

Security

  • The MCP server runs with the same permissions as the calling process.
  • run_code runs arbitrary JavaScript in the project context — only expose to trusted agents / environments.
  • Environment variables may contain absolute project paths; treat them like any other config.

Contributing

When changing the MCP server:

  1. Add coverage in test/mcp/mcp_server_test.js.
  2. Update this doc with new tools / parameters.
  3. Verify against a real project (e.g. the examples/playwright/ setup) — the in-process recorder + lifecycle integration is sensitive to ordering.
  4. Test with both Playwright and Puppeteer.

License

MIT