docs/plans/2026-02-24-e2e-infrastructure.md
For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
Goal: Build a Python + Playwright E2E testing framework that exercises the IronClaw web gateway through a real browser against the real binary with a mock LLM backend.
Architecture: pytest session fixtures start a mock OpenAI-compat HTTP server and the ironclaw binary (libSQL in-memory, gateway enabled), then per-test Playwright browser instances navigate to the gateway and make DOM assertions.
Tech Stack: Python 3.11+, pytest, pytest-asyncio, playwright, aiohttp
Design doc: docs/plans/2026-02-24-e2e-infrastructure-design.md
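Process topology at a glance (a sketch; the ports are the fixed values chosen in `conftest.py` below):

```text
pytest session
 ├── mock_llm.py ............ OpenAI-compatible HTTP server (127.0.0.1:18199)
 ├── ironclaw binary ........ web gateway (127.0.0.1:18200), LLM_BASE_URL -> mock
 └── per test: Playwright Chromium -> http://127.0.0.1:18200/?token=<AUTH_TOKEN>
```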
## Task 1: Project scaffolding

Files:
- `tests/e2e/pyproject.toml`
- `tests/e2e/scenarios/__init__.py`

**Step 1: Create pyproject.toml**

```toml
[project]
name = "ironclaw-e2e"
version = "0.1.0"
requires-python = ">=3.11"
dependencies = [
    "pytest>=8.0",
    "pytest-asyncio>=0.23",
    "pytest-timeout>=2.3",  # provides the `timeout` ini option and --timeout flag used below
    "pytest-playwright>=0.5",
    "playwright>=1.40",
    "aiohttp>=3.9",
    "httpx>=0.27",
]

[project.optional-dependencies]
vision = [
    "anthropic>=0.40",
]

[tool.pytest.ini_options]
asyncio_mode = "auto"
timeout = 120
```
**Step 2: Create empty `__init__.py`**

Create `tests/e2e/scenarios/__init__.py` as an empty file.
**Step 3: Verify install works**

Run:

```bash
cd tests/e2e && pip install -e . && playwright install chromium
```

Expected: clean install, no errors.
**Step 4: Commit**

```bash
git add tests/e2e/pyproject.toml tests/e2e/scenarios/__init__.py
git commit -m "scaffold: E2E test project with pyproject.toml"
```
## Task 2: Mock OpenAI-compat server

Files:
- `tests/e2e/mock_llm.py`

**Step 1: Write the mock LLM server**

The server must:
- bind 127.0.0.1, with the port passed via a `--port` CLI arg (default 0 for OS-assigned)
- print `MOCK_LLM_PORT={port}` to stdout on startup (for the fixture to parse)
- handle `POST /v1/chat/completions` in both streaming and non-streaming modes
- handle `GET /v1/models` for health checks
- support `stream: true` with proper SSE chunk format (critical for IronClaw's streaming)

```python
"""Mock OpenAI-compatible LLM server for E2E tests."""
import argparse
import asyncio
import json
import re
import time
import uuid

from aiohttp import web

CANNED_RESPONSES = [
    (re.compile(r"hello|hi|hey", re.IGNORECASE), "Hello! How can I help you today?"),
    (re.compile(r"2\s*\+\s*2|two plus two", re.IGNORECASE), "The answer is 4."),
    (re.compile(r"skill|install", re.IGNORECASE), "I can help you with skills management."),
]
DEFAULT_RESPONSE = "I understand your request."


def match_response(messages: list[dict]) -> str:
    """Find canned response for the last user message."""
    for msg in reversed(messages):
        if msg.get("role") == "user":
            content = msg.get("content", "")
            # Handle content that may be a list (multi-modal)
            if isinstance(content, list):
                content = " ".join(
                    part.get("text", "") for part in content if part.get("type") == "text"
                )
            for pattern, response in CANNED_RESPONSES:
                if pattern.search(content):
                    return response
            return DEFAULT_RESPONSE
    return DEFAULT_RESPONSE


async def chat_completions(request: web.Request) -> web.StreamResponse:
    """Handle POST /v1/chat/completions."""
    body = await request.json()
    messages = body.get("messages", [])
    stream = body.get("stream", False)
    response_text = match_response(messages)
    completion_id = f"mock-{uuid.uuid4().hex[:8]}"
    if not stream:
        completion_tokens = len(response_text.split())
        return web.json_response({
            "id": completion_id,
            "object": "chat.completion",
            "created": int(time.time()),
            "model": "mock-model",
            "choices": [{
                "index": 0,
                "message": {"role": "assistant", "content": response_text},
                "finish_reason": "stop",
            }],
            "usage": {
                "prompt_tokens": 10,
                "completion_tokens": completion_tokens,
                "total_tokens": 10 + completion_tokens,
            },
        })
    # Streaming response: split into word-boundary chunks
    resp = web.StreamResponse(
        status=200,
        headers={"Content-Type": "text/event-stream", "Cache-Control": "no-cache"},
    )
    await resp.prepare(request)
    # First chunk: role
    chunk = {
        "id": completion_id,
        "object": "chat.completion.chunk",
        "created": int(time.time()),
        "model": "mock-model",
        "choices": [{"index": 0, "delta": {"role": "assistant", "content": ""}, "finish_reason": None}],
    }
    await resp.write(f"data: {json.dumps(chunk)}\n\n".encode())
    # Content chunks: split on spaces
    words = response_text.split(" ")
    for i, word in enumerate(words):
        text = word if i == 0 else f" {word}"
        chunk["choices"][0]["delta"] = {"content": text}
        await resp.write(f"data: {json.dumps(chunk)}\n\n".encode())
    # Final chunk: finish_reason
    chunk["choices"][0]["delta"] = {}
    chunk["choices"][0]["finish_reason"] = "stop"
    await resp.write(f"data: {json.dumps(chunk)}\n\n".encode())
    await resp.write(b"data: [DONE]\n\n")
    return resp


async def models(_request: web.Request) -> web.Response:
    """Handle GET /v1/models."""
    return web.json_response({
        "object": "list",
        "data": [{"id": "mock-model", "object": "model", "owned_by": "test"}],
    })


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--port", type=int, default=0)
    args = parser.parse_args()
    app = web.Application()
    app.router.add_post("/v1/chat/completions", chat_completions)
    app.router.add_get("/v1/models", models)

    # Use aiohttp's runner so we can read back the actual bound port
    async def start():
        runner = web.AppRunner(app)
        await runner.setup()
        site = web.TCPSite(runner, "127.0.0.1", args.port)
        await site.start()
        # Extract the actual port from the bound socket
        port = site._server.sockets[0].getsockname()[1]
        print(f"MOCK_LLM_PORT={port}", flush=True)
        # Block forever
        await asyncio.Event().wait()

    asyncio.run(start())


if __name__ == "__main__":
    main()
```
**Step 2: Verify it starts and responds**

Run:

```bash
python tests/e2e/mock_llm.py --port 18080 &
curl -s http://127.0.0.1:18080/v1/models | python -m json.tool
curl -s -X POST http://127.0.0.1:18080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"What is 2+2?"}],"model":"mock"}'
kill %1
```

Expected: the models endpoint returns `{"data": [{"id": "mock-model", ...}]}`, and the chat response contains "4".
**Step 3: Verify streaming**

```bash
python tests/e2e/mock_llm.py --port 18080 &
curl -sN -X POST http://127.0.0.1:18080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Hello"}],"model":"mock","stream":true}'
kill %1
```

Expected: SSE chunks ending with `data: [DONE]`.
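For reference, given the chunking logic above, the stream should look roughly like this (IDs and timestamps will differ; lines abridged with `...`):

```text
data: {"id": "mock-abc12345", "object": "chat.completion.chunk", ..., "choices": [{"index": 0, "delta": {"role": "assistant", "content": ""}, "finish_reason": null}]}
data: {..., "choices": [{"index": 0, "delta": {"content": "Hello!"}, "finish_reason": null}]}
data: {..., "choices": [{"index": 0, "delta": {"content": " How"}, "finish_reason": null}]}
...
data: {..., "choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}]}
data: [DONE]
```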
**Step 4: Commit**

```bash
git add tests/e2e/mock_llm.py
git commit -m "feat: mock OpenAI-compat LLM server for E2E tests"
```
## Task 3: Selectors and utilities

Files:
- `tests/e2e/helpers.py`

**Step 1: Write helpers**

```python
"""Shared helpers for E2E tests."""
import asyncio
import re
import time

import httpx

# ── DOM Selectors ────────────────────────────────────────────────────────
# Keep all selectors in one place so changes to the frontend only need
# one update.
SEL = {
    # Auth
    "auth_screen": "#auth-screen",
    "token_input": "#token-input",
    # Connection
    "sse_status": "#sse-status",
    # Tabs
    "tab_button": '.tab-bar button[data-tab="{tab}"]',
    "tab_panel": "#tab-{tab}",
    # Chat
    "chat_input": "#chat-input",
    "chat_messages": "#chat-messages",
    "message_user": "#chat-messages .message.user",
    "message_assistant": "#chat-messages .message.assistant",
    # Skills
    "skill_search_input": "#skill-search-input",
    "skill_search_results": "#skill-search-results",
    "skill_search_result": ".skill-search-result",
    "skill_installed": "#installed-skills .ext-card",
}

TABS = ["chat", "memory", "jobs", "routines", "extensions", "skills"]

# Auth token used across all tests
AUTH_TOKEN = "e2e-test-token"


async def wait_for_ready(url: str, *, timeout: float = 60, interval: float = 0.5):
    """Poll a URL until it returns 200 or timeout."""
    deadline = time.monotonic() + timeout
    async with httpx.AsyncClient() as client:
        while time.monotonic() < deadline:
            try:
                resp = await client.get(url, timeout=5)
                if resp.status_code == 200:
                    return
            except (httpx.ConnectError, httpx.ReadError, httpx.TimeoutException):
                pass
            await asyncio.sleep(interval)
    raise TimeoutError(f"Service at {url} not ready after {timeout}s")


async def wait_for_port_line(process, pattern: str, *, timeout: float = 60) -> int:
    """Read process stdout line by line until a port-bearing line matches."""
    deadline = time.monotonic() + timeout
    while (remaining := deadline - time.monotonic()) > 0:
        try:
            line = await asyncio.wait_for(process.stdout.readline(), timeout=remaining)
        except asyncio.TimeoutError:
            break
        decoded = line.decode("utf-8", errors="replace").strip()
        if match := re.search(pattern, decoded):
            return int(match.group(1))
    raise TimeoutError(f"Port pattern '{pattern}' not found in stdout after {timeout}s")
```
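The tab selectors are templated strings that call sites fill in per tab; a quick illustration of the pattern (values follow directly from the `SEL` dict above):

```python
from helpers import SEL

SEL["tab_button"].format(tab="chat")    # -> '.tab-bar button[data-tab="chat"]'
SEL["tab_panel"].format(tab="skills")   # -> '#tab-skills'
```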
**Step 2: Commit**

```bash
git add tests/e2e/helpers.py
git commit -m "feat: E2E helpers with DOM selectors and port discovery"
```
## Task 4: pytest fixtures

Files:
- `tests/e2e/conftest.py`

**Step 1: Write the fixtures**

Key details from codebase research:
- The binary logs `Web UI: http://{host}:{port}/` to stdout (main.rs:508) using the configured port, not the bound port, so we must use a fixed port rather than port 0.
- `GET /api/health` is public (no auth required).
- The frontend auto-auth flow reads a `?token=` query parameter.
- The UI hides `#auth-screen` once the token is valid and SSE connects.

```python
"""pytest fixtures for E2E tests.

Session-scoped: build binary, start mock LLM, start ironclaw.
Function-scoped: fresh Playwright browser page per test.
"""
import asyncio
import os
import signal
import subprocess
import sys
from pathlib import Path

import pytest

from helpers import AUTH_TOKEN, wait_for_port_line, wait_for_ready

# Project root (two levels up from tests/e2e/)
ROOT = Path(__file__).resolve().parent.parent.parent

# Ports: use high fixed ports to avoid conflicts with development instances
MOCK_LLM_PORT = 18_199
GATEWAY_PORT = 18_200


@pytest.fixture(scope="session")
def ironclaw_binary():
    """Ensure ironclaw binary is built. Returns the binary path."""
    binary = ROOT / "target" / "debug" / "ironclaw"
    if not binary.exists():
        print("Building ironclaw (this may take a while)...")
        subprocess.run(
            ["cargo", "build", "--no-default-features", "--features", "libsql"],
            cwd=ROOT,
            check=True,
            timeout=600,
        )
    assert binary.exists(), f"Binary not found at {binary}"
    return str(binary)


@pytest.fixture(scope="session")
def event_loop():
    """Create a session-scoped event loop for async fixtures."""
    loop = asyncio.new_event_loop()
    yield loop
    loop.close()


@pytest.fixture(scope="session")
async def mock_llm_server():
    """Start the mock LLM server. Yields the base URL."""
    server_script = Path(__file__).parent / "mock_llm.py"
    proc = await asyncio.create_subprocess_exec(
        sys.executable, str(server_script), "--port", str(MOCK_LLM_PORT),
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        port = await wait_for_port_line(proc, r"MOCK_LLM_PORT=(\d+)", timeout=10)
        url = f"http://127.0.0.1:{port}"
        await wait_for_ready(f"{url}/v1/models", timeout=10)
        yield url
    finally:
        proc.send_signal(signal.SIGTERM)
        try:
            await asyncio.wait_for(proc.wait(), timeout=5)
        except asyncio.TimeoutError:
            proc.kill()


@pytest.fixture(scope="session")
async def ironclaw_server(ironclaw_binary, mock_llm_server):
    """Start the ironclaw gateway. Yields the base URL."""
    env = {
        **os.environ,
        "RUST_LOG": "ironclaw=info",
        "GATEWAY_ENABLED": "true",
        "GATEWAY_HOST": "127.0.0.1",
        "GATEWAY_PORT": str(GATEWAY_PORT),
        "GATEWAY_AUTH_TOKEN": AUTH_TOKEN,
        "GATEWAY_USER_ID": "e2e-tester",
        "CLI_ENABLED": "false",
        "LLM_BACKEND": "openai_compatible",
        "LLM_BASE_URL": mock_llm_server,
        "LLM_MODEL": "mock-model",
        "DATABASE_BACKEND": "libsql",
        "LIBSQL_PATH": ":memory:",
        "SANDBOX_ENABLED": "false",
        "SKILLS_ENABLED": "true",
        "ROUTINES_ENABLED": "false",
        "HEARTBEAT_ENABLED": "false",
        "EMBEDDING_ENABLED": "false",
        # Prevent onboarding wizard from triggering
        "ONBOARD_COMPLETED": "true",
    }
    proc = await asyncio.create_subprocess_exec(
        ironclaw_binary,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
        env=env,
    )
    base_url = f"http://127.0.0.1:{GATEWAY_PORT}"
    try:
        await wait_for_ready(f"{base_url}/api/health", timeout=60)
        yield base_url
    finally:
        proc.send_signal(signal.SIGTERM)
        try:
            await asyncio.wait_for(proc.wait(), timeout=5)
        except asyncio.TimeoutError:
            proc.kill()


@pytest.fixture
async def page(ironclaw_server):
    """Fresh Playwright browser page, navigated to the gateway with auth."""
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        # HEADED=1 runs a visible browser for local debugging (see README)
        browser = await p.chromium.launch(headless=not os.environ.get("HEADED"))
        context = await browser.new_context(viewport={"width": 1280, "height": 720})
        pg = await context.new_page()
        await pg.goto(f"{ironclaw_server}/?token={AUTH_TOKEN}")
        # Wait for the app to initialize (auth screen hidden, SSE connected)
        await pg.wait_for_selector("#auth-screen", state="hidden", timeout=15000)
        yield pg
        await context.close()
        await browser.close()
```
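The CI workflow in Task 8 uploads `tests/e2e/screenshots/` on failure, but nothing in the plan writes to that directory yet. A minimal sketch of a failure-screenshot hook that could be added to `conftest.py` (the hook wiring, helper name, and directory are additions, not existing code; the `page` fixture would need to take `request` and call the helper after `yield pg`, before closing the context):

```python
import pytest
from pathlib import Path

SCREENSHOT_DIR = Path(__file__).parent / "screenshots"


@pytest.hookimpl(hookwrapper=True)
def pytest_runtest_makereport(item, call):
    """Attach each phase's report to the item so fixtures can see failures."""
    outcome = yield
    setattr(item, f"rep_{call.when}", outcome.get_result())


async def capture_on_failure(request, pg):
    """Screenshot the page if the test body failed (call from the page fixture)."""
    rep = getattr(request.node, "rep_call", None)
    if rep is not None and rep.failed:
        SCREENSHOT_DIR.mkdir(exist_ok=True)
        await pg.screenshot(path=SCREENSHOT_DIR / f"{request.node.name}.png")
```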
**Step 2: Commit**

```bash
git add tests/e2e/conftest.py
git commit -m "feat: E2E conftest with session fixtures for mock LLM and ironclaw"
```
## Task 5: Scenario 1 -- connection and tab navigation

Files:
- `tests/e2e/scenarios/test_connection.py`

**Step 1: Write the test**

```python
"""Scenario 1: Connection, auth, and tab navigation."""
from helpers import SEL, TABS


async def test_page_loads_and_connects(page):
    """After auth, the app shows Connected status and all tabs."""
    # Connection status
    status = page.locator(SEL["sse_status"])
    await status.wait_for(state="visible", timeout=10000)
    text = await status.text_content()
    assert text is not None
    assert "connect" in text.lower(), f"Expected 'Connected', got '{text}'"
    # All 6 main tabs visible
    for tab in TABS:
        btn = page.locator(SEL["tab_button"].format(tab=tab))
        assert await btn.is_visible(), f"Tab button '{tab}' not visible"


async def test_tab_navigation(page):
    """Clicking each tab shows its panel."""
    for tab in TABS:
        btn = page.locator(SEL["tab_button"].format(tab=tab))
        await btn.click()
        panel = page.locator(SEL["tab_panel"].format(tab=tab))
        await panel.wait_for(state="visible", timeout=5000)
    # Return to Chat tab
    await page.locator(SEL["tab_button"].format(tab="chat")).click()
    chat_input = page.locator(SEL["chat_input"])
    await chat_input.wait_for(state="visible", timeout=5000)


async def test_auth_rejection(page, ironclaw_server):
    """Navigating without a token shows the auth screen."""
    # Open a new page without the token
    new_page = await page.context.new_page()
    await new_page.goto(ironclaw_server)
    auth_screen = new_page.locator(SEL["auth_screen"])
    await auth_screen.wait_for(state="visible", timeout=10000)
    await new_page.close()
```
**Step 2: Verify test runs (may fail if ironclaw isn't built yet -- that's OK)**

```bash
cd tests/e2e && python -m pytest scenarios/test_connection.py -v --timeout=120
```

Expected: tests pass if ironclaw is built, or skip/fail gracefully if not.

**Step 3: Commit**

```bash
git add tests/e2e/scenarios/test_connection.py
git commit -m "feat: E2E scenario 1 -- connection and tab navigation tests"
```
## Task 6: Scenario 2 -- chat round-trip

Files:
- `tests/e2e/scenarios/test_chat.py`

**Step 1: Write the test**

```python
"""Scenario 2: Chat message round-trip via SSE streaming."""
from helpers import SEL


async def test_send_message_and_receive_response(page):
    """Type a message, receive a streamed response from mock LLM."""
    chat_input = page.locator(SEL["chat_input"])
    await chat_input.wait_for(state="visible", timeout=5000)
    # Send message
    await chat_input.fill("What is 2+2?")
    await chat_input.press("Enter")
    # Wait for assistant response
    assistant_msg = page.locator(SEL["message_assistant"]).last
    await assistant_msg.wait_for(state="visible", timeout=15000)
    # Verify user message
    user_msgs = page.locator(SEL["message_user"])
    assert await user_msgs.count() >= 1
    last_user = user_msgs.last
    user_text = await last_user.text_content()
    assert "2+2" in user_text or "2 + 2" in user_text
    # Verify assistant response contains "4" (from mock LLM canned response)
    assistant_text = await assistant_msg.text_content()
    assert "4" in assistant_text, f"Expected '4' in response, got: '{assistant_text}'"


async def test_multiple_messages(page):
    """Send two messages, verify both get responses."""
    chat_input = page.locator(SEL["chat_input"])
    await chat_input.wait_for(state="visible", timeout=5000)
    # First message
    await chat_input.fill("Hello")
    await chat_input.press("Enter")
    # Wait for first response
    await page.locator(SEL["message_assistant"]).first.wait_for(
        state="visible", timeout=15000
    )
    # Second message
    await chat_input.fill("What is 2+2?")
    await chat_input.press("Enter")
    # Wait for second response (at least 2 assistant messages)
    await page.wait_for_function(
        """() => document.querySelectorAll('#chat-messages .message.assistant').length >= 2""",
        timeout=15000,
    )
    # Verify counts
    user_count = await page.locator(SEL["message_user"]).count()
    assistant_count = await page.locator(SEL["message_assistant"]).count()
    assert user_count >= 2, f"Expected >= 2 user messages, got {user_count}"
    assert assistant_count >= 2, f"Expected >= 2 assistant messages, got {assistant_count}"


async def test_empty_message_not_sent(page):
    """Pressing Enter with empty input should not create a message."""
    chat_input = page.locator(SEL["chat_input"])
    await chat_input.wait_for(state="visible", timeout=5000)
    initial_count = await page.locator(
        f"{SEL['message_user']}, {SEL['message_assistant']}"
    ).count()
    # Press Enter with empty input
    await chat_input.press("Enter")
    # Wait a moment and verify no new messages
    await page.wait_for_timeout(2000)
    final_count = await page.locator(
        f"{SEL['message_user']}, {SEL['message_assistant']}"
    ).count()
    assert final_count == initial_count, "Empty message should not create new messages"
```
**Step 2: Commit**

```bash
git add tests/e2e/scenarios/test_chat.py
git commit -m "feat: E2E scenario 2 -- chat message round-trip tests"
```
## Task 7: Scenario 3 -- skills lifecycle

Files:
- `tests/e2e/scenarios/test_skills.py`

**Step 1: Write the test**

Note: these tests depend on ClawHub being reachable. They call `pytest.skip()` at runtime when the registry is unreachable or returns no results.

```python
"""Scenario 3: Skills search, install, and remove lifecycle."""
import pytest

from helpers import SEL


async def test_skills_tab_visible(page):
    """Skills tab shows the search interface."""
    await page.locator(SEL["tab_button"].format(tab="skills")).click()
    panel = page.locator(SEL["tab_panel"].format(tab="skills"))
    await panel.wait_for(state="visible", timeout=5000)
    search_input = page.locator(SEL["skill_search_input"])
    assert await search_input.is_visible(), "Skills search input not visible"


async def test_skills_search(page):
    """Search ClawHub for skills and verify results appear."""
    await page.locator(SEL["tab_button"].format(tab="skills")).click()
    search_input = page.locator(SEL["skill_search_input"])
    await search_input.fill("markdown")
    await search_input.press("Enter")
    # Wait for results (ClawHub may be slow)
    results = page.locator(SEL["skill_search_result"])
    try:
        await results.first.wait_for(state="visible", timeout=20000)
    except Exception:
        pytest.skip("ClawHub registry unreachable or returned no results")
    count = await results.count()
    assert count >= 1, "Expected at least 1 search result"


async def test_skills_install_and_remove(page):
    """Install a skill from search results, then remove it."""
    await page.locator(SEL["tab_button"].format(tab="skills")).click()
    # Search
    search_input = page.locator(SEL["skill_search_input"])
    await search_input.fill("markdown")
    await search_input.press("Enter")
    results = page.locator(SEL["skill_search_result"])
    try:
        await results.first.wait_for(state="visible", timeout=20000)
    except Exception:
        pytest.skip("ClawHub registry unreachable or returned no results")
    # Auto-accept confirm dialogs
    await page.evaluate("window.confirm = () => true")
    # Install first result
    install_btn = results.first.locator("button", has_text="Install")
    if await install_btn.count() == 0:
        pytest.skip("No installable skills found in results")
    await install_btn.click()
    # Wait for install to complete (installed list updates);
    # the UI should show the skill in the installed section
    await page.wait_for_timeout(5000)
    # Check if any installed skills exist now
    installed = page.locator(SEL["skill_installed"])
    installed_count = await installed.count()
    if installed_count == 0:
        # Try waiting longer
        await page.wait_for_timeout(5000)
        installed_count = await installed.count()
    assert installed_count >= 1, "Skill should appear in installed list after install"
    # Remove the skill
    remove_btn = installed.first.locator("button", has_text="Remove")
    if await remove_btn.count() > 0:
        await remove_btn.click()
        await page.wait_for_timeout(3000)
        # Verify removed
        new_count = await page.locator(SEL["skill_installed"]).count()
        assert new_count < installed_count, "Skill should be removed from installed list"
```
**Step 2: Commit**

```bash
git add tests/e2e/scenarios/test_skills.py
git commit -m "feat: E2E scenario 3 -- skills search, install, remove tests"
```
## Task 8: CI workflow

Files:
- `.github/workflows/e2e.yml`

**Step 1: Write the workflow**

```yaml
name: E2E Tests

on:
  schedule:
    - cron: "0 6 * * 1" # Weekly Monday 6 AM UTC
  workflow_dispatch:
  pull_request:
    paths:
      - "src/channels/web/**"
      - "tests/e2e/**"

jobs:
  e2e:
    name: Browser E2E
    runs-on: ubuntu-latest
    timeout-minutes: 30
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
      - uses: actions/cache@v4
        with:
          path: |
            target
            ~/.cargo/registry
          key: e2e-${{ runner.os }}-${{ hashFiles('Cargo.lock') }}
      - name: Build ironclaw (libsql)
        run: cargo build --no-default-features --features libsql
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Install E2E dependencies
        run: |
          cd tests/e2e
          pip install -e .
          playwright install --with-deps chromium
      - name: Run E2E tests
        run: pytest tests/e2e/ -v --timeout=120
      - name: Upload screenshots on failure
        if: failure()
        uses: actions/upload-artifact@v4
        with:
          name: e2e-screenshots
          path: tests/e2e/screenshots/
          if-no-files-found: ignore
```
**Step 2: Commit**

```bash
git add .github/workflows/e2e.yml
git commit -m "ci: add weekly E2E test workflow with Playwright"
```
## Task 9: Documentation

Files:
- `tests/e2e/README.md`

**Step 1: Write the README**

````markdown
# IronClaw E2E Tests

Browser-level end-to-end tests for the IronClaw web gateway using Python + Playwright.

## Prerequisites

- Python 3.11+
- Rust toolchain (for building ironclaw)
- Chromium (installed via Playwright)

## Setup

```bash
cd tests/e2e
pip install -e .
playwright install chromium
```

The tests need the ironclaw binary built with libsql support:

```bash
cargo build --no-default-features --features libsql
```

## Running

```bash
# From repo root
pytest tests/e2e/ -v

# Run a single scenario
pytest tests/e2e/scenarios/test_chat.py -v

# With visible browser (not headless)
HEADED=1 pytest tests/e2e/scenarios/test_connection.py -v
```

## Architecture

Tests start two subprocesses:

- a mock LLM (`mock_llm.py`) -- a fake OpenAI-compat server with canned responses
- the `ironclaw` binary with the web gateway enabled and an in-memory libSQL database

Then Playwright drives a headless Chromium browser against the gateway, making DOM assertions.

## Test scenarios

| File | What it tests |
|---|---|
| `test_connection.py` | Auth, tab navigation, connection status |
| `test_chat.py` | Send message, SSE streaming, response rendering |
| `test_skills.py` | ClawHub search, skill install/remove |

## Adding a new test

- Create `tests/e2e/scenarios/test_<name>.py`
- Use the `page` fixture for a fresh browser page
- Reuse selectors from `helpers.py` (update the `SEL` dict if new elements are needed)
````
**Step 2: Commit**

```bash
git add tests/e2e/README.md
git commit -m "docs: E2E test README with setup and usage instructions"
```
## Task 10: Integration run

**Step 1: Build ironclaw**

```bash
cargo build --no-default-features --features libsql
```

**Step 2: Run the full E2E suite**

```bash
pytest tests/e2e/ -v --timeout=120
```

Expected: all tests in `test_connection.py` and `test_chat.py` pass; `test_skills.py` tests pass or skip (if ClawHub is unreachable).

**Step 3: Fix any issues discovered during the run**

Common issues to watch for:
- Port conflicts: bump `MOCK_LLM_PORT` or `GATEWAY_PORT` in `conftest.py`
- Selector drift: update the `SEL` dict in `helpers.py` if frontend elements changed
- Onboarding: confirm `ONBOARD_COMPLETED=true` prevents the wizard from blocking startup

**Step 4: Final commit with any fixes**

```bash
git add -A tests/e2e/
git commit -m "fix: E2E test adjustments from integration run"
```
## Task summary

| Task | Files | Description |
|---|---|---|
| 1 | `pyproject.toml`, `__init__.py` | Project scaffolding |
| 2 | `mock_llm.py` | Mock OpenAI-compat server |
| 3 | `helpers.py` | Selectors and utilities |
| 4 | `conftest.py` | pytest fixtures |
| 5 | `test_connection.py` | Scenario 1: connection/tabs |
| 6 | `test_chat.py` | Scenario 2: chat round-trip |
| 7 | `test_skills.py` | Scenario 3: skills lifecycle |
| 8 | `e2e.yml` | CI workflow |
| 9 | `README.md` | Documentation |
| 10 | (integration run) | Verify everything works |