e2e/: live end-to-end test harness

True end-to-end coverage for the Slack bridge: send real user messages in a real Slack workspace, sample the bot's reply while it's streaming, take screenshots in the middle of long streams, and verify what landed.

Why this exists. Unit tests (under src/__tests__/) lock in the internal contracts of each module, but they don't catch issues that only surface end-to-end: an open code fence leaking through the rest of the Slack message during streaming, a mrkdwn translation that looks right in tests but renders oddly in Slack's actual client, a Block Kit limit we forgot about, a Bolt event that doesn't fire under some setting.

The catalog at e2e/cases.ts is the source of truth for what "feature-complete" means.
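
As a rough illustration of what a catalog entry might look like, here is a minimal sketch. Only `sampleIntervalMs`, `maxWaitMs`, and `screenshots` are named in this README; the `E2ECase` type, the `id` and `prompt` fields, and the `validateCase` helper are hypothetical and may not match the real `cases.ts`.

```typescript
// Hypothetical shape of one catalog entry. Only sampleIntervalMs, maxWaitMs,
// and screenshots come from this README; everything else is an assumption.
interface E2ECase {
  id: string;               // assumed: stable name used in reports
  prompt: string;           // assumed: the user message sent to the bot
  sampleIntervalMs: number; // how often the reply is sampled
  maxWaitMs: number;        // when sampling stops
  screenshots: number[];    // offsets in ms after send
}

const exampleCases: E2ECase[] = [
  {
    id: "long-code-fence",
    prompt: "Write a long TypeScript file in a single code block.",
    sampleIntervalMs: 1000,
    maxWaitMs: 60000,
    screenshots: [5000, 20000],
  },
];

// Returns a list of problems; an empty list means the case is usable.
function validateCase(c: E2ECase): string[] {
  const problems: string[] = [];
  if (c.sampleIntervalMs <= 0) {
    problems.push(`${c.id}: sampleIntervalMs must be positive`);
  }
  if (c.maxWaitMs < c.sampleIntervalMs) {
    problems.push(`${c.id}: maxWaitMs is shorter than one sample interval`);
  }
  for (const offset of c.screenshots) {
    // A screenshot scheduled after sampling ends would never be taken.
    if (offset < 0 || offset > c.maxWaitMs) {
      problems.push(`${c.id}: screenshot offset ${offset} is outside 0..${c.maxWaitMs}`);
    }
  }
  return problems;
}
```

A check like this could run before the browser starts, so a typo in an offset fails fast instead of silently skipping a screenshot.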

e2e/
├── README.md     this file
├── cases.ts      catalog of test cases (technical axes; expand liberally)
├── slack-api.ts  Slack Web API helpers (history, thread replies, sampling)
├── run.ts        harness entrypoint: sends prompts, samples, screenshots
└── results/      per-run output: screenshots + JSON report

# from packages/slack/
# one-time: log into Slack once in the playwright browser profile.
# Subsequent runs reuse that profile.
pnpm exec playwright open --browser=chromium --user-data-dir=./e2e/.chrome-profile \
https://app.slack.com/client/T05QFA4BW9X/C0B49MEJ1HQ
# then:
pnpm e2e
The runner expects .env to already contain SLACK_BOT_TOKEN (used for
polling the channel history while the bot streams). Sending the user
message happens through the playwright-driven Slack UI using Atai's
session cookies from the persistent profile.
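
To make the polling side concrete, here is a minimal sketch of reading a thread with the bot token. It calls the public Slack Web API method `conversations.replies` directly over HTTPS; the function names `buildRepliesRequest` and `fetchThreadReplies` are hypothetical and are not taken from `slack-api.ts`. It assumes a runtime with a global `fetch` (Node 18 or later).

```typescript
interface SlackMessage {
  ts: string;
  text?: string;
}

interface RepliesResponse {
  ok: boolean;
  error?: string;
  messages?: SlackMessage[];
}

// Builds the GET request for conversations.replies. The token travels in the
// Authorization header, never in the URL.
function buildRepliesRequest(token: string, channel: string, threadTs: string) {
  const params = new URLSearchParams({ channel, ts: threadTs });
  return {
    url: `https://slack.com/api/conversations.replies?${params.toString()}`,
    headers: { Authorization: `Bearer ${token}` },
  };
}

// Fetches the messages in one thread. Slack reports failures in the JSON body
// (ok: false) rather than only through the HTTP status, so check both.
async function fetchThreadReplies(
  token: string,
  channel: string,
  threadTs: string,
): Promise<SlackMessage[]> {
  const { url, headers } = buildRepliesRequest(token, channel, threadTs);
  const res = await fetch(url, { headers });
  if (!res.ok) {
    throw new Error(`conversations.replies HTTP ${res.status}`);
  }
  const body = (await res.json()) as RepliesResponse;
  if (!body.ok) {
    throw new Error(`conversations.replies failed: ${body.error ?? "unknown"}`);
  }
  return body.messages ?? [];
}
```

In the harness the token would come from `SLACK_BOT_TOKEN` in `.env`; how the real runner loads it is not shown in this README.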

For each case the harness:

1. Sends the user message through the Playwright-driven Slack UI (… /agent slash command).
2. Polls conversations.replies (or .history for DMs / flat replies) every sampleIntervalMs until maxWaitMs elapses.
3. … (isBalanced(text))
4. At each screenshots[i] offset (ms after send), takes a screenshot of the Slack thread pane via Playwright.
5. Writes results/<timestamp>/report.json and the screenshots.

… chat.update rate limits creating visible "jumps"

Edit cases.ts. The bar is low: anything you'd want to see working in
Slack belongs in the catalog. Don't be afraid of duplication with
unit tests; the unit test proves the code is internally correct, the E2E
proves Slack actually renders it that way.
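
The sampling step described earlier (poll every `sampleIntervalMs` until `maxWaitMs`, checking each sample) could be sketched roughly as below. This is an illustration under assumptions, not the harness's code: `isBalancedSketch` is a hypothetical stand-in for the real `isBalanced`, and it assumes "balanced" means every opened code fence has been closed. `sampleReply`, `Sample`, and the injected `readText` and `sleep` parameters are likewise invented for the example.

```typescript
// Three backticks, built at runtime so this example nests cleanly in a fence.
const FENCE = "`".repeat(3);

// Assumption: a sample is "balanced" when it has an even number of fence
// lines, i.e. no code block is left open mid-stream.
function isBalancedSketch(text: string): boolean {
  const fenceLines = text
    .split("\n")
    .filter((line) => line.trim().startsWith(FENCE));
  return fenceLines.length % 2 === 0;
}

interface Sample {
  elapsedMs: number; // nominal offset (tick count * interval), not wall-clock
  text: string;
  balanced: boolean;
}

// Reads the bot's reply once per interval until maxWaitMs has elapsed.
// readText and sleep are injected so the loop can be exercised without Slack.
async function sampleReply(
  readText: () => Promise<string>,
  sampleIntervalMs: number,
  maxWaitMs: number,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<Sample[]> {
  const samples: Sample[] = [];
  for (let elapsedMs = 0; elapsedMs <= maxWaitMs; elapsedMs += sampleIntervalMs) {
    const text = await readText();
    samples.push({ elapsedMs, text, balanced: isBalancedSketch(text) });
    await sleep(sampleIntervalMs);
  }
  return samples;
}
```

Any sample with `balanced: false` would point at the streaming failure mentioned at the top of this README: an open code fence leaking into the rest of the message.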