Write Tests

Tests are not production code. They are documentation — each one is a tiny executable spec that says "this system does X." A reader should grasp the intent in seconds. A failure should point to exactly one broken behavior, not leave you going through a 40-line test body.

Workflow

Follow these steps in order before writing any test code.

Decide the framework. Testing a function's return value, side effects, or module interactions → Vitest (lives under packages/<name>/test/). Testing that a real HTTP request to a running app produces the correct Sentry envelope → Playwright (lives under dev-packages/e2e-tests/test-applications/<app>/tests/).
Read 2–3 existing test files in the target test/ directory. Specifically note:
- Which vi.mock style they use (string path or import form)
- What cleanup they do in beforeEach (clearAllMocks vs restoreAllMocks)
- How they import the module under test (../../src/... vs @sentry/...)
- The describe/it nesting depth and naming style
- What setup functions are called together — does the function under test require companion initialization? (e.g., does patchRoute also need patchAppUse to work correctly?)
Match what you find. Consistency within a package matters more than idealized best practice.
Check for existing test utilities before writing helpers from scratch:
- packages/core/test/mocks/ — TestClient, getDefaultTestClientOptions, fake transports
- packages/core/test/testutils.ts — clearGlobalScope(), version gating
- dev-packages/test-utils/ — waitForTransaction, waitForError, waitForSession, getPlaywrightConfig, mock Sentry server, event proxy
- dev-packages/node-integration-tests/utils/ — createEsmAndCjsTests, assertion helpers
Identify the behaviors that matter most — edge cases, error paths, boundary conditions. Don't aim for quantity; aim for the tests that would catch real regressions.

Core principles

Fewer tests, better tests

The goal is not to maximize test count. A large suite of shallow happy-path tests gives a false sense of coverage — they pass on every change, including changes that introduce bugs. A smaller suite that targets edge cases, error paths, and boundary conditions catches far more regressions.

Before writing a test, ask: "If this test didn't exist, what bug could ship?" If you can't answer that concretely, the test probably isn't worth writing. Prioritize:

Edge cases and boundaries — the off-by-one, the empty array, the null input
Error paths — does the function fail gracefully or silently swallow the error?
Integration seams — where two modules or systems interact (e.g., middleware calling next())
Behavior that previously broke — regression tests for known bugs

Don't waste tests on: trivial getters/setters, pure delegation to well-tested libraries, TypeScript type constraints (the compiler already checks those), or re-testing the same behavior that a higher-level test already covers.

Arrange → Act → Assert

Structure every test with the AAA pattern, separated by blank lines. The whitespace makes the phases obvious — no labels or comments needed.

typescript

it('skips errors already captured by middleware', () => {
  const error = new Error('already captured');
  Object.defineProperty(error, '__sentry_captured__', { value: true });

  responseHandler(createMockContext(500, error));

  expect(mockCaptureException).not.toHaveBeenCalled();
});

One behavior, one reason to fail

Each test makes exactly one behavioral claim. Multiple expect calls are fine when they assert on different facets of the same outcome. But if you're checking two unrelated behaviors, those are two tests. No conditional logic, no branching, no try/catch — a test is a straight line.

Assert behavior, not implementation

If someone refactored the internals but the function still returned the correct result, would this test break? If yes, you're testing wiring, not behavior.

typescript

// Bad: asserts nothing meaningful
it('handles the request', async () => {
  expect(() => handler(mockReq)).not.toThrow();
});

// Good: asserts on the observable outcome
it('sets transaction name from route path', () => {
  responseHandler(createMockContext(200));

  expect(mockSetTransactionName).toHaveBeenCalledWith('GET /test');
});

Precise assertions

Default to exact matching. toMatchObject, expect.objectContaining, and expect.arrayContaining silently ignore fields that matter. This has caused real bugs to ship in this codebase.

Use toEqual by default. The same applies to toHaveBeenCalledWith — spell out every argument rather than wrapping in objectContaining. This is the single most common place where loose assertions creep in:

typescript

// Bad: silently ignores any missing or extra properties in the call
expect(startSpan).toHaveBeenCalledWith(expect.objectContaining({ name: 'middleware', op: 'middleware.hono' }));

// Good: exact match on the full argument — if the shape changes, the test catches it
expect(startSpan).toHaveBeenCalledWith({
  name: 'middleware',
  op: 'middleware.hono',
  onlyIfParent: true,
  parentSpan: fakeRootSpan,
  attributes: { 'sentry.op': 'middleware.hono', 'sentry.origin': 'auto.middleware.hono' },
});

The only valid reasons to use toMatchObject or objectContaining are: (1) the object is generated by a framework or third-party library and contains fields you don't control (timestamps, random IDs, internal framework state), or (2) the object has 10+ fields and the test only cares about 2–3 of them (in which case individual .toBe() checks on those fields are still preferred). If you wrote the object being asserted, you can spell it out — use toEqual.

When you do fall back, prefer individual .toBe() checks over objectContaining:

typescript

expect(event.transaction).toBe('GET /users/:id');
expect(event.contexts?.trace?.op).toBe('http.server');

Every toContain / toContainEqual needs a toHaveLength companion. Without it, the assertion passes even if the array has unexpected extra items:

typescript

// Bad: doesn't notice extra unexpected spans
expect(spanNames).toContain('authMiddleware');

// Good: locks down both content and count
expect(spanNames).toHaveLength(1);
expect(spanNames).toContain('authMiddleware');

Use exported constants, not magic numbers. If the code under test uses named constants like SPAN_STATUS_OK, reference those same constants in assertions. If the constant's value ever changes, tests using magic numbers silently pass with wrong expectations.

Naming

Names should be concise, descriptive, and read as correct English. Lead with the verb.

Quality	Example
Good	`'captures error when context.error is set'`
Good	`'does not re-capture errors already captured by wrapMiddlewareWithSpan'`
Good	`'returns empty array when no items match'`
Bad	`'should correctly return the formatted price string when given a valid positive number'`
Bad	`'test error handling'` / `'works correctly'`

Drop "should" — it adds words without adding meaning.

Input quality

Use realistic data

typescript

// Weak
const url = 'http://test';

// Strong — exercises URL parsing, path handling, query strings
const url = 'https://api.example.com/users/42?include=profile&format=json';

Boundary Value Analysis

If the valid range is 1–100, test -2, -1, 0, 1, 2, 99, 100, 101, Number.POSITIVE_INFINITY. Bugs cluster at boundaries — off-by-one errors, inclusive/exclusive confusion, type coercion.

Test the unhappy path as hard as the happy path

Empty inputs: '', [], {}, undefined, null
Falsy-but-valid: 0, false, '', NaN — these trip up loose truthiness checks
Error conditions: network failure, malformed input, missing required fields, timeout
Concurrency: what if called twice simultaneously? What if called after cleanup?

Each edge case gets its own test with a descriptive name.

Writing Vitest tests

File structure

Name test files *.test.ts, mirroring the source path: src/shared/patchRoute.ts → test/shared/patchRoute.test.ts.
Import the module under test from its source path (../../src/...). But when importing from a different package (e.g., @sentry/core in a @sentry/node test), use the package name — that's a real dependency, not the code under test.
For browser-environment tests: /** @vitest-environment jsdom */ at top of file.

Mocking

Prefer spies and stubs over full module mocks. A spy observes behavior without replacing the system under test. A full mock replaces it — and now you're testing your mock, not your code.

typescript

const warnSpy = vi.spyOn(SentryCore.debug, 'warn');
sentry(app);
expect(warnSpy).toHaveBeenCalledWith(expect.stringContaining('not initialized'));

When you need vi.mock: If the package's existing tests use string paths (vi.mock('../../src/utils')), match that style. If you're creating the first test file for a package, prefer the import form for type safety:

typescript

vi.mock(import('../../src/utils'), async importOriginal => {
  const actual = await importOriginal();
  return { ...actual, helperFn: vi.fn() };
});

Always restore mocks. This repo does not set restoreMocks: true globally — you are responsible for cleanup. Leaked mocks cause mysterious failures in unrelated tests. Use whatever cleanup the existing tests in your package use. If creating the first test file, use:

typescript

beforeEach(() => {
  vi.restoreAllMocks();
});

Error testing

Use the library's built-in matchers. Never use try/catch in tests.

typescript

expect(() => parseConfig(null)).toThrow('config is required');
await expect(asyncOp()).rejects.toThrow(TimeoutError);

For async callbacks where you need to verify an assertion actually ran, use expect.assertions(n).

Parameterized tests (Vitest)

Use it.each or it.for for data-driven cases — not raw for loops. it.each integrates with the test runner, gives better output, and each case runs independently.

typescript

it.each([
  { input: 0, expected: 'zero' },
  { input: 1, expected: 'one' },
  { input: -1, expected: 'negative' },
])('classifies $input as $expected', ({ input, expected }) => {
  expect(classify(input)).toBe(expected);
});

Test isolation

Tests must never depend on execution order or share mutable state. For this codebase, many tests need to reset global Sentry state:

typescript

beforeEach(() => {
  clearGlobalScope();
  getCurrentScope().clear();
  getIsolationScope().clear();
});

Grouping

1-2 levels of describe is usually enough. Deeper nesting makes tests harder to find and read.

typescript

describe('patchRoute', () => {
  describe('sub-app middleware wrapping', () => {
    it('wraps .use() middleware handlers', async () => { ... });
    it('does not wrap sole route handlers', async () => { ... });
  });
});

Writing Playwright E2E tests

When to write E2E tests

Write E2E tests when you need to verify that the SDK correctly instruments a real application. Unit tests can't catch integration bugs between the SDK and a framework's request lifecycle. Also use the /e2e skill for running E2E tests.

File structure

Tests live in dev-packages/e2e-tests/test-applications/<app-name>/tests/*.test.ts.
Shared constants (like APP_NAME) go in tests/constants.ts.
Each test app has a playwright.config.ts using getPlaywrightConfig from @sentry-internal/test-utils.

The waitFor pattern

Set up a promise for the expected Sentry event, trigger the action, then await and assert.

typescript

test('captures transaction for GET /users/:id', async ({ baseURL }) => {
  const transactionPromise = waitForTransaction(APP_NAME, event => {
    return event.contexts?.trace?.op === 'http.server' && event.transaction === 'GET /users/:id';
  });

  const response = await fetch(`${baseURL}/users/42`);
  expect(response.status).toBe(200);

  const transaction = await transactionPromise;
  expect(transaction.contexts?.trace?.op).toBe('http.server');
  expect(transaction.transaction).toBe('GET /users/:id');
});

The predicate must be specific enough to match only your event. A vague predicate can match an unrelated event from a parallel test, causing flaky passes or hangs.

Asserting on spans

Prefer asserting on the exact span count alongside individual field checks:

typescript

const spans = transaction.spans || [];
expect(spans).toHaveLength(2);

const middlewareSpan = spans.find(s => s.description === 'middlewareA');
expect(middlewareSpan?.op).toBe('middleware.hono');
expect(middlewareSpan?.origin).toBe('auto.middleware.hono');
expect(middlewareSpan?.status).toBe('ok');

Error event assertions

Check both the exception value and the mechanism. The mechanism tells you how the error was captured — that's the SDK's actual responsibility:

typescript

const errorEvent = await errorPromise;
expect(errorEvent.exception?.values?.[0]?.value).toBe('connection refused');

const mechanism = errorEvent.exception?.values?.[0]?.mechanism;
expect(mechanism?.handled).toBe(false);
expect(mechanism?.type).toBe('auto.http.hono.context_error');

Parameterized E2E tests

For Playwright tests (unlike Vitest), for...of loops are the established codebase convention. Use for...of (not .forEach()) so Playwright's test registration works correctly:

typescript

for (const { name, prefix } of SCENARIOS) {
  test.describe(name, () => {
    test('captures named middleware span', async ({ baseURL }) => {
      // ...
    });
  });
}

Common pitfalls

Proxy name mismatch: APP_NAME must match proxyServerName in start-event-proxy.mjs.
Flaky predicates: Add enough specificity (path, method, unique marker) to disambiguate.
Forgetting await: The waitFor* helpers return a promise. Without await, the test passes vacuously and the assertion never runs.

Checklist

Before you're done, verify each test against these criteria: