Visual Regression Testing

.agents/skills/playwright-best-practices/testing-patterns/visual-regression.md

Table of Contents

  1. Quick Reference
  2. Patterns
  3. Decision Guide
  4. Anti-Patterns
  5. Troubleshooting

When to use: Detecting unintended visual changes—layout shifts, style regressions, broken responsive designs—that functional assertions miss.

Quick Reference

typescript
// Element screenshot
await expect(page.getByTestId('product-card')).toHaveScreenshot();

// Full page screenshot
await expect(page).toHaveScreenshot('landing-hero.png');

// Threshold for minor pixel variance
await expect(page).toHaveScreenshot({ maxDiffPixelRatio: 0.01 });

// Mask volatile content
await expect(page).toHaveScreenshot({
  mask: [page.getByTestId('clock'), page.getByRole('img', { name: 'User photo' })],
});

// Disable CSS animations
await expect(page).toHaveScreenshot({ animations: 'disabled' });

// Update baselines (shell command, run in a terminal, not in a test file)
// npx playwright test --update-snapshots

Patterns

Masking Volatile Content

Use when: Page contains timestamps, avatars, ad slots, relative dates, random images, or A/B variants.

The mask option overlays a solid-color box (pink #FF00FF by default, configurable with maskColor) on each listed locator before capturing.

typescript
test('analytics panel with masked dynamic elements', async ({page}) => {
  await page.goto('/analytics')

  await expect(page).toHaveScreenshot('analytics.png', {
    mask: [
      page.getByTestId('last-updated'),
      page.getByTestId('profile-avatar'),
      page.getByTestId('active-users'),
      page.locator('.promo-banner'),
    ],
    maskColor: '#FF00FF',
  })
})

test('activity stream with relative times', async ({page}) => {
  await page.goto('/activity')

  await expect(page).toHaveScreenshot('activity.png', {
    mask: [page.locator('time[datetime]')],
  })
})

Alternative: freeze content with JavaScript when masking affects layout:

typescript
test('freeze timestamps before capture', async ({page}) => {
  await page.goto('/analytics')

  await page.evaluate(() => {
    document.querySelectorAll('[data-testid="time-display"]').forEach((el) => {
      el.textContent = 'Jan 1, 2025 12:00 PM'
    })
  })

  await expect(page).toHaveScreenshot('analytics-frozen.png')
})

Disabling Animations

Use when: Always. CSS animations and transitions are a common cause of flaky visual diffs.

typescript
test('renders without animation interference', async ({page}) => {
  await page.goto('/')

  await expect(page).toHaveScreenshot('home.png', {
    animations: 'disabled',
  })
})

Set globally in config:

typescript
// playwright.config.ts
import {defineConfig} from '@playwright/test'

export default defineConfig({
  expect: {
    toHaveScreenshot: {
      animations: 'disabled',
    },
  },
})

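When animations: 'disabled' is set, Playwright stops CSS animations, CSS transitions, and Web Animations before capturing: finite animations are fast-forwarded to completion, and infinite animations are reset to their initial state. Current Playwright versions already default to this for toHaveScreenshot(), so setting it explicitly mainly keeps the behavior visible in the config.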

For JavaScript-driven animations (GSAP, Framer Motion), wait for stability:

typescript
test('page with JS animations', async ({page}) => {
  await page.goto('/animated-hero')

  const heroBanner = page.getByTestId('hero-banner')
  await heroBanner.waitFor({state: 'visible'})

  // Wait for animation to complete by checking for stable state
  await expect(heroBanner).not.toHaveClass(/animating/)

  await expect(page).toHaveScreenshot('hero.png', {
    animations: 'disabled',
  })
})
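
When the animation library exposes no class or attribute to assert on, one option is to poll a value until it stops changing. The following is a minimal sketch, assuming a hypothetical helper named waitUntilStable (it is not a Playwright API) and a caller-supplied read function; the interval, match count, and timeout defaults are illustrative.

```typescript
// Hypothetical helper, not part of Playwright: resolves once `read` returns
// the same value on `requiredMatches` consecutive polls, or throws on timeout.
async function waitUntilStable(
  read: () => Promise<string>,
  {intervalMs = 100, requiredMatches = 3, timeoutMs = 5000} = {},
): Promise<string> {
  const deadline = Date.now() + timeoutMs
  let previous = await read()
  let matches = 1

  while (matches < requiredMatches) {
    if (Date.now() > deadline) {
      throw new Error(`Value still changing after ${timeoutMs}ms`)
    }
    // Pause between polls so the animation has time to advance.
    await new Promise((resolve) => setTimeout(resolve, intervalMs))
    const current = await read()
    // Reset the streak whenever the value changes.
    matches = current === previous ? matches + 1 : 1
    previous = current
  }
  return previous
}
```

Inside a test, the read function could serialize the element's geometry, for example `await waitUntilStable(async () => JSON.stringify(await heroBanner.boundingBox()))`, before calling toHaveScreenshot().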

Configuring Thresholds

Use when: Minor rendering differences from anti-aliasing, font hinting, or sub-pixel rendering cause false failures.

| Option | Controls | Typical Value |
| --- | --- | --- |
| maxDiffPixels | Absolute pixel count that can differ | 100 for pages, 10 for components |
| maxDiffPixelRatio | Fraction of total pixels (0-1) | 0.01 (1%) for pages |
| threshold | Per-pixel color tolerance (0-1) | 0.2 for most UIs, 0.1 for design systems |

typescript
test('control panel allows minor variance', async ({page}) => {
  await page.goto('/control-panel')

  await expect(page).toHaveScreenshot('control-panel.png', {
    maxDiffPixelRatio: 0.01,
  })
})

test('brand logo renders pixel-perfect', async ({page}) => {
  await page.goto('/brand')

  await expect(page.getByTestId('brand-logo')).toHaveScreenshot('brand-logo.png', {
    maxDiffPixels: 0,
    threshold: 0,
  })
})

test('graph allows anti-aliasing differences', async ({page}) => {
  await page.goto('/reports')

  await expect(page.getByTestId('sales-graph')).toHaveScreenshot('sales-graph.png', {
    threshold: 0.3,
    maxDiffPixels: 200,
  })
})

Global thresholds in config:

typescript
// playwright.config.ts
import {defineConfig} from '@playwright/test'

export default defineConfig({
  expect: {
    toHaveScreenshot: {
      maxDiffPixelRatio: 0.01,
      threshold: 0.2,
      animations: 'disabled',
    },
  },
})
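
The two pixel-count options are easier to compare once the ratio is converted to an absolute number. The arithmetic below is a sketch with a hypothetical helper name, assuming a 1280x720 viewport screenshot and a 200x80 component; it illustrates the budget only and is not how Playwright computes diffs internally.

```typescript
// Hypothetical helper: converts a maxDiffPixelRatio into the absolute number
// of pixels it allows for a screenshot of the given size.
function allowedDiffPixels(width: number, height: number, maxDiffPixelRatio: number): number {
  return Math.round(width * height * maxDiffPixelRatio)
}

const viewportBudget = allowedDiffPixels(1280, 720, 0.01)
const componentBudget = allowedDiffPixels(200, 80, 0.01)

console.log(viewportBudget) // 9216
console.log(componentBudget) // 160
```

Because the ratio scales with image size, a fullPage capture gets a much larger pixel budget than a component screenshot, while maxDiffPixels stays fixed.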

CI Configuration

Use when: Running visual tests in CI. Consistent rendering is critical—the same test must produce identical screenshots every time.

The problem: Font rendering and anti-aliasing differ across operating systems. macOS snapshots won't match Linux.

The solution: Run visual tests in Docker using the official Playwright container. Generate and update snapshots from the same container.

GitHub Actions with Docker

yaml
# .github/workflows/visual-tests.yml
name: Visual Regression Tests
on: [push, pull_request]

jobs:
  visual-tests:
    runs-on: ubuntu-latest
    container:
      image: mcr.microsoft.com/playwright:v1.48.0-noble
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: lts/*
          cache: npm

      - run: npm ci

      - name: Run visual tests
        run: npx playwright test --project=visual
        env:
          HOME: /root

      - uses: actions/upload-artifact@v4
        if: failure()
        with:
          name: visual-test-report
          path: playwright-report/
          retention-days: 14

Updating snapshots locally using Docker:

bash
docker run --rm -v $(pwd):/work -w /work \
  mcr.microsoft.com/playwright:v1.48.0-noble \
  npx playwright test --update-snapshots --project=visual

Add script to package.json:

json
{
  "scripts": {
    "test:visual": "npx playwright test --project=visual",
    "test:visual:update": "docker run --rm -v $(pwd):/work -w /work mcr.microsoft.com/playwright:v1.48.0-noble npx playwright test --update-snapshots --project=visual"
  }
}

Platform-agnostic snapshots (requires Docker for generation):

typescript
// playwright.config.ts
import {defineConfig, devices} from '@playwright/test'

export default defineConfig({
  snapshotPathTemplate: '{testDir}/{testFileDir}/{testFileName}-snapshots/{arg}{-projectName}{ext}',
  projects: [
    {
      name: 'visual',
      testMatch: '**/*.visual.spec.ts',
      use: {...devices['Desktop Chrome']},
    },
  ],
})
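
The template above omits the platform token, which is why the same snapshot path is used on every operating system. The sketch below shows how the tokens might expand for one screenshot. The expansion function and all sample values (paths, file names) are hypothetical; Playwright's real resolver supports more tokens and normalizes the resulting path.

```typescript
// Illustrative token expansion only, not Playwright's implementation.
type SnapshotTokens = {
  testDir: string
  testFileDir: string
  testFileName: string
  arg: string
  projectName: string
  ext: string
}

function expandSnapshotTemplate(template: string, tokens: SnapshotTokens): string {
  const replacements: Array<[string, string]> = [
    ['{testDir}', tokens.testDir],
    ['{testFileDir}', tokens.testFileDir],
    ['{testFileName}', tokens.testFileName],
    ['{arg}', tokens.arg],
    // The leading dash is emitted only when a project name exists.
    ['{-projectName}', tokens.projectName ? `-${tokens.projectName}` : ''],
    ['{ext}', tokens.ext],
  ]
  return replacements.reduce(
    (result, [token, value]) => result.split(token).join(value),
    template,
  )
}

const snapshotPath = expandSnapshotTemplate(
  '{testDir}/{testFileDir}/{testFileName}-snapshots/{arg}{-projectName}{ext}',
  {
    testDir: '/repo/tests',
    testFileDir: 'landing',
    testFileName: 'home.visual.spec.ts',
    arg: 'home',
    projectName: 'visual',
    ext: '.png',
  },
)

console.log(snapshotPath)
// /repo/tests/landing/home.visual.spec.ts-snapshots/home-visual.png
```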

Full Page vs Element Screenshots

Use when: Deciding scope. Full page catches layout shifts. Element screenshots isolate components and are more stable.

typescript
test('full page captures layout shifts', async ({page}) => {
  await page.goto('/')

  // Visible viewport
  await expect(page).toHaveScreenshot('home-viewport.png')

  // Entire scrollable page
  await expect(page).toHaveScreenshot('home-full.png', {
    fullPage: true,
  })
})

test('element screenshot isolates component', async ({page}) => {
  await page.goto('/catalog')

  await expect(page.getByRole('table')).toHaveScreenshot('catalog-table.png')
  await expect(page.getByTestId('featured-item')).toHaveScreenshot('featured-item.png')
})

Rule of thumb: Element screenshots for independently changing components. Full page screenshots for key layouts where spacing matters.

Responsive Visual Testing

Use when: Application has responsive breakpoints requiring verification at different viewport sizes.

typescript
const breakpoints = [
  {name: 'phone', width: 375, height: 812},
  {name: 'tablet', width: 768, height: 1024},
  {name: 'desktop', width: 1440, height: 900},
]

for (const bp of breakpoints) {
  test(`landing at ${bp.name} (${bp.width}x${bp.height})`, async ({page}) => {
    await page.setViewportSize({width: bp.width, height: bp.height})
    await page.goto('/')

    await expect(page).toHaveScreenshot(`landing-${bp.name}.png`, {
      animations: 'disabled',
      fullPage: true,
    })
  })
}

Alternative: use projects for responsive testing:

typescript
// playwright.config.ts
import {defineConfig, devices} from '@playwright/test'

export default defineConfig({
  projects: [
    {
      name: 'desktop',
      testMatch: '**/*.visual.spec.ts',
      use: {
        ...devices['Desktop Chrome'],
        viewport: {width: 1440, height: 900},
      },
    },
    {
      name: 'tablet',
      testMatch: '**/*.visual.spec.ts',
      use: {...devices['iPad (gen 7)']},
    },
    {
      name: 'mobile',
      testMatch: '**/*.visual.spec.ts',
      use: {...devices['iPhone 14']},
    },
  ],
})

Component Visual Testing

Use when: Testing individual UI components in isolation—buttons, cards, forms, modals. Faster and more stable than full-page screenshots.

typescript
test.describe('Button visual states', () => {
  test('primary button', async ({page}) => {
    await page.goto('/storybook/iframe.html?id=button--primary')
    const btn = page.getByRole('button')
    await expect(btn).toHaveScreenshot('btn-primary.png', {
      animations: 'disabled',
    })
  })

  test('primary button hover', async ({page}) => {
    await page.goto('/storybook/iframe.html?id=button--primary')
    const btn = page.getByRole('button')
    await btn.hover()
    await expect(btn).toHaveScreenshot('btn-primary-hover.png', {
      animations: 'disabled',
    })
  })

  test('button sizes', async ({page}) => {
    for (const size of ['small', 'medium', 'large']) {
      await page.goto(`/storybook/iframe.html?id=button--${size}`)
      const btn = page.getByRole('button')
      await expect(btn).toHaveScreenshot(`btn-${size}.png`, {
        animations: 'disabled',
      })
    }
  })
})

Using a dedicated test harness instead of Storybook:

typescript
test.describe('Card component', () => {
  test.beforeEach(async ({page}) => {
    await page.goto('/test-harness/card')
  })

  test('default state', async ({page}) => {
    await expect(page.getByTestId('card')).toHaveScreenshot('card-default.png', {
      animations: 'disabled',
    })
  })

  test('truncates long content', async ({page}) => {
    await page.goto('/test-harness/card?content=long')
    await expect(page.getByTestId('card')).toHaveScreenshot('card-long.png', {
      animations: 'disabled',
    })
  })
})

Updating Snapshots

Use when: Intentionally changed UI—design refresh, rebrand, new feature. Never update when diff is unexpected.

bash
# Update all snapshots
npx playwright test --update-snapshots

# Update for specific file
npx playwright test tests/landing.spec.ts --update-snapshots

# Update for specific project
npx playwright test --project=chromium --update-snapshots

Workflow for reviewing changes:

  1. Run tests and view failures in HTML report:

    bash
    npx playwright test
    npx playwright show-report
    

    The report shows expected, actual, and diff images side-by-side.

  2. If changes are intentional, update:

    bash
    npx playwright test --update-snapshots
    
  3. Review updated snapshots before committing:

    bash
    git diff --name-only
    

Tag visual tests for selective updates:

typescript
test('landing visual @visual', async ({page}) => {
  await page.goto('/')
  await expect(page).toHaveScreenshot('landing.png', {
    animations: 'disabled',
  })
})

bash
npx playwright test --grep @visual --update-snapshots

Cross-Browser Visual Testing

Use when: Users span Chrome, Firefox, Safari and you need per-browser rendering verification.

Playwright separates snapshots by project name and platform by default (for example, landing-chromium-linux.png). Each browser gets its own baseline because browsers render fonts and shadows differently.

typescript
// playwright.config.ts
import {defineConfig, devices} from '@playwright/test'

export default defineConfig({
  expect: {
    toHaveScreenshot: {
      animations: 'disabled',
      maxDiffPixelRatio: 0.01,
    },
  },
  projects: [
    {
      name: 'chromium',
      use: {...devices['Desktop Chrome']},
    },
    {
      name: 'firefox',
      use: {...devices['Desktop Firefox']},
    },
    {
      name: 'webkit',
      use: {...devices['Desktop Safari']},
    },
  ],
})

Strategy: Run visual tests in a single browser (Chromium on Linux in CI) to minimize snapshot count. Add other browsers only when you have actual cross-browser rendering bugs:

typescript
// playwright.config.ts
import {defineConfig, devices} from '@playwright/test'

export default defineConfig({
  projects: [
    {
      name: 'visual',
      testMatch: '**/*.visual.spec.ts',
      use: {...devices['Desktop Chrome']},
    },
    {
      name: 'chromium',
      testIgnore: '**/*.visual.spec.ts',
      use: {...devices['Desktop Chrome']},
    },
    {
      name: 'firefox',
      testIgnore: '**/*.visual.spec.ts',
      use: {...devices['Desktop Firefox']},
    },
  ],
})

Decision Guide

| Scenario | Approach | Rationale |
| --- | --- | --- |
| Key landing/marketing pages | Full page, fullPage: true | Catches layout shifts, spacing, overall harmony |
| Individual components | Element screenshot | Isolated, fast, immune to unrelated changes |
| Page with dynamic content | Full page + mask | Covers layout while ignoring volatile content |
| Design system library | Element per variant, zero threshold | Pixel-perfect enforcement |
| Responsive verification | Screenshot per viewport | Catches breakpoint bugs |
| Cross-browser consistency | Separate snapshots per browser | Browsers render differently |
| CI pipeline | Docker container, Linux-only snapshots | Consistent rendering |
| Threshold: design system | threshold: 0, maxDiffPixels: 0 | Zero tolerance |
| Threshold: content pages | maxDiffPixelRatio: 0.01, threshold: 0.2 | Minor anti-aliasing variance |
| Threshold: charts/graphs | maxDiffPixels: 200, threshold: 0.3 | Anti-aliasing on curves varies |
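
The three threshold rows can be collected into shared presets so that tests pick a tolerance by intent rather than repeating numbers. This is a sketch: the option names match toHaveScreenshot(), but the type and preset names are hypothetical.

```typescript
// Option names mirror toHaveScreenshot(); the preset names are hypothetical.
type ScreenshotTolerance = {
  threshold?: number
  maxDiffPixels?: number
  maxDiffPixelRatio?: number
}

type PresetName = 'designSystem' | 'contentPage' | 'chart'

const tolerancePresets: Record<PresetName, ScreenshotTolerance> = {
  // Zero tolerance for pixel-perfect components.
  designSystem: {threshold: 0, maxDiffPixels: 0},
  // Allows minor anti-aliasing variance on content pages.
  contentPage: {maxDiffPixelRatio: 0.01, threshold: 0.2},
  // Curves in charts anti-alias differently between runs.
  chart: {maxDiffPixels: 200, threshold: 0.3},
}
```

A test could then pass a preset directly, for example `await expect(page.getByTestId('sales-graph')).toHaveScreenshot('sales-graph.png', tolerancePresets.chart)`.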

Anti-Patterns

| Don't | Problem | Do Instead |
| --- | --- | --- |
| Visual test every page | Massive maintenance, constant false failures | Pick 5-10 key pages and critical components |
| Skip masking dynamic content | Screenshots differ every run, permanently flaky | Use mask for all volatile elements |
| Run across macOS, Linux, Windows | Font rendering differs, snapshots never match | Standardize on Linux via Docker |
| Skip Docker in CI | OS updates shift rendering silently | Pin specific Playwright Docker image |
| Blindly run --update-snapshots | Accepts unintentional regressions | Always review diff in HTML report first |
| Skip animations: 'disabled' | CSS transitions create random diffs | Set globally in config |
| Replace functional assertions with visual tests | Diffs don't tell you what broke | Visual tests complement, never replace |
| Commit snapshots from different platforms | Tests fail for everyone | All team members use same Docker container |
| Set maxDiffPixelRatio too high (0.1) | 10% of pixels can change and still pass, which defeats the purpose | Start with 0.01, adjust per test |
| Full page on infinite scroll pages | Page height nondeterministic | Element screenshots on above-the-fold content |

Troubleshooting

"Screenshot comparison failed" on first CI run after local development

Cause: Snapshots generated on macOS locally. CI runs on Linux. Font rendering differs.

Fix: Generate snapshots using Docker:

bash
docker run --rm -v $(pwd):/work -w /work \
  mcr.microsoft.com/playwright:v1.48.0-noble \
  npx playwright test --update-snapshots --project=visual

Commit Linux-generated snapshots.

"Expected screenshot to match but X pixels differ"

Cause: Anti-aliasing, font hinting, sub-pixel rendering differences.

Fix: Add tolerance:

typescript
await expect(page).toHaveScreenshot('page.png', {
  maxDiffPixelRatio: 0.01,
  threshold: 0.2,
})

Check HTML report diff image to determine if it's regression or noise.

Visual tests pass locally but fail in CI (even with Docker)

Cause: Different Playwright versions locally vs CI.

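Fix: Pin the @playwright/test version in package.json to the exact version in the Docker image tag. A floating value such as "latest" can resolve to a newer release than the image provides:

json
{
  "devDependencies": {
    "@playwright/test": "1.48.0"
  }
}

yaml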
container:
  image: mcr.microsoft.com/playwright:v1.48.0-noble
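
A small check can guard against the package version and the image tag drifting apart. The function below is a hypothetical sketch, assuming image tags of the form `:v<major>.<minor>.<patch>` with an optional distribution suffix; reading the two strings from package.json and the workflow file is left to the caller.

```typescript
// Hypothetical check: true only when the package version is pinned to exactly
// the version encoded in the Docker image tag (e.g. "1.48.0" vs ":v1.48.0-noble").
function imageMatchesPackageVersion(packageVersion: string, image: string): boolean {
  const match = /:v(\d+\.\d+\.\d+)(?:-|$)/.exec(image)
  return match !== null && match[1] === packageVersion
}

const image = 'mcr.microsoft.com/playwright:v1.48.0-noble'

console.log(imageMatchesPackageVersion('1.48.0', image)) // true
console.log(imageMatchesPackageVersion('latest', image)) // false
console.log(imageMatchesPackageVersion('^1.48.0', image)) // false
```

Ranges such as ^1.48.0 are deliberately treated as a mismatch here, since they may install a newer version than the one in the image.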

Animations cause random diff failures

Cause: CSS animations captured mid-frame.

Fix: Set animations: 'disabled' globally:

typescript
// playwright.config.ts
import {defineConfig} from '@playwright/test'

export default defineConfig({
  expect: {
    toHaveScreenshot: {
      animations: 'disabled',
    },
  },
})

For JS animations, wait for stable state before capture.

Snapshot file names conflict between tests

Cause: Two tests use same screenshot name without unique paths.

Fix: Use explicit unique names:

typescript
await expect(page).toHaveScreenshot('auth-home.png')
await expect(page).toHaveScreenshot('public-home.png')

Or customize snapshot path template:

typescript
import {defineConfig} from '@playwright/test'

export default defineConfig({
  snapshotPathTemplate: '{testDir}/{testFileDir}/{testFileName}-snapshots/{arg}{-projectName}{ext}',
})

Too many snapshot files to maintain

Cause: Visual tests for every page, browser, viewport.

Fix: Be selective. Visual test only high-risk pages:

  • Landing and marketing pages
  • Design system components
  • Complex layouts (dashboards, data tables)
  • Pages after major refactor

Skip pages where functional assertions cover key elements.