# Task: Review and Refine Interpreter Benchmarks

## Overview

Review the interpreter benchmarks in `packages/client-engine-runtime/bench/interpreter.bench.ts` to ensure they accurately measure the performance overhead of the query interpreter and data mapper components in isolation from database I/O.

## Current State

The interpreter benchmark suite includes:

- Mock driver adapter (`MockDriverAdapter`) that returns pre-defined results
- Query plan definitions for various operations (SELECT, findUnique, JOIN, SEQUENCE, deep nested JOIN)
- Data mapper benchmarks with varying row counts
- SQL serializer benchmarks

## Review Areas

### 1. Mock Adapter Fidelity

Review the `MockDriverAdapter` implementation:

| Aspect | Current State | Review Items |
| --- | --- | --- |
| Provider | `sqlite` | Consider testing other providers |
| Result format | Basic column/row structure | Verify it matches real adapter output |
| Transaction mock | Minimal implementation | Ensure representative overhead |
| Error handling | Not tested | Add error-path benchmarks? |

Questions to address:

- Does the mock adapter accurately represent real adapter call overhead?
- Should we parameterize provider type for comparison?
- Is the mock transaction implementation representative?
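
To make the provider question concrete, here is a minimal sketch of a provider-parameterized mock. The method names (`queryRaw`/`executeRaw`) and the result shape are illustrative assumptions, not the actual driver-adapter interface:

```ts
// Hypothetical result shape for the mock; the real adapter types differ.
interface MockResultSet {
  columnNames: string[]
  rows: unknown[][]
}

function makeMockAdapter(
  provider: 'sqlite' | 'postgres' | 'mysql',
  result: MockResultSet,
) {
  return {
    provider,
    // Keep the mock body trivial so the benchmark measures the
    // interpreter, not the mock: only the async boundary remains.
    queryRaw: async (_query: unknown): Promise<MockResultSet> => result,
    executeRaw: async (_query: unknown): Promise<number> => result.rows.length,
  }
}
```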

### 2. Query Plan Coverage

Review query plan definitions:

| Plan | Description | Review Items |
| --- | --- | --- |
| `SIMPLE_SELECT_PLAN` | Basic SELECT query | ✅ Good baseline |
| `FIND_UNIQUE_PLAN` | Single-record lookup | Check arg types match runtime |
| `JOIN_PLAN` | 1:N relationship join | Verify join structure |
| `SEQUENCE_PLAN` | Multi-step query | Add more sequence variations |
| `DEEP_JOIN_PLAN` | Nested joins (3 levels) | Good for complex queries |

Missing query plans to consider:

- Aggregate query plans
- Write operation plans (INSERT, UPDATE, DELETE)
- Plans with WHERE clauses and filtering
- Plans with ordering and pagination
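
Registering one of the missing plans could look like the sketch below, assuming the tinybench migration from task 001 has landed; `UPDATE_PLAN` and `interpret` are placeholders for the real compiler output and interpreter entry point:

```ts
import { Bench } from 'tinybench'

// Placeholders: the real plan comes from the query compiler, and the
// real entry point lives in the interpreter module.
declare const UPDATE_PLAN: unknown
declare function interpret(plan: unknown): Promise<unknown>

const bench = new Bench({ time: 500 })

bench.add('interpreter: UPDATE plan', async () => {
  await interpret(UPDATE_PLAN)
})

await bench.run()
console.table(bench.table())
```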

### 3. Data Mapper Benchmarks

Current data mapper coverage:

| Benchmark | Rows | Nesting | Review |
| --- | --- | --- | --- |
| dataMapper: 10 rows | 10 | Flat | Baseline |
| dataMapper: 50 rows | 50 | Flat | Medium |
| dataMapper: 100 rows | 100 | Flat | Large |
| dataMapper: nested 5x3 | 5 users × 3 posts | 2 levels | Small nested |
| dataMapper: nested 10x5 | 10 users × 5 posts | 2 levels | Medium nested |
| dataMapper: nested 20x10 | 20 users × 10 posts | 2 levels | Large nested |

Missing scenarios:

- Deeply nested data (3+ levels)
- Wide rows (many columns)
- Sparse data (many nulls)
- Different data types (DateTime, Decimal, BigInt)
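
A small generator along these lines could feed the missing scenarios; all names are illustrative, and the output would be handed to the real data mapper:

```ts
// Generates `count` rows of `columns` values. Every third value is null
// (sparse data); the remaining columns cycle through Int, DateTime,
// string-encoded Decimal, and BigInt to exercise type-specific paths.
function makeRows(count: number, columns: number): unknown[][] {
  return Array.from({ length: count }, (_, r) =>
    Array.from({ length: columns }, (_, c) => {
      if ((r + c) % 3 === 0) return null
      switch (c % 4) {
        case 0: return r
        case 1: return new Date(Date.UTC(2024, 0, 1 + r)).toISOString()
        case 2: return '123.456789'
        default: return BigInt(r) * 1_000_000n
      }
    }),
  )
}
```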

### 4. Serializer Benchmarks

Current serializer coverage:

| Benchmark | Rows | Columns | Review |
| --- | --- | --- | --- |
| serializer: 10 rows x 3 cols | 10 | 3 | Minimal |
| serializer: 50 rows x 8 cols | 50 | 8 | Medium |
| serializer: 100 rows x 8 cols | 100 | 8 | Large |

Consider adding:

- Very wide rows (20+ columns)
- Very large result sets (1000+ rows)
- Different data type serialization
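
The wide-row and large-result cases fall out of the same illustrative `makeRows` sketch from the data mapper section:

```ts
// Reusing the hypothetical makeRows helper sketched above:
const wideRows = makeRows(100, 25) // very wide rows (20+ columns)
const largeSet = makeRows(2000, 8) // very large result set (1000+ rows)
```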

### 5. Benchmark Isolation

Verify measurements are isolated:

- Mock adapter has minimal overhead
- No actual database calls
- Memory allocation patterns are consistent
- GC doesn't affect measurements
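
One way to keep GC out of the measured window, again assuming the tinybench migration: run under `node --expose-gc` and collect between tasks. `warmupIterations` and `teardown` are real tinybench options; the wiring here is an assumption about how this suite might use them:

```ts
import { Bench } from 'tinybench'

const bench = new Bench({
  time: 500,
  // Discard the first iterations so JIT warm-up doesn't skew results.
  warmupIterations: 10,
  // Force a collection after each task; no-op unless --expose-gc is set.
  teardown: () => {
    ;(globalThis as any).gc?.()
  },
})
```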

## Action Items

### High Priority

1. Verify query plan accuracy:
   - Compare plan structures with actual compiler output
   - Ensure `args`, `argTypes`, and structure fields are accurate
   - Test with the real query compiler to validate plans
2. Add write operation benchmarks:
   - INSERT query plan and interpreter execution
   - UPDATE query plan and interpreter execution
   - DELETE query plan and interpreter execution
3. Add aggregate benchmarks:
   - COUNT query plan
   - SUM/AVG/MIN/MAX plans
   - GROUP BY plans

### Medium Priority

1. Expand data type coverage:
   - DateTime handling
   - Decimal precision
   - BigInt serialization
   - JSON/JSONB mapping (for PostgreSQL)
2. Add edge case benchmarks:
   - Empty result sets
   - Single-column results
   - Very wide rows (50+ columns)
   - Deep nesting (4+ levels)

### Low Priority

1. Provider-specific benchmarks:
   - Test the interpreter with different provider configurations
   - Measure provider-specific code path overhead
2. Memory benchmarks (see the sketch after this list):
   - Track allocation counts
   - Measure GC pressure for large result sets
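
For a first pass at GC pressure, a coarse heap-delta sample may be enough; `mapRows` is a placeholder for the real data mapper entry point, and true allocation counts would need heap profiling rather than this approach:

```ts
// Placeholder for the real data mapper entry point.
declare function mapRows(rows: unknown[][]): unknown

function measureHeapDelta(rows: unknown[][]): number {
  ;(globalThis as any).gc?.() // settle the heap first (requires --expose-gc)
  const before = process.memoryUsage().heapUsed
  mapRows(rows)
  const after = process.memoryUsage().heapUsed
  return after - before
}
```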

## Validation Steps

After refinements:

```bash
# Run interpreter benchmarks
pnpm bench interpreter

# Verify no regressions
pnpm bench

# Profile for accuracy
node --cpu-prof -r esbuild-register packages/client-engine-runtime/bench/interpreter.bench.ts
```

## Output Expectations

After review and refinements:

- Interpreter benchmarks should cover 20-30 scenarios
- Each benchmark should run in < 100ms per iteration
- Results should be stable (±3% variance between runs)
- Clear separation between interpreter, mapper, and serializer overhead

## Code Structure Recommendations

Consider refactoring the benchmark file:

- Extract query plan definitions to a separate file (sketched below)
- Create helper functions for common patterns
- Add benchmark metadata/descriptions
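
For example, the plan definitions and per-benchmark metadata could move into a `bench/plans.ts` module along these lines; `QueryPlanNode` is a placeholder for the runtime's real plan type:

```ts
// bench/plans.ts (hypothetical): plan definitions plus metadata so the
// runner can print descriptions alongside results.
type QueryPlanNode = Record<string, unknown>

export const SIMPLE_SELECT_PLAN: QueryPlanNode = {
  // ...as currently defined inline in interpreter.bench.ts
}

export interface BenchCase {
  name: string
  description: string
  plan: QueryPlanNode
}

export const CASES: BenchCase[] = [
  { name: 'SIMPLE_SELECT', description: 'Basic SELECT query', plan: SIMPLE_SELECT_PLAN },
  // ...one entry per plan in the table above
]
```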

## Dependencies

- No hard blockers; this review can be done independently
- Related: Task 001, migrate to tinybench (simpler async handling)
- Related: Task 003, review query performance benchmarks (keep the two suites consistent)