v3/implementation/adrs/ADR-036-test-intelligence-plugin.md
Status: Proposed Date: 2026-01-24 Category: Advanced Development Tool Author: Plugin Architecture Team Version: 1.0.0 Deciders: Plugin Architecture Team, QA Engineering Team Supersedes: None
Testing is critical for software quality, but teams struggle with test suite optimization, flaky test identification, and efficient test selection for CI/CD pipelines. Traditional approaches run all tests on every change, wasting resources and slowing down development. AI-powered test intelligence can dramatically improve test efficiency while maintaining confidence in code quality.
Create a Test Intelligence Plugin that leverages RuVector WASM packages for predictive test selection, flaky test detection, test gap analysis, and automated test generation suggestions.
@claude-flow/plugin-test-intelligence
A comprehensive test intelligence plugin combining reinforcement learning for optimal test selection with graph neural networks for code-to-test mapping. The plugin enables predictive test selection (run only tests likely to fail), flaky test detection, mutation testing optimization, and test coverage gap identification while integrating seamlessly with popular testing frameworks.
| Package | Purpose |
|---|---|
micro-hnsw-wasm | Fast code-to-test similarity matching |
ruvector-learning-wasm | RL-based test selection and prioritization |
ruvector-gnn-wasm | Code-test dependency graphs for impact analysis |
ruvector-sparse-inference-wasm | Efficient flaky test pattern detection |
sona | Continuous learning from test execution history |
test/select-predictiveSelect tests most likely to fail based on code changes.
{
name: 'test/select-predictive',
description: 'Predictively select tests based on code changes using RL',
inputSchema: {
type: 'object',
properties: {
changes: {
type: 'object',
properties: {
files: { type: 'array', items: { type: 'string' } },
gitDiff: { type: 'string' },
gitRef: { type: 'string' }
}
},
strategy: {
type: 'string',
enum: ['fast_feedback', 'high_coverage', 'risk_based', 'balanced'],
default: 'balanced'
},
budget: {
type: 'object',
properties: {
maxTests: { type: 'number' },
maxDuration: { type: 'number', description: 'Max seconds' },
confidence: { type: 'number', default: 0.95 }
}
}
},
required: ['changes']
}
}
test/flaky-detectIdentify and analyze flaky tests.
{
name: 'test/flaky-detect',
description: 'Detect flaky tests using pattern analysis',
inputSchema: {
type: 'object',
properties: {
scope: {
type: 'object',
properties: {
testSuite: { type: 'string' },
historyDepth: { type: 'number', default: 100, description: 'Runs to analyze' }
}
},
analysis: {
type: 'array',
items: {
type: 'string',
enum: ['intermittent_failures', 'timing_sensitive', 'order_dependent', 'resource_contention', 'environment_sensitive']
}
},
threshold: { type: 'number', default: 0.1, description: 'Flakiness threshold' }
}
}
}
test/coverage-gapsIdentify test coverage gaps.
{
name: 'test/coverage-gaps',
description: 'Identify test coverage gaps using code-test graph analysis',
inputSchema: {
type: 'object',
properties: {
targetPaths: { type: 'array', items: { type: 'string' } },
coverageType: {
type: 'string',
enum: ['line', 'branch', 'function', 'semantic'],
default: 'semantic'
},
prioritization: {
type: 'string',
enum: ['risk', 'complexity', 'churn', 'recency'],
default: 'risk'
},
minCoverage: { type: 'number', default: 80 }
}
}
}
test/mutation-optimizeOptimize mutation testing for efficiency.
{
name: 'test/mutation-optimize',
description: 'Optimize mutation testing using selective mutation',
inputSchema: {
type: 'object',
properties: {
targetPath: { type: 'string' },
budget: { type: 'number', description: 'Max mutations to run' },
strategy: {
type: 'string',
enum: ['random', 'coverage_guided', 'ml_guided', 'historical'],
default: 'ml_guided'
},
mutationTypes: {
type: 'array',
items: {
type: 'string',
enum: ['arithmetic', 'logical', 'boundary', 'null_check', 'return_value']
}
}
},
required: ['targetPath']
}
}
test/generate-suggestSuggest test cases for uncovered code.
{
name: 'test/generate-suggest',
description: 'Suggest test cases for uncovered code paths',
inputSchema: {
type: 'object',
properties: {
targetFunction: { type: 'string', description: 'Function to generate tests for' },
testStyle: {
type: 'string',
enum: ['unit', 'integration', 'property_based', 'snapshot'],
default: 'unit'
},
framework: {
type: 'string',
enum: ['jest', 'vitest', 'pytest', 'junit', 'mocha'],
default: 'vitest'
},
edgeCases: { type: 'boolean', default: true },
mockStrategy: { type: 'string', enum: ['minimal', 'full', 'none'] }
},
required: ['targetFunction']
}
}
+------------------+ +----------------------+ +------------------+
| Test Execution |---->| Test Intelligence |---->| Test Selection |
| (Jest/Vitest/etc)| | (Learning Engine) | | (Optimized) |
+------------------+ +----------------------+ +------------------+
|
+---------+---------+
| | |
+----+---+ +---+----+ +--+-----+
| RL | | GNN | | Sparse |
|Select | |Mapping | |Flaky |
+--------+ +--------+ +--------+
|
+----+----+
| SONA |
| Learn |
+---------+
| Framework | Language | Support Level |
|---|---|---|
| Jest | JavaScript/TypeScript | Full |
| Vitest | JavaScript/TypeScript | Full |
| pytest | Python | Full |
| JUnit | Java | Partial |
| Mocha | JavaScript | Partial |
| RSpec | Ruby | Basic |
| Metric | Target | Baseline (Traditional) | Improvement |
|---|---|---|---|
| Test selection | <1s for 10K tests | ~10min (full suite) | 600x |
| Flaky detection | <5s for 1000 test runs | ~1hr (manual analysis) | 720x |
| Coverage gap analysis | <10s for 100K LOC | ~30min (coverage tools) | 180x |
| Mutation optimization | 80% mutation score in 20% time | 80% in 100% time | 5x |
| CI time reduction | 60-80% | N/A (run all tests) | Novel |
All MCP tool inputs MUST be validated using Zod schemas:
// test/select-predictive input validation
const SelectPredictiveSchema = z.object({
changes: z.object({
files: z.array(z.string().max(500)).max(1000).optional(),
gitDiff: z.string().max(1_000_000).optional(), // 1MB max diff
gitRef: z.string().max(100).optional()
}),
strategy: z.enum(['fast_feedback', 'high_coverage', 'risk_based', 'balanced']).default('balanced'),
budget: z.object({
maxTests: z.number().int().min(1).max(100000).optional(),
maxDuration: z.number().min(1).max(86400).optional(), // Max 24 hours
confidence: z.number().min(0.5).max(1.0).default(0.95)
}).optional()
});
// test/flaky-detect input validation
const FlakyDetectSchema = z.object({
scope: z.object({
testSuite: z.string().max(500).optional(),
historyDepth: z.number().int().min(10).max(10000).default(100)
}).optional(),
analysis: z.array(z.enum([
'intermittent_failures', 'timing_sensitive', 'order_dependent',
'resource_contention', 'environment_sensitive'
])).optional(),
threshold: z.number().min(0.01).max(0.5).default(0.1)
});
// test/generate-suggest input validation
const GenerateSuggestSchema = z.object({
targetFunction: z.string().max(500),
testStyle: z.enum(['unit', 'integration', 'property_based', 'snapshot']).default('unit'),
framework: z.enum(['jest', 'vitest', 'pytest', 'junit', 'mocha']).default('vitest'),
edgeCases: z.boolean().default(true),
mockStrategy: z.enum(['minimal', 'full', 'none']).optional()
});
| Constraint | Value | Rationale |
|---|---|---|
| Memory Limit | 512MB max | Sufficient for test analysis |
| CPU Time Limit | 60 seconds per operation | Prevent runaway analysis |
| No Test Execution | Analysis only, no actual test runs | Prevent arbitrary code execution |
| No Network Access | Enforced by WASM sandbox | Prevent data exfiltration |
| Sandboxed History | Test history isolated per project | Prevent cross-project leakage |
// CRITICAL: Plugin MUST NOT execute tests directly
// Test execution happens through the user's test framework, not WASM
// BAD - dangerous, could execute arbitrary code
await wasmInstance.runTests(testFiles);
exec(`npm test ${userSelectedTests}`);
// GOOD - return test selection only, user runs tests
const selectedTests = await wasmInstance.selectTests(changes);
return { testsToRun: selectedTests }; // User executes via their CI/CD
| Risk ID | Severity | Description | Mitigation |
|---|---|---|---|
| TEST-SEC-001 | HIGH | Arbitrary code execution via test generation | Never auto-execute generated tests |
| TEST-SEC-002 | HIGH | Command injection in test selection | No shell execution, list outputs only |
| TEST-SEC-003 | MEDIUM | Test history data leakage | Per-project isolation, access controls |
| TEST-SEC-004 | MEDIUM | DoS via massive test suites | Input size limits, pagination |
| TEST-SEC-005 | LOW | Timing attacks on test selection | Constant-time selection algorithms |
const TestRateLimits = {
'test/select-predictive': { requestsPerMinute: 60, maxConcurrent: 5 },
'test/flaky-detect': { requestsPerMinute: 10, maxConcurrent: 2 },
'test/coverage-gaps': { requestsPerMinute: 30, maxConcurrent: 3 },
'test/mutation-optimize': { requestsPerMinute: 5, maxConcurrent: 1 }, // Expensive
'test/generate-suggest': { requestsPerMinute: 30, maxConcurrent: 3 }
};
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Missing failing tests | Low | High | Confidence thresholds, periodic full runs |
| Learning period inaccuracy | High | Medium | Fallback to full runs, gradual adoption |
| Framework incompatibility | Medium | Medium | Tier-based support, custom adapter API |
| Test history data loss | Low | Medium | Persistent storage, backup strategies |
Execution History --> SONA Learning --> RL Policy
| | |
v v v
[fail rates] [pattern bank] [selection model]
[timings] [failure modes] [prioritization]
[coverage] [correlations] [budget allocation]
{
"dependencies": {
"micro-hnsw-wasm": "^0.2.0",
"ruvector-learning-wasm": "^0.1.0",
"ruvector-gnn-wasm": "^0.1.0",
"ruvector-sparse-inference-wasm": "^0.1.0",
"sona": "^0.1.0",
"istanbul-lib-coverage": "^3.2.0"
}
}
| ADR | Relationship |
|---|---|
| ADR-004: Plugin Architecture | Foundation - Defines plugin structure |
| ADR-017: RuVector Integration | Dependency - Provides WASM packages |
| ADR-035: Code Intelligence | Related - Code-test mapping |
| ADR-037: Performance Optimizer | Related - Test performance analysis |
| ADR-040: Quantum Optimizer | Related - Test selection optimization |
Last Updated: 2026-01-24