# KNOWLEDGE.md

Accumulated insights, best practices, and troubleshooting for the SuperClaude Framework.

This document captures lessons learned, common pitfalls, and solutions discovered during development. Consult it when encountering issues or learning project patterns.

Last Updated: 2025-11-12
Finding: Pre-execution confidence checking has exceptional ROI.
Evidence:
When it works best:
When to skip:
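A minimal sketch of the gate itself, assuming the `ConfidenceChecker` API documented later in this file (`assess(context)` returning 0.0-1.0, with the ≥0.9 threshold from its docstring); the import path is an assumption:

```python
from superclaude import ConfidenceChecker  # import path is an assumption

def gated_execute(task, context):
    """Run `task` only if pre-execution confidence clears the bar."""
    confidence = ConfidenceChecker().assess(context)
    if confidence < 0.9:
        # Investigating further is far cheaper than implementing
        # in the wrong direction.
        return f"Blocked: confidence {confidence:.0%} is below the 90% bar"
    return task()
```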
Finding: The Four Questions catch most AI hallucinations.
The Four Questions:
Red flags that indicate hallucination:
Real example:
ā BAD: "The API integration is complete and working correctly."
ā
GOOD: "The API integration is complete. Test output:
ā
test_api_connection: PASSED
ā
test_api_authentication: PASSED
ā
test_api_data_fetch: PASSED
All 3 tests passed in 1.2s"
Finding: Wave → Checkpoint → Wave pattern dramatically improves performance.
Pattern:
```python
# Wave 1: Independent reads (parallel)
files = [Read(f1), Read(f2), Read(f3)]

# Checkpoint: Analyze together (sequential)
analysis = analyze_files(files)

# Wave 2: Independent edits (parallel)
edits = [Edit(f1), Edit(f2), Edit(f3)]
```
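A runnable asyncio sketch of the same pattern; the `read_file`, `analyze`, and `apply_edit` helpers are illustrative stand-ins for the real tools:

```python
# Illustrative asyncio version of Wave → Checkpoint → Wave.
import asyncio
from pathlib import Path

async def read_file(path: str) -> str:
    return Path(path).read_text()

def analyze(contents: list[str]) -> dict:
    # Checkpoint: sequential analysis over all files at once
    return {"total_lines": sum(c.count("\n") for c in contents)}

async def apply_edit(path: str, plan: dict) -> None:
    # Placeholder edit: append an analysis marker
    with open(path, "a") as f:
        f.write(f"# total lines in batch: {plan['total_lines']}\n")

async def run(paths: list[str]) -> None:
    contents = await asyncio.gather(*(read_file(p) for p in paths))  # Wave 1
    plan = analyze(contents)                                         # Checkpoint
    await asyncio.gather(*(apply_edit(p, plan) for p in paths))      # Wave 2

# asyncio.run(run(["a.py", "b.py", "c.py"]))
```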
When to use:
When NOT to use:
Performance data:
Problem: Spent hours implementing a feature that already existed in the codebase.
Solution: ALWAYS use Glob/Grep before implementing:
```bash
# Search for similar functions
uv run python -c "from pathlib import Path; print([f for f in Path('src').rglob('*.py') if 'feature_name' in f.read_text()])"

# Or use grep
grep -r "def feature_name" src/
```
Prevention: Run the confidence check and ensure `duplicate_check_complete=True`.
Problem: Implemented custom API when project uses Supabase.
Solution: READ CLAUDE.md and PLANNING.md before implementing:
```python
# Check the project tech stack before implementing
with open('CLAUDE.md') as f:
    claude_md = f.read()

if 'Supabase' in claude_md:
    # Use Supabase APIs, not a custom implementation
    ...
```
Prevention: Run the confidence check and ensure `architecture_check_complete=True`.
Problem: Claimed tests passed but they were actually failing.
Solution: ALWAYS show actual test output:
```bash
# Run tests and capture output
uv run pytest -v > test_output.txt

# Show in validation
echo "Test Results:"
cat test_output.txt
```
Prevention: Use `SelfCheckProtocol` and require evidence.
Problem: The VERSION file says 4.1.9, but package.json says 4.1.5 and pyproject.toml says 0.4.0.
Solution: Understand versioning strategy:
When updating versions:
Prevention: Create release checklist
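If the strategy calls for a single version across all three files, a minimal CI-style consistency check could look like this sketch (file names from the problem above; the pyproject.toml regex is a simplification):

```python
# Compare the version recorded in each file; fail loudly on mismatch.
import json
import re
from pathlib import Path

versions = {
    "VERSION": Path("VERSION").read_text().strip(),
    "package.json": json.loads(Path("package.json").read_text())["version"],
}
match = re.search(r'^version\s*=\s*"([^"]+)"',
                  Path("pyproject.toml").read_text(), re.MULTILINE)
versions["pyproject.toml"] = match.group(1) if match else "missing"

if len(set(versions.values())) > 1:
    raise SystemExit(f"Version mismatch: {versions}")
print(f"All versions agree: {versions['VERSION']}")
```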
Problem: The Makefile requires UV, but users may not have it installed.
Solution: Install UV:
```bash
# macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh

# Windows
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

# With pip
pip install uv
```
Alternative: Provide fallback commands:
```bash
# With UV (preferred)
uv run pytest

# Without UV (fallback)
python -m pytest
```
Prevention: Document the UV requirement in the README.
1. Use pytest markers for organization:
```python
import pytest

@pytest.mark.unit
def test_individual_function():
    pass

@pytest.mark.integration
def test_component_interaction():
    pass

@pytest.mark.confidence_check
def test_with_pre_check(confidence_checker):
    pass
```
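Registering the markers (for example in pyproject.toml) keeps pytest from warning about unknown marks:

```toml
[tool.pytest.ini_options]
markers = [
    "unit: fast, isolated tests",
    "integration: tests of component interaction",
    "confidence_check: tests gated by a pre-execution confidence check",
]
```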
2. Use fixtures for shared setup:
```python
# conftest.py
import pytest

@pytest.fixture
def sample_context():
    return {...}

# test_file.py
def test_feature(sample_context):
    # Use sample_context
    ...
```
3. Test both happy path and edge cases:
```python
def test_feature_success():
    # Normal operation
    ...

def test_feature_with_empty_input():
    # Edge case
    ...

def test_feature_with_invalid_data():
    # Error handling
    ...
```
1. Conventional commits:
git commit -m "feat: add confidence checking to PM Agent"
git commit -m "fix: resolve version inconsistency"
git commit -m "docs: update CLAUDE.md with plugin warnings"
git commit -m "test: add unit tests for reflexion pattern"
2. Small, focused commits:
3. Branch naming:
```
feature/add-confidence-check
fix/version-inconsistency
docs/update-readme
refactor/simplify-cli
test/add-unit-tests
```
1. Code documentation:
```python
def assess(self, context: Dict[str, Any]) -> float:
    """
    Assess confidence level (0.0 - 1.0)

    Investigation Phase Checks:
    1. No duplicate implementations? (25%)
    2. Architecture compliance? (25%)
    3. Official documentation verified? (20%)
    4. Working OSS implementations referenced? (15%)
    5. Root cause identified? (15%)

    Args:
        context: Context dict with task details

    Returns:
        float: Confidence score (0.0 = no confidence, 1.0 = absolute certainty)

    Example:
        >>> checker = ConfidenceChecker()
        >>> confidence = checker.assess(context)
        >>> if confidence >= 0.9:
        ...     proceed_with_implementation()
    """
```
2. README structure:
3. Keep docs synchronized with code:
Symptoms:
```
$ uv run pytest
ERROR: file or directory not found: tests/
```
Cause: The `tests/` directory doesn't exist.
Solution:
```bash
# Create tests structure
mkdir -p tests/unit tests/integration

# Add __init__.py files
touch tests/__init__.py
touch tests/unit/__init__.py
touch tests/integration/__init__.py

# Add conftest.py
touch tests/conftest.py
```
Symptoms:
```
$ uv run pytest --trace-config
# superclaude not listed in plugins
```
Cause: Package not installed or entry point not configured
Solution:
```bash
# Reinstall in editable mode
uv pip install -e ".[dev]"

# Verify entry point in pyproject.toml
# Should have:
# [project.entry-points.pytest11]
# superclaude = "superclaude.pytest_plugin"

# Test plugin loaded
uv run pytest --trace-config 2>&1 | grep superclaude
```
Symptoms:
```
ImportError: No module named 'superclaude'
```
Cause: Package not installed in test environment
Solution:
```bash
# Install package in editable mode
uv pip install -e .

# Or use uv run (creates venv automatically)
uv run pytest
```
Symptoms:
```
fixture 'confidence_checker' not found
```
Cause: pytest plugin not loaded or fixture not defined
Solution:
```bash
# Check plugin loaded
uv run pytest --fixtures | grep confidence_checker

# Verify pytest_plugin.py has the fixture
# Should have:
# @pytest.fixture
# def confidence_checker():
#     return ConfidenceChecker()

# Reinstall package
uv pip install -e .
```
Symptoms: Files listed in .gitignore still tracked by git
Cause: Files were tracked before adding to .gitignore
Solution:
```bash
# Remove from git but keep in filesystem
git rm --cached <file>

# OR remove entire directory
git rm -r --cached <directory>

# Commit the change
git commit -m "fix: remove tracked files from gitignore"
```
```python
import pytest

@pytest.fixture
def token_budget(request):
    """Fixture that adapts based on test markers"""
    marker = request.node.get_closest_marker("complexity")
    complexity = marker.args[0] if marker else "medium"
    return TokenBudgetManager(complexity=complexity)

# Usage
@pytest.mark.complexity("simple")
def test_simple_feature(token_budget):
    assert token_budget.limit == 200
```
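`TokenBudgetManager` is used above without its definition; a minimal sketch consistent with the fixture (only the simple → 200 limit is implied by the assertion above; the other limits are illustrative):

```python
class TokenBudgetManager:
    """Track token usage against a complexity-based budget (sketch)."""

    LIMITS = {"simple": 200, "medium": 2_000, "complex": 20_000}  # illustrative

    def __init__(self, complexity: str = "medium"):
        self.complexity = complexity
        self.limit = self.LIMITS.get(complexity, self.LIMITS["medium"])
        self.used = 0

    def consume(self, tokens: int) -> bool:
        """Record usage; return False once the budget is exhausted."""
        self.used += tokens
        return self.used <= self.limit
```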
```python
import pytest

def pytest_runtest_setup(item):
    """Skip tests if confidence is too low"""
    marker = item.get_closest_marker("confidence_check")
    if marker:
        checker = ConfidenceChecker()
        context = build_context(item)  # project helper that assembles the check context
        confidence = checker.assess(context)
        if confidence < 0.7:
            pytest.skip(f"Confidence too low: {confidence:.0%}")
```
```python
def pytest_runtest_makereport(item, call):
    """Record failed tests for future learning"""
    if call.when == "call" and call.excinfo is not None:
        reflexion = ReflexionPattern()
        error_info = {
            "test_name": item.name,
            "error_type": type(call.excinfo.value).__name__,
            "error_message": str(call.excinfo.value),
        }
        reflexion.record_error(error_info)
```
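`ReflexionPattern` is the project's error-memory component; a minimal sketch of the idea, assuming a JSONL log as the storage format (the path and schema are assumptions):

```python
import json
from pathlib import Path

class ReflexionPattern:
    """Append errors to a log so future runs can learn from past failures (sketch)."""

    def __init__(self, log_path: str = ".superclaude/reflexion.jsonl"):  # assumed path
        self.log_path = Path(log_path)
        self.log_path.parent.mkdir(parents=True, exist_ok=True)

    def record_error(self, error_info: dict) -> None:
        with self.log_path.open("a") as f:
            f.write(json.dumps(error_info) + "\n")
```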
Based on real usage data:
| Task Type | Typical Tokens | With PM Agent | Savings |
|---|---|---|---|
| Typo fix | 200-500 | 200-300 | 40% |
| Bug fix | 2,000-5,000 | 1,000-2,000 | 50% |
| Feature | 10,000-50,000 | 5,000-15,000 | 60% |
| Wrong direction | 50,000+ | 100-200 (prevented) | 99%+ |
Key insight: Prevention (confidence check) saves more tokens than optimization
| Operation | Sequential | Parallel | Speedup |
|---|---|---|---|
| 5 file reads | 15s | 3s | 5x |
| 10 file reads | 30s | 3s | 10x |
| 20 file edits | 60s | 15s | 4x |
| Mixed ops | 45s | 12s | 3.75x |
Key insight: Parallel execution has diminishing returns after ~10 operations per wave
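One way to act on this insight: cap wave size at roughly 10 operations (the cap is a heuristic read off the table above):

```python
def chunk_into_waves(operations, max_wave_size=10):
    """Split independent operations into bounded waves.

    Speedup flattens past ~10 parallel operations per wave, so larger
    batches are split rather than dispatched all at once.
    """
    for i in range(0, len(operations), max_wave_size):
        yield operations[i:i + max_wave_size]

# Example: 25 operations → waves of 10, 10, and 5
waves = list(chunk_into_waves(list(range(25))))
```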
What happened: The README described a v2.0 plugin system that didn't exist in v4.1.9
Impact: Users spent hours trying to install non-existent features
Solution:
Prevention: Documentation review checklist in release process
What happened: Three different version numbers across files
Impact: Confusion about which version is installed
Solution:
Prevention: Single source of truth for versions (maybe use `bumpversion`)
What happened: The framework provided testing tools but had no tests itself
Impact: No confidence in code quality, regression bugs
Solution:
Prevention: Make tests a requirement in PR template
Ideas worth investigating:
When stuck:
When sharing knowledge:
Claude Code provides 60+ built-in commands, 28 hook events, a full skills system, 5 settings scopes, agent teams, plan mode, extended thinking, and 60+ MCP servers in its registry. SuperClaude currently uses only a fraction of these.
1. Skills System (CRITICAL)
   - Frontmatter: `model`, `effort`, `allowed-tools`, `context: fork`
   - Auto-triggering via `description` and argument substitution
2. Hooks System (HIGH)
   - Events such as `SessionStart`, `Stop`, `PostToolUse`, `TaskCompleted`, `SubagentStop`, `PreCompact`, etc.
   - `SessionStart` for PM Agent auto-restore, `Stop` for session persistence, `PostToolUse` for self-check, `TaskCompleted` for reflexion
3. Plan Mode Integration (MEDIUM)
4. Settings Profiles (MEDIUM)
   - Permission patterns (`Bash(pattern)`, `Edit(path)`, `mcp__server__tool`)
   - `.claude/settings.json` templates for common workflows
   - Commands under `~/.claude/commands/sc/` and agents under `~/.claude/agents/` as subagents

See `docs/user-guide/claude-code-integration.md` for the complete feature mapping and gap analysis.
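As a concrete illustration of item 2, a hedged sketch of wiring `SessionStart` and `Stop` hooks in `.claude/settings.json` (check the Claude Code hooks documentation for the exact schema; the `superclaude session` subcommands are hypothetical):

```json
{
  "hooks": {
    "SessionStart": [
      {
        "hooks": [
          { "type": "command", "command": "superclaude session restore" }
        ]
      }
    ],
    "Stop": [
      {
        "hooks": [
          { "type": "command", "command": "superclaude session save" }
        ]
      }
    ]
  }
}
```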
This document grows with the project. Everyone who encounters a problem and finds a solution should document it here.
Contributors: SuperClaude development team and community
Maintained by: Project maintainers
Review frequency: Quarterly or after major insights