COVERAGE_PLAN.md
Generated 2025-03-06 from Codecov
| Metric | Value |
|---|---|
| Current coverage | 48,571 / 76,694 lines = 63.33% |
| Target | 72,859 / 76,694 lines = 95.0% |
| Gap | 24,288 lines need coverage |
| Files >= 95% | 43 / 239 |
| Files < 95% | 196 (27,872 total misses) |
Sorted by uncovered lines (descending):
| Module | Lines | Hits | Miss | Coverage | Priority |
|---|---|---|---|---|---|
| channels/ | 14,079 | 8,677 | 5,402 | 61.6% | P0 |
| tools/ | 13,445 | 9,407 | 4,038 | 70.0% | P1 |
| agent/ | 9,152 | 6,096 | 3,056 | 66.6% | P0 |
| setup/ | 3,005 | 462 | 2,543 | 15.4% | P1 |
| extensions/ | 3,540 | 1,298 | 2,242 | 36.7% | P0 |
| cli/ | 2,834 | 697 | 2,137 | 24.6% | P1 |
| history/ | 1,626 | 0 | 1,626 | 0.0% | P0 |
| llm/ | 7,029 | 5,776 | 1,253 | 82.2% | P2 |
| (root) | 4,122 | 3,121 | 1,001 | 75.7% | P2 |
| worker/ | 1,274 | 480 | 794 | 37.7% | P1 |
| sandbox/ | 1,615 | 897 | 718 | 55.5% | P2 |
| registry/ | 1,588 | 1,107 | 481 | 69.7% | P2 |
| db/ | 921 | 441 | 480 | 47.9% | P1 |
| workspace/ | 2,006 | 1,584 | 422 | 79.0% | P2 |
| orchestrator/ | 1,199 | 795 | 404 | 66.3% | P2 |
| config/ | 1,464 | 1,095 | 369 | 74.8% | P2 |
| hooks/ | 1,379 | 1,081 | 298 | 78.4% | P2 |
| secrets/ | 687 | 407 | 280 | 59.2% | P2 |
| skills/ | 1,714 | 1,585 | 129 | 92.5% | P3 |
| context/ | 693 | 586 | 107 | 84.6% | P3 |
| estimation/ | 467 | 369 | 98 | 79.0% | P3 |
| safety/ | 1,424 | 1,337 | 87 | 93.9% | P3 |
| evaluation/ | 226 | 152 | 74 | 67.3% | P3 |
| pairing/ | 498 | 446 | 52 | 89.6% | P3 |
| tunnel/ | 391 | 368 | 23 | 94.1% | P3 |
| observability/ | 316 | 307 | 9 | 97.2% | Done |
These files account for the vast majority of the coverage gap:
| File | Lines | Miss | Coverage | Lines to 95% |
|---|---|---|---|---|
| src/extensions/manager.rs | 2,404 | 2,083 | 13.3% | 1,962 |
| src/setup/wizard.rs | 2,150 | 1,789 | 16.8% | 1,681 |
| src/history/store.rs | 1,486 | 1,486 | 0.0% | 1,411 |
| src/channels/web/server.rs | 1,985 | 993 | 50.0% | 893 |
| src/channels/wasm/wrapper.rs | 2,237 | 934 | 58.2% | 822 |
| src/agent/thread_ops.rs | 1,044 | 763 | 26.9% | 710 |
| src/cli/tool.rs | 757 | 735 | 2.9% | 697 |
| src/setup/channels.rs | 645 | 596 | 7.6% | 563 |
| src/agent/commands.rs | 587 | 587 | 0.0% | 557 |
| src/main.rs | 740 | 522 | 29.4% | 485 |
| src/channels/web/handlers/jobs.rs | 513 | 456 | 11.1% | 430 |
| src/tools/builder/core.rs | 524 | 456 | 13.0% | 429 |
| src/worker/job.rs | 1,078 | 467 | 56.7% | 413 |
| src/channels/web/handlers/chat.rs | 564 | 417 | 26.1% | 388 |
| src/tools/wasm/wrapper.rs | 1,005 | 436 | 56.6% | 385 |
| src/channels/signal.rs | 1,814 | 472 | 74.0% | 381 |
| src/tools/mcp/auth.rs | 472 | 378 | 19.9% | 354 |
| src/worker/container.rs | 350 | 330 | 5.7% | 312 |
| src/tools/builtin/job.rs | 1,014 | 359 | 64.6% | 308 |
| src/cli/mcp.rs | 322 | 319 | 0.9% | 302 |
| src/cli/oauth_defaults.rs | 730 | 335 | 54.1% | 298 |
| src/llm/nearai_chat.rs | 854 | 340 | 60.2% | 297 |
| src/sandbox/container.rs | 407 | 317 | 22.1% | 296 |
| src/tools/mcp/client.rs | 341 | 291 | 14.7% | 273 |
| src/registry/installer.rs | 765 | 311 | 59.3% | 272 |
| src/orchestrator/job_manager.rs | 405 | 270 | 33.3% | 249 |
| src/channels/web/handlers/routines.rs | 249 | 249 | 0.0% | 236 |
| src/agent/scheduler.rs | 559 | 263 | 53.0% | 235 |
| src/tools/wasm/storage.rs | 296 | 243 | 17.9% | 228 |
| src/channels/repl.rs | 233 | 233 | 0.0% | 221 |
| src/llm/session.rs | 413 | 242 | 41.4% | 221 |
| src/worker/claude_bridge.rs | 629 | 247 | 60.7% | 215 |
| src/agent/agent_loop.rs | 523 | 234 | 55.2% | 207 |
| src/worker/api.rs | 258 | 207 | 19.8% | 194 |
| src/sandbox/proxy/http.rs | 307 | 192 | 37.5% | 176 |
| src/channels/wasm/storage.rs | 182 | 182 | 0.0% | 172 |
| src/cli/registry.rs | 177 | 177 | 0.0% | 168 |
| src/llm/reasoning.rs | 1,163 | 219 | 81.2% | 160 |
| src/tools/builder/testing.rs | 308 | 174 | 43.5% | 158 |
| src/db/postgres.rs | 166 | 166 | 0.0% | 157 |
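
The "Lines to 95%" column is consistent with simple integer arithmetic on the Lines and Miss columns: the floor of 95% of a file's lines, minus the lines already hit. A minimal sketch of that calculation follows; the function name `lines_to_target` is illustrative and not taken from the codebase.

```rust
/// Additional lines that must be covered for a file to reach `target_pct`
/// percent coverage. The target line count is rounded down, which matches
/// the figures in the table above. Hypothetical helper, for illustration only.
pub fn lines_to_target(total_lines: u64, missed_lines: u64, target_pct: u64) -> u64 {
    // Lines currently covered.
    let hits = total_lines.saturating_sub(missed_lines);
    // Integer division floors the target hit count.
    let target_hits = total_lines * target_pct / 100;
    // Files already at or above the target need zero additional lines.
    target_hits.saturating_sub(hits)
}
```

For example, src/extensions/manager.rs has 2,404 lines and 2,083 misses, so `lines_to_target(2404, 2083, 95)` gives 1,962. The same formula applied to the project totals (76,694 lines, 28,123 misses) gives the 24,288-line gap.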
Pure logic, serialization, and database queries testable in isolation without real infrastructure. Highest coverage gain per unit of effort.

src/history/store.rs -- 0% -> 95% (+1,411 lines)

PostgreSQL repository layer (conversations, jobs, actions, LLM calls, estimation
snapshots). Test query construction and result mapping. Can use the libSQL backend
as a real in-memory database or test doubles for the Database trait.
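
To illustrate the test-double option, here is a minimal sketch of an in-memory fake behind a storage trait. The trait `ConversationDb` and its methods are hypothetical stand-ins; the project's actual Database trait is not shown in this plan and will differ (it is likely async and much larger).

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for the project's Database trait.
pub trait ConversationDb {
    fn insert_conversation(&mut self, title: &str) -> u64;
    fn get_conversation(&self, id: u64) -> Option<String>;
    fn rename_conversation(&mut self, id: u64, title: &str) -> bool;
    fn delete_conversation(&mut self, id: u64) -> bool;
}

/// In-memory test double: no PostgreSQL or libSQL connection is needed.
#[derive(Default)]
pub struct FakeDb {
    next_id: u64,
    rows: HashMap<u64, String>,
}

impl ConversationDb for FakeDb {
    fn insert_conversation(&mut self, title: &str) -> u64 {
        // IDs start at 1 and increase monotonically.
        self.next_id += 1;
        self.rows.insert(self.next_id, title.to_string());
        self.next_id
    }

    fn get_conversation(&self, id: u64) -> Option<String> {
        self.rows.get(&id).cloned()
    }

    fn rename_conversation(&mut self, id: u64, title: &str) -> bool {
        match self.rows.get_mut(&id) {
            Some(existing) => {
                *existing = title.to_string();
                true
            }
            None => false,
        }
    }

    fn delete_conversation(&mut self, id: u64) -> bool {
        self.rows.remove(&id).is_some()
    }
}
```

A fake like this only exercises code that sits above the trait. Query construction and result mapping inside the store itself still need a real backend such as in-memory libSQL.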
Tests to write:
- test_store_conversation_crud -- create, read, update, delete conversations
- test_store_job_lifecycle -- insert job, update status through state machine
- test_store_action_recording -- record and query job actions
- test_store_llm_call_tracking -- insert and aggregate LLM call records
- test_store_estimation_snapshots -- save and retrieve estimation data

src/history/analytics.rs -- 0% -> 95% (+133 lines)

Aggregation queries (JobStats, ToolStats). Test the query builders and result deserialization.
Tests to write:
- test_job_stats_aggregation -- verify counts, durations, success rates
- test_tool_stats_ranking -- verify tool usage frequency sorting
- test_analytics_empty_db -- graceful handling of no data

src/extensions/manager.rs -- 13.3% -> 95% (+1,962 lines)

Largest single file gap. Extension lifecycle orchestration (install, auth, activate, remove), config parsing, and state transitions.
Tests to write:
- test_extension_install_from_manifest -- parse manifest, create extension record
- test_extension_auth_flow -- OAuth token setup, credential storage
- test_extension_activate_deactivate -- state transitions, tool registration
- test_extension_remove_cleanup -- remove extension, clean up artifacts
- test_extension_config_validation -- reject invalid configs, handle defaults
- test_extension_list_filtering -- filter by status, type, search query
- test_extension_capability_check -- verify required capabilities before activation

src/extensions/discovery.rs -- 27.8% -> 95% (+125 lines)

Extension discovery from filesystem and registry.
Tests to write:
- test_discover_local_extensions -- scan directory, parse manifests
- test_discover_skip_invalid -- gracefully skip malformed extension dirs
- test_discover_dedup -- handle duplicate extensions across paths

src/tools/builder/core.rs -- 13% -> 95% (+429 lines)

BuildRequirement, SoftwareType, Language types and project scaffolding.
Tests to write:
- test_build_requirement_parsing -- deserialize from JSON
- test_scaffold_project_structure -- verify generated file tree
- test_language_detection -- detect language from file extensions
- test_software_type_constraints -- validate type-specific requirements

src/tools/builder/testing.rs -- 43.5% -> 95% (+158 lines)

Test harness integration for built tools.
Tests to write:
- test_harness_setup_teardown -- lifecycle of test environment
- test_harness_run_tests -- execute tests and capture results
- test_harness_failure_reporting -- verify error details on test failure

src/tools/mcp/auth.rs -- 19.9% -> 95% (+354 lines)

OAuth token management for MCP servers.
Tests to write:
- test_token_refresh_on_expiry -- auto-refresh when token expires
- test_token_header_injection -- correct Authorization header format
- test_token_persistence -- save/load tokens across restarts
- test_oauth_pkce_flow -- code verifier/challenge generation
- test_auth_config_parsing -- parse various auth config formats

src/tools/mcp/client.rs -- 14.7% -> 95% (+273 lines)

JSON-RPC client for MCP protocol.
Tests to write:
- test_jsonrpc_request_serialization -- correct JSON-RPC 2.0 format
- test_jsonrpc_response_parsing -- handle success, error, and batch responses
- test_jsonrpc_error_codes -- map MCP error codes to ToolError
- test_tool_list_discovery -- parse tools/list response
- test_tool_call_roundtrip -- serialize call, parse result

src/tools/wasm/storage.rs -- 17.9% -> 95% (+228 lines)

WASM tool persistence (store, load, delete, list).
Tests to write:
- test_wasm_tool_store_roundtrip -- store and retrieve tool binary + metadata
- test_wasm_tool_delete -- remove tool and verify gone
- test_wasm_tool_list_filtering -- filter by name, capability
- test_wasm_tool_update_metadata -- update without re-uploading binary

src/tools/wasm/wrapper.rs -- 56.6% -> 95% (+385 lines)

Tool trait wrapper for WASM modules.
Tests to write:
- test_wasm_param_marshalling -- JSON params to WASM component model types
- test_wasm_output_conversion -- WASM return values to ToolOutput
- test_wasm_error_propagation -- WASM traps to ToolError
- test_wasm_fuel_exhaustion -- verify fuel limit enforcement
- test_wasm_memory_limit -- verify memory ceiling

src/tools/wasm/loader.rs -- 62.4% -> 95% (+156 lines)

WASM tool discovery from filesystem.
Tests to write:
- test_loader_scan_directory -- find .wasm files with capabilities.json
- test_loader_skip_invalid -- skip files without valid WIT exports
- test_loader_cache_invalidation -- reload when file changes

src/tools/builtin/job.rs -- 64.6% -> 95% (+308 lines)

Job management tools (CreateJob, ListJobs, JobStatus, CancelJob).
Tests to write:
- test_create_job_params -- validate required/optional parameters
- test_list_jobs_formatting -- verify output structure
- test_job_status_transitions -- query status at each state
- test_cancel_job_running -- cancel an in-progress job
- test_cancel_job_completed -- error on already-completed job

src/secrets/store.rs -- 48.1% -> 95% (+145 lines)

Encrypted secret storage.
Tests to write:
- test_secret_store_roundtrip -- store encrypted, retrieve decrypted
- test_secret_update -- overwrite existing secret
- test_secret_delete -- remove and verify inaccessible
- test_secret_list_redacted -- list shows names but not values

src/llm/session.rs -- 41.4% -> 95% (+221 lines)

Session token management with auto-renewal.
Tests to write:
- test_session_token_parsing -- parse sess_xxx format
- test_session_expiry_detection -- detect expired tokens
- test_session_auto_renewal -- trigger renewal before expiry
- test_session_concurrent_renewal -- only one renewal in flight

src/llm/nearai_chat.rs -- 60.2% -> 95% (+297 lines)

NEAR AI Chat Completions provider.
Tests to write:
- test_nearai_request_building -- correct endpoint, headers, body
- test_nearai_response_parsing -- parse streaming and non-streaming responses
- test_nearai_tool_message_flattening -- tool messages flattened to text
- test_nearai_auth_modes -- session token vs API key auth
- test_nearai_error_handling -- rate limits, auth failures, server errors

src/llm/mod.rs -- 53.7% -> 95% (+112 lines)

Provider factory and backend selection.
Tests to write:
- test_provider_factory_nearai -- select NEAR AI from config
- test_provider_factory_openai -- select OpenAI from config
- test_provider_factory_ollama -- select Ollama from config
- test_provider_factory_invalid -- error on unknown backend

src/llm/reasoning.rs -- 81.2% -> 95% (+160 lines)

Planning, tool selection, evaluation logic.
Tests to write:
- test_reasoning_step_parsing -- parse planning steps from LLM output
- test_tool_selection_scoring -- rank tools by relevance
- test_evaluation_rubric -- score completions against criteria
- test_reasoning_with_no_tools -- handle tool-less responses

src/db/postgres.rs -- 0% -> 95% (+157 lines)

PostgreSQL backend delegation to Store + Repository.
Tests to write:
- test_postgres_backend_delegates -- verify delegation pattern (trait-level)
- test_postgres_connection_config -- TLS, pool size, timeout parsing

src/workspace/mod.rs -- 75.9% -> 95% (+109 lines)

Memory operations (write, read, search, tree).
Tests to write:
- test_workspace_write_read -- write document, read it back
- test_workspace_search_hybrid -- FTS + vector search via RRF
- test_workspace_tree -- directory listing of memory filesystem
- test_workspace_overwrite -- update existing document

src/workspace/embeddings.rs -- 35.1% -> 95% (~100 lines)

Embedding provider abstraction.
Tests to write:
- test_embedding_dimension_handling -- verify dimension config
- test_embedding_batch_processing -- batch multiple chunks
- test_embedding_provider_fallback -- graceful degradation when unavailable

End-to-end tests that exercise the agent loop, worker, scheduler, and dispatcher
by replaying LLM traces through TestRig (see tests/support/test_rig.rs). Each
trace test covers multiple modules simultaneously, making them high-leverage.
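
To show why trace replay is high-leverage, here is a minimal sketch of the idea: a scripted LLM returns recorded steps in order while a small loop drives them. All names here (`ReplayLlm`, `Step`, `run_trace`) are hypothetical. The real TestRig and TraceLlm APIs live in tests/support/ and are not reproduced in this plan.

```rust
use std::collections::VecDeque;

/// One scripted LLM turn in a recorded trace (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub enum Step {
    ToolCall { name: String, args: String },
    Final(String),
}

/// Replays recorded steps in order instead of calling a real provider.
pub struct ReplayLlm {
    steps: VecDeque<Step>,
}

impl ReplayLlm {
    pub fn new(steps: Vec<Step>) -> Self {
        Self { steps: steps.into() }
    }

    pub fn next_step(&mut self) -> Option<Step> {
        self.steps.pop_front()
    }
}

/// Minimal agent-style loop: consume tool calls until the trace yields a
/// final answer. Returns the final text and the names of the tools invoked.
pub fn run_trace(
    llm: &mut ReplayLlm,
    max_turns: usize,
) -> Result<(String, Vec<String>), String> {
    let mut tools_called = Vec::new();
    for _ in 0..max_turns {
        match llm.next_step() {
            // A real loop would dispatch the tool here; the sketch only records it.
            Some(Step::ToolCall { name, .. }) => tools_called.push(name),
            Some(Step::Final(text)) => return Ok((text, tools_called)),
            None => return Err("trace ended without a final response".to_string()),
        }
    }
    Err("turn limit reached".to_string())
}
```

Because the provider is deterministic, one fixture can assert on the final response, the tool sequence, and the turn limit in a single run.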
Each trace test needs:
- A JSON trace fixture in tests/fixtures/llm_traces/
- A test in tests/ using TestRigBuilder

Covers: agent/thread_ops.rs (+710 lines)
Test thread creation, listing, switching, and deletion via trace replay.
Fixture: thread_operations.json
Tests:
- test_thread_create_and_switch -- create thread, switch to it, verify context
- test_thread_list -- list all threads, verify metadata
- test_thread_delete -- delete thread, verify removal
- test_thread_switch_nonexistent -- error handling for missing thread

Covers: agent/commands.rs (+557 lines)
Test slash commands through the agent loop.
Fixture: agent_commands.json
Tests:
- test_command_help -- /help returns command list
- test_command_clear -- /clear resets conversation
- test_command_compact -- /compact triggers summarization
- test_command_undo_redo -- /undo then /redo restores state
- test_command_status -- /status shows agent state

Covers: worker/job.rs (+413 lines), agent/agent_loop.rs (+207 lines)
Test multi-turn tool calling, error recovery, and completion flows.
Fixture: worker_multi_turn.json
Tests:
- test_worker_sequential_tools -- call tool A, then tool B based on A's result
- test_worker_tool_error_recovery -- tool fails, agent retries or adapts
- test_worker_max_turns -- verify turn limit enforcement

Covers: agent/scheduler.rs (+235 lines)
Test parallel job dispatch and completion tracking.
Fixture: scheduler_parallel.json
Tests:
- test_scheduler_parallel_dispatch -- dispatch 3 jobs, all complete
- test_scheduler_job_dependency -- job B waits for job A
- test_scheduler_stuck_detection -- detect and recover stuck job

Covers: agent/dispatcher.rs (+153 lines)
Test skill-aware routing and tool attenuation.
Fixture: dispatcher_skills.json
Tests:
- test_dispatcher_skill_match -- match message to skill, inject prompt
- test_dispatcher_tool_attenuation -- installed skill loses dangerous tools
- test_dispatcher_no_skill -- fallback when no skill matches

Covers: agent/routine_engine.rs (~80 lines), agent/routine.rs (~40 lines)
Test cron tick and event-triggered routine execution.
Fixture: routine_execution.json
Tests:
- test_routine_cron_trigger -- routine fires on schedule
- test_routine_event_trigger -- routine fires on matching event
- test_routine_guardrails -- routine respects policy constraints

Covers: agent/compaction.rs (~50 lines), agent/context_monitor.rs (~30 lines)
Test turn summarization and memory pressure detection.
Fixture: compaction_flow.json
Tests:
- test_compaction_triggers_at_threshold -- summarize when context exceeds limit
- test_compaction_preserves_recent -- keep recent turns intact
- test_context_pressure_warning -- emit warning at high usage

Covers: tools/builtin/job.rs (+308 lines), tools/builtin/skill_tools.rs (+110 lines)
Test job and skill management tools through agent execution.
Fixture: job_and_skill_tools.json
Tests:
- test_create_and_list_jobs -- create job, list shows it
- test_job_status_query -- query status of running job
- test_skill_list_and_search -- list local skills, search registry

Covers: tools/builtin/memory.rs (~20 lines), workspace/ (+109 lines)
Test memory operations through agent tool calls.
Fixture: memory_tools.json
Tests:
- test_memory_write_and_search -- write doc, search finds it
- test_memory_read_by_path -- read specific document
- test_memory_tree -- list memory filesystem structure

Covers: tools/builtin/extension_tools.rs (~40 lines)
Test extension lifecycle via agent tool calls.
Fixture: extension_management.json
Tests:
- test_extension_install_via_tool -- agent installs an extension
- test_extension_auth_via_tool -- agent configures auth
- test_extension_activate_via_tool -- agent activates extension

Covers: agent/self_repair.rs (~40 lines)
Test stuck job detection and recovery.
Fixture: self_repair.json
Tests:
- test_stuck_job_detected -- job stuck for > threshold triggers repair
- test_stuck_job_recovered -- recovery restarts job successfully
- test_stuck_job_fails_permanently -- recovery fails, job marked failed

Covers: agent/heartbeat.rs (+80 lines)
Test periodic proactive execution.
Fixture: heartbeat.json
Tests:
- test_heartbeat_periodic_fire -- heartbeat triggers at interval
- test_heartbeat_reads_checklist -- reads HEARTBEAT.md, processes items
- test_heartbeat_notification -- sends notification on findings

Test HTTP handlers and SSE/WS endpoints using axum_test or
tower::ServiceExt::oneshot with a real router and in-memory database.

src/channels/web/server.rs -- 50% -> 95% (+893 lines)

The single biggest web gap. 40+ API endpoints.
Tests to write:
- test_api_health -- GET /health returns 200
- test_api_chat_submit -- POST /api/chat sends message
- test_api_jobs_list -- GET /api/jobs returns job list
- test_api_jobs_create -- POST /api/jobs creates job
- test_api_routines_crud -- full CRUD cycle for routines
- test_api_settings_get_set -- GET/PUT settings
- test_api_memory_search -- POST /api/memory/search
- test_api_extensions_list -- GET /api/extensions
- test_api_skills_list -- GET /api/skills
- test_api_sse_connect -- SSE stream connects and receives events
- test_api_auth_required -- endpoints reject missing/bad tokens
- test_api_cors_headers -- verify CORS configuration

src/channels/web/handlers/chat.rs -- 26.1% -> 95% (+388 lines)

Chat message submission and SSE streaming.
Tests to write:
- test_chat_submit_message -- submit message, receive response
- test_chat_sse_stream -- verify SSE event format
- test_chat_thread_context -- messages scoped to thread
- test_chat_invalid_payload -- reject malformed requests

src/channels/web/handlers/jobs.rs -- 11.1% -> 95% (+430 lines)

Job CRUD endpoints.
Tests to write:
- test_jobs_list_empty -- empty list returns []
- test_jobs_create_and_get -- create, then GET by ID
- test_jobs_cancel -- cancel running job
- test_jobs_filter_by_status -- filter by pending/running/completed
- test_jobs_pagination -- limit/offset parameters

src/channels/web/handlers/routines.rs -- 0% -> 95% (+236 lines)

Routine CRUD endpoints.
Tests to write:
- test_routines_create -- POST creates routine
- test_routines_list -- GET lists all routines
- test_routines_update -- PUT updates routine config
- test_routines_delete -- DELETE removes routine
- test_routines_history -- GET history for a routine

src/channels/web/handlers/extensions.rs -- 0% -> 95% (+129 lines)

Extension management endpoints.
Tests to write:
- test_extensions_list -- list installed extensions
- test_extensions_install -- install from manifest URL
- test_extensions_activate -- activate/deactivate toggle
- test_extensions_remove -- remove installed extension

src/channels/web/handlers/memory.rs -- 0% -> 95% (+110 lines)

Memory/workspace endpoints.
Tests to write:
- test_memory_search -- search returns ranked results
- test_memory_write -- write a document
- test_memory_read -- read by path
- test_memory_tree -- tree returns filesystem structure

src/channels/web/handlers/settings.rs -- 0% -> 95% (+103 lines)

Settings endpoints.
Tests to write:
- test_settings_get -- retrieve current settings
- test_settings_update -- update individual setting
- test_settings_validation -- reject invalid setting values

src/channels/web/handlers/static_files.rs -- 0% -> 95% (+97 lines)

Static file serving.
Tests to write:
- test_static_index_html -- GET / serves index.html
- test_static_css_js -- serve CSS/JS with correct content types
- test_static_404 -- missing file returns 404

src/channels/wasm/wrapper.rs -- 58.2% -> 95% (+822 lines)

WASM channel wrapper (message routing, lifecycle).
Tests to write:
- test_wasm_channel_start -- initialize WASM channel module
- test_wasm_channel_message_routing -- route incoming message to WASM
- test_wasm_channel_response -- return WASM response to caller
- test_wasm_channel_error_handling -- handle WASM trap gracefully
- test_wasm_channel_lifecycle -- start, process, shutdown

src/channels/wasm/loader.rs -- 38.1% -> 95% (+141 lines)

WASM channel discovery.
Tests to write:
- test_channel_loader_scan -- find channel WASM modules
- test_channel_loader_validation -- reject invalid modules
- test_channel_loader_manifest -- parse channel capabilities

src/channels/wasm/storage.rs -- 0% -> 95% (+172 lines)

WASM channel state persistence.
Tests to write:
- test_channel_storage_save_load -- persist and restore channel state
- test_channel_storage_isolation -- per-channel state isolation
- test_channel_storage_cleanup -- remove state on channel uninstall

src/channels/signal.rs -- 74% -> 95% (+381 lines)

Signal protocol channel.
Tests to write:
- test_signal_message_send -- send encrypted message
- test_signal_message_receive -- decrypt incoming message
- test_signal_attachment_handling -- handle media attachments
- test_signal_group_message -- group chat routing
- test_signal_error_handling -- handle connection failures

src/channels/repl.rs -- 0% -> 95% (+221 lines)

Simple REPL channel.
Tests to write:
- test_repl_input_parsing -- parse user input lines
- test_repl_output_formatting -- format agent responses
- test_repl_multiline -- handle multi-line input
- test_repl_special_commands -- handle /quit, /help

CLI subcommands can be tested by invoking clap-parsed command structs directly or by calling the handler functions with constructed arguments.
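
As a rough sketch of the "handler function with constructed arguments" approach: the handler takes a plain argument struct plus a writer, so a test can build the arguments by hand and capture output in a buffer. `ToolListArgs`, `ToolInfo`, and `run_tool_list` are hypothetical names. In the real CLI the struct would presumably derive clap's parser traits, which is omitted here to keep the example dependency-free.

```rust
use std::io::{self, Write};

/// Hypothetical argument struct, standing in for a clap-derived one.
pub struct ToolListArgs {
    pub filter: Option<String>,
    pub verbose: bool,
}

/// Hypothetical record describing an installed tool.
pub struct ToolInfo {
    pub name: String,
    pub version: String,
}

/// Handler that writes to any `Write` implementation rather than stdout,
/// so tests can inspect exactly what would be printed.
/// Returns the number of tools listed.
pub fn run_tool_list(
    args: &ToolListArgs,
    installed: &[ToolInfo],
    out: &mut impl Write,
) -> io::Result<usize> {
    let mut shown = 0;
    for tool in installed {
        // With no filter every tool matches; otherwise match by substring.
        let matches = args
            .filter
            .as_deref()
            .map_or(true, |f| tool.name.contains(f));
        if !matches {
            continue;
        }
        if args.verbose {
            writeln!(out, "{} {}", tool.name, tool.version)?;
        } else {
            writeln!(out, "{}", tool.name)?;
        }
        shown += 1;
    }
    Ok(shown)
}
```

Passing a `Vec<u8>` as `out` in tests and `std::io::stdout()` in production keeps the handler identical in both cases.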
src/cli/tool.rs -- 2.9% -> 95% (+697 lines)

Tool CLI (install, list, remove, build).
Tests to write:
- test_cli_tool_list -- list installed tools
- test_cli_tool_install_local -- install from local .wasm file
- test_cli_tool_install_registry -- install from registry
- test_cli_tool_remove -- remove installed tool
- test_cli_tool_build -- scaffold and build tool project
- test_cli_tool_info -- display tool details

src/cli/mcp.rs -- 0.9% -> 95% (+302 lines)

MCP server management CLI.
Tests to write:
- test_cli_mcp_list -- list configured MCP servers
- test_cli_mcp_add -- add MCP server config
- test_cli_mcp_remove -- remove MCP server config
- test_cli_mcp_tools -- list tools from MCP server
- test_cli_mcp_test_connection -- verify MCP server reachable

src/cli/oauth_defaults.rs -- 54.1% -> 95% (+298 lines)

OAuth default configurations.
Tests to write:
- test_oauth_defaults_loading -- load default OAuth configs
- test_oauth_url_construction -- build auth/token URLs
- test_oauth_scope_merging -- merge requested scopes with defaults
- test_oauth_provider_lookup -- lookup by provider name

src/cli/registry.rs -- 0% -> 95% (+168 lines)

Registry CLI commands.
Tests to write:
- test_cli_registry_search -- search for packages
- test_cli_registry_install -- install package from registry
- test_cli_registry_info -- display package details

src/cli/status.rs -- 0% -> 95% (+142 lines)

Status display commands.
Tests to write:
- test_cli_status_gathering -- collect system status info
- test_cli_status_formatting -- render status output
- test_cli_status_components -- check individual components

src/cli/memory.rs -- 15.5% -> 95% (+138 lines)

Memory CLI subcommands.
Tests to write:
- test_cli_memory_search -- search workspace from CLI
- test_cli_memory_write -- write document from CLI
- test_cli_memory_read -- read document from CLI
- test_cli_memory_tree -- display memory tree

src/cli/doctor.rs -- 28.7% -> 95% (+115 lines)

Diagnostic checks.
Tests to write:
- test_doctor_check_database -- verify DB connectivity check
- test_doctor_check_llm -- verify LLM provider check
- test_doctor_check_tools -- verify tool availability check
- test_doctor_report_format -- verify output format

src/cli/config.rs -- 36.5% -> 95% (~100 lines)

Config CLI subcommands.
Tests to write:
- test_cli_config_get -- read config value
- test_cli_config_set -- write config value
- test_cli_config_list -- list all config keys
- test_cli_config_reset -- reset to defaults

Hardest to test: interactive wizards, Docker, process spawning. Strategy: extract pure logic into testable functions, test the interactive parts by injecting mock input.
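
One way to apply this strategy, sketched under the assumption that prompts can be made generic over their input and output streams: the parsing rule is a pure function, and the interactive wrapper reads from any `BufRead`, so a test can feed it scripted answers. `prompt_confirm` and `parse_confirm` are hypothetical names, not the actual helpers in src/setup/prompts.rs.

```rust
use std::io::{self, BufRead, Write};

/// Pure parsing logic, testable without any I/O.
/// An empty answer selects the default; unrecognized input yields `None`.
pub fn parse_confirm(raw: &str, default: bool) -> Option<bool> {
    match raw.trim().to_ascii_lowercase().as_str() {
        "" => Some(default),
        "y" | "yes" => Some(true),
        "n" | "no" => Some(false),
        _ => None,
    }
}

/// Interactive wrapper, generic over reader and writer so tests can inject
/// mock input (for example a `Cursor`) and capture the prompt text.
pub fn prompt_confirm<R: BufRead, W: Write>(
    input: &mut R,
    output: &mut W,
    question: &str,
    default: bool,
) -> io::Result<bool> {
    let hint = if default { "[Y/n]" } else { "[y/N]" };
    loop {
        write!(output, "{} {} ", question, hint)?;
        output.flush()?;

        let mut line = String::new();
        if input.read_line(&mut line)? == 0 {
            // End of input: fall back to the default instead of looping forever.
            return Ok(default);
        }
        match parse_confirm(&line, default) {
            Some(answer) => return Ok(answer),
            None => writeln!(output, "Please answer y or n.")?,
        }
    }
}
```

In production the caller would pass a locked stdin and stdout; in tests, a `std::io::Cursor` over a string of scripted answers and a `Vec<u8>` serve the same roles.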
src/setup/wizard.rs -- 16.8% -> 95% (+1,681 lines)

7-step interactive onboarding wizard. Refactor to extract validation functions, step logic, and config generation into testable units.
Tests to write:
- test_wizard_step_validation -- each step validates input correctly
- test_wizard_config_generation -- generate config from wizard answers
- test_wizard_default_values -- verify sensible defaults
- test_wizard_skip_completed -- skip already-configured steps
- test_wizard_llm_backend_selection -- provider-specific config paths
- test_wizard_channel_setup -- channel configuration logic

src/setup/channels.rs -- 7.6% -> 95% (+563 lines)

Channel setup helpers.
Tests to write:
- test_channel_setup_defaults -- default channel configuration
- test_channel_setup_validation -- reject invalid channel configs
- test_channel_setup_telegram -- Telegram-specific setup logic
- test_channel_setup_signal -- Signal-specific setup logic
- test_channel_setup_webhook -- webhook URL validation

src/setup/prompts.rs -- 24.8% -> 95% (+147 lines)

Terminal prompt utilities.
Tests to write:
- test_prompt_select -- selection from list
- test_prompt_confirm -- yes/no confirmation
- test_prompt_secret -- masked input
- test_prompt_validation -- input validation rules

src/sandbox/container.rs -- 22.1% -> 95% (+296 lines)

Docker container lifecycle. Test command construction without actual Docker.
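
A minimal sketch of testing command construction without Docker: a pure function turns a config struct into the argument list that would be passed to `docker`, and tests assert on the list. The `ContainerConfig` fields, the `/workspace` mount point, and the sensitive-name markers are all assumptions for illustration. Only the standard `docker run` flags (`--rm`, `--memory`, `--cpus`, `-v`, `--network`, `-e`) are taken as given.

```rust
/// Hypothetical, simplified container configuration.
pub struct ContainerConfig {
    pub image: String,
    pub memory_mb: u64,
    pub cpus: f32,
    pub workspace_host_path: String,
    pub network: Option<String>,
    pub env: Vec<(String, String)>,
}

/// Assumed rule: variables whose names contain these markers are scrubbed.
const SENSITIVE_MARKERS: [&str; 3] = ["KEY", "TOKEN", "SECRET"];

/// Build the arguments for `docker run` without spawning any process.
pub fn docker_run_args(cfg: &ContainerConfig) -> Vec<String> {
    let mut args = vec![
        "run".to_string(),
        "--rm".to_string(),
        "--memory".to_string(),
        format!("{}m", cfg.memory_mb),
        "--cpus".to_string(),
        cfg.cpus.to_string(),
        "-v".to_string(),
        // The container-side mount point is an assumption of this sketch.
        format!("{}:/workspace", cfg.workspace_host_path),
    ];
    if let Some(net) = &cfg.network {
        args.push("--network".to_string());
        args.push(net.clone());
    }
    for (key, value) in &cfg.env {
        let upper = key.to_ascii_uppercase();
        if SENSITIVE_MARKERS.iter().any(|m| upper.contains(*m)) {
            continue; // never forward secrets into the container
        }
        args.push("-e".to_string());
        args.push(format!("{}={}", key, value));
    }
    // The image name comes last, after all options.
    args.push(cfg.image.clone());
    args
}
```

The production code would hand this vector to `std::process::Command::new("docker").args(...)`; the tests never need to.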
Tests to write:
- test_container_config_to_docker_args -- generate correct docker run args
- test_container_volume_mounts -- workspace mount configuration
- test_container_env_scrubbing -- sensitive env vars removed
- test_container_resource_limits -- CPU/memory limit args
- test_container_network_config -- proxy network setup

src/sandbox/manager.rs -- 59% -> 95% (+114 lines)

Sandbox orchestration.
Tests to write:
- test_sandbox_policy_enforcement -- policy to container config mapping
- test_sandbox_cleanup -- cleanup on job completion
- test_sandbox_concurrent_limit -- enforce max concurrent containers

src/sandbox/proxy/http.rs -- 37.5% -> 95% (+176 lines)

HTTP proxy for container network access.
Tests to write:
- test_proxy_allowlist_enforcement -- block disallowed domains
- test_proxy_credential_injection -- inject auth headers
- test_proxy_connect_tunnel -- HTTPS CONNECT method handling
- test_proxy_logging -- request/response logging

src/worker/container.rs -- 5.7% -> 95% (+312 lines)

Worker execution loop (runs inside containers).
Tests to write:
- test_worker_tool_dispatch -- dispatch tool call, return result
- test_worker_llm_interaction -- send prompt, receive response
- test_worker_turn_limit -- enforce max turns
- test_worker_error_propagation -- tool error surfaces to agent

src/worker/claude_bridge.rs -- 60.7% -> 95% (+215 lines)

Claude CLI bridge.
Tests to write:
- test_claude_command_construction -- build claude CLI command
- test_claude_output_parsing -- parse claude CLI JSON output
- test_claude_error_handling -- handle CLI crashes gracefully
- test_claude_config_injection -- inject config dir and model

src/worker/api.rs -- 19.8% -> 95% (+194 lines)

Worker HTTP client to orchestrator.
Tests to write:
- test_worker_api_request_building -- correct endpoint URLs and headers
- test_worker_api_response_parsing -- parse orchestrator responses
- test_worker_api_auth_token -- bearer token injection
- test_worker_api_retry -- retry on transient failures

src/main.rs -- 29.4% -> 95% (+485 lines)

Entry point and startup. Extract startup logic into testable functions.
Tests to write:
- test_cli_arg_parsing -- verify clap argument parsing
- test_startup_config_loading -- config from env + file
- test_startup_channel_selection -- select channels from config
- test_startup_feature_flags -- feature-gated code paths

Smaller files that each need a handful of additional tests.
| File | Lines Needed | Test Focus |
|---|---|---|
| src/tools/builtin/skill_tools.rs | 110 | skill_list, skill_search, skill_install, skill_remove |
| src/hooks/bundled.rs | 115 | bundled hook execution, hook discovery |
| src/registry/installer.rs | 272 | package download, verification, installation |
| src/registry/artifacts.rs | 72 | artifact packaging, checksums |
| src/orchestrator/job_manager.rs | 249 | container lifecycle, job routing |
| src/orchestrator/api.rs | 125 | LLM proxy, event dispatch endpoints |
| src/app.rs | 137 | AppBuilder configuration, startup sequence |
| src/service.rs | 120 | service lifecycle, signal handling |
| src/config/channels.rs | 55 | channel config parsing |
| src/config/sandbox.rs | 61 | sandbox config parsing |
| src/config/tunnel.rs | 43 | tunnel config parsing |
| src/config/mod.rs | 63 | config merging, env override |
| src/config/database.rs | 38 | database URL parsing |
| src/evaluation/success.rs | 34 | success evaluator logic |
| src/evaluation/metrics.rs | 40 | metrics collection |
| src/context/manager.rs | 57 | concurrent job context isolation |
| src/context/memory.rs | 36 | action recording, conversation memory |
Maximize coverage gain per unit of effort:
| Order | Category | Lines Gained | Effort |
|---|---|---|---|
| 1 | Trace tests (Tier 2) | ~7,000 | Medium (high leverage, each test covers many modules) |
| 2 | Unit tests for 0% files (Tier 1 subset) | ~3,500 | Low (pure logic, no infrastructure) |
| 3 | Web handler tests (Tier 3) | ~4,500 | Medium (axum_test + in-memory DB) |
| 4 | Extension/MCP/WASM unit tests (Tier 1 remainder) | ~3,500 | Medium |
| 5 | CLI subcommand tests (Tier 4) | ~2,100 | Low-Medium |
| 6 | Setup wizard extraction + tests (Tier 5) | ~2,400 | High (requires refactoring) |
| 7 | LLM provider tests (Tier 1 subset) | ~800 | Medium |
| 8 | Remaining small files (Tier 6) | ~2,000 | Low |
- Run with --features libsql and use TestRigBuilder from tests/support/
- Use axum::test helpers or build the router directly
- Use TraceLlm for the LLM provider, same as trace tests