IronClaw Coverage Plan: 63.3% to 95%

Generated 2025-03-06 from Codecov

Current State

Metric	Value
Current coverage	48,571 / 76,694 lines = 63.33%
Target	72,859 / 76,694 lines = 95.0%
Gap	24,288 lines need coverage
Files >= 95%	43 / 239
Files < 95%	196 (27,872 total misses)

Module Summary

Sorted by uncovered lines (descending):

Module	Lines	Hits	Miss	Coverage	Priority
`channels/`	14,079	8,677	5,402	61.6%	P0
`tools/`	13,445	9,407	4,038	70.0%	P1
`agent/`	9,152	6,096	3,056	66.6%	P0
`setup/`	3,005	462	2,543	15.4%	P1
`extensions/`	3,540	1,298	2,242	36.7%	P0
`cli/`	2,834	697	2,137	24.6%	P1
`history/`	1,626	0	1,626	0.0%	P0
`llm/`	7,029	5,776	1,253	82.2%	P2
`(root)`	4,122	3,121	1,001	75.7%	P2
`worker/`	1,274	480	794	37.7%	P1
`sandbox/`	1,615	897	718	55.5%	P2
`registry/`	1,588	1,107	481	69.7%	P2
`db/`	921	441	480	47.9%	P1
`workspace/`	2,006	1,584	422	79.0%	P2
`orchestrator/`	1,199	795	404	66.3%	P2
`config/`	1,464	1,095	369	74.8%	P2
`hooks/`	1,379	1,081	298	78.4%	P2
`secrets/`	687	407	280	59.2%	P2
`skills/`	1,714	1,585	129	92.5%	P3
`context/`	693	586	107	84.6%	P3
`estimation/`	467	369	98	79.0%	P3
`safety/`	1,424	1,337	87	93.9%	P3
`evaluation/`	226	152	74	67.3%	P3
`pairing/`	498	446	52	89.6%	P3
`tunnel/`	391	368	23	94.1%	P3
`observability/`	316	307	9	97.2%	Done

Top 40 Files by Uncovered Lines

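Bringing these 40 files to 95% would cover roughly 17,900 additional lines, about three quarters of the 24,288-line gap. Rows are sorted by "Lines to 95%" (descending):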

File	Lines	Miss	Coverage	Lines to 95%
`src/extensions/manager.rs`	2,404	2,083	13.3%	1,962
`src/setup/wizard.rs`	2,150	1,789	16.8%	1,681
`src/history/store.rs`	1,486	1,486	0.0%	1,411
`src/channels/web/server.rs`	1,985	993	50.0%	893
`src/channels/wasm/wrapper.rs`	2,237	934	58.2%	822
`src/agent/thread_ops.rs`	1,044	763	26.9%	710
`src/cli/tool.rs`	757	735	2.9%	697
`src/setup/channels.rs`	645	596	7.6%	563
`src/agent/commands.rs`	587	587	0.0%	557
`src/main.rs`	740	522	29.4%	485
`src/channels/web/handlers/jobs.rs`	513	456	11.1%	430
`src/tools/builder/core.rs`	524	456	13.0%	429
`src/worker/job.rs`	1,078	467	56.7%	413
`src/channels/web/handlers/chat.rs`	564	417	26.1%	388
`src/tools/wasm/wrapper.rs`	1,005	436	56.6%	385
`src/channels/signal.rs`	1,814	472	74.0%	381
`src/tools/mcp/auth.rs`	472	378	19.9%	354
`src/worker/container.rs`	350	330	5.7%	312
`src/tools/builtin/job.rs`	1,014	359	64.6%	308
`src/cli/mcp.rs`	322	319	0.9%	302
`src/cli/oauth_defaults.rs`	730	335	54.1%	298
`src/llm/nearai_chat.rs`	854	340	60.2%	297
`src/sandbox/container.rs`	407	317	22.1%	296
`src/tools/mcp/client.rs`	341	291	14.7%	273
`src/registry/installer.rs`	765	311	59.3%	272
`src/orchestrator/job_manager.rs`	405	270	33.3%	249
`src/channels/web/handlers/routines.rs`	249	249	0.0%	236
`src/agent/scheduler.rs`	559	263	53.0%	235
`src/tools/wasm/storage.rs`	296	243	17.9%	228
`src/channels/repl.rs`	233	233	0.0%	221
`src/llm/session.rs`	413	242	41.4%	221
`src/worker/claude_bridge.rs`	629	247	60.7%	215
`src/agent/agent_loop.rs`	523	234	55.2%	207
`src/worker/api.rs`	258	207	19.8%	194
`src/sandbox/proxy/http.rs`	307	192	37.5%	176
`src/channels/wasm/storage.rs`	182	182	0.0%	172
`src/cli/registry.rs`	177	177	0.0%	168
`src/llm/reasoning.rs`	1,163	219	81.2%	160
`src/tools/builder/testing.rs`	308	174	43.5%	158
`src/db/postgres.rs`	166	166	0.0%	157
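
The "Lines to 95%" column is consistent with taking floor(0.95 x Lines) as the required number of covered lines and subtracting the current hits. A minimal sketch of that arithmetic follows. The function name `lines_to_target` is hypothetical, and the flooring rule is inferred from the table values rather than confirmed by Codecov.

```rust
/// Lines that must become covered for a file to reach `target_pct` coverage.
/// Assumption: the required hit count is floor(total * target_pct / 100),
/// which reproduces the "Lines to 95%" values in the table above.
pub fn lines_to_target(total: u64, hits: u64, target_pct: u64) -> u64 {
    let required_hits = total * target_pct / 100; // integer division floors
    required_hits.saturating_sub(hits)
}

#[test]
fn matches_plan_figures() {
    // src/extensions/manager.rs: 2,404 lines, 2,083 misses.
    assert_eq!(lines_to_target(2_404, 2_404 - 2_083, 95), 1_962);
    // Whole project: 76,694 lines, 48,571 hits.
    assert_eq!(lines_to_target(76_694, 48_571, 95), 24_288);
    // observability/ is already above target, so nothing is left to cover.
    assert_eq!(lines_to_target(316, 307, 95), 0);
}
```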

Tier 1 -- High-Impact Unit Tests (~8,500 lines)

Pure logic, serialization, and database queries that can be tested in isolation, without real infrastructure. These offer a high coverage gain per unit of effort.

`src/history/store.rs` -- 0% -> 95% (+1,411 lines)

PostgreSQL repository layer (conversations, jobs, actions, LLM calls, estimation snapshots). Test query construction and result mapping, using either the libSQL backend as a real in-memory database or test doubles for the Database trait.

Tests to write:

test_store_conversation_crud -- create, read, update, delete conversations
test_store_job_lifecycle -- insert job, update status through state machine
test_store_action_recording -- record and query job actions
test_store_llm_call_tracking -- insert and aggregate LLM call records
test_store_estimation_snapshots -- save and retrieve estimation data
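
As an illustration of the test-double route, the sketch below shows the shape a CRUD test could take. The `ConversationStore` trait and `InMemoryStore` type are hypothetical stand-ins; the real Database trait in IronClaw will have different methods and is likely async.

```rust
use std::collections::HashMap;

/// Hypothetical storage trait used only for this sketch.
pub trait ConversationStore {
    fn create(&mut self, title: &str) -> u64;
    fn get(&self, id: u64) -> Option<&str>;
    fn rename(&mut self, id: u64, title: &str) -> bool;
    fn delete(&mut self, id: u64) -> bool;
}

/// In-memory test double backed by a HashMap.
#[derive(Default)]
pub struct InMemoryStore {
    next_id: u64,
    rows: HashMap<u64, String>,
}

impl ConversationStore for InMemoryStore {
    fn create(&mut self, title: &str) -> u64 {
        self.next_id += 1;
        self.rows.insert(self.next_id, title.to_owned());
        self.next_id
    }
    fn get(&self, id: u64) -> Option<&str> {
        self.rows.get(&id).map(String::as_str)
    }
    fn rename(&mut self, id: u64, title: &str) -> bool {
        match self.rows.get_mut(&id) {
            Some(row) => {
                *row = title.to_owned();
                true
            }
            None => false,
        }
    }
    fn delete(&mut self, id: u64) -> bool {
        self.rows.remove(&id).is_some()
    }
}

#[test]
fn test_store_conversation_crud() {
    let mut store = InMemoryStore::default();
    let id = store.create("first");
    assert_eq!(store.get(id), Some("first"));
    assert!(store.rename(id, "renamed"));
    assert_eq!(store.get(id), Some("renamed"));
    assert!(store.delete(id));
    assert_eq!(store.get(id), None);
}
```

A double like this exercises callers of the trait, not the SQL in store.rs itself, so it complements the libSQL-backed tests without replacing them.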

`src/history/analytics.rs` -- 0% -> 95% (+133 lines)

Aggregation queries (JobStats, ToolStats). Test the query builders and result deserialization.

Tests to write:

test_job_stats_aggregation -- verify counts, durations, success rates
test_tool_stats_ranking -- verify tool usage frequency sorting
test_analytics_empty_db -- graceful handling of no data

`src/extensions/manager.rs` -- 13.3% -> 95% (+1,962 lines)

Largest single file gap. Extension lifecycle orchestration (install, auth, activate, remove), config parsing, and state transitions.

Tests to write:

test_extension_install_from_manifest -- parse manifest, create extension record
test_extension_auth_flow -- OAuth token setup, credential storage
test_extension_activate_deactivate -- state transitions, tool registration
test_extension_remove_cleanup -- remove extension, clean up artifacts
test_extension_config_validation -- reject invalid configs, handle defaults
test_extension_list_filtering -- filter by status, type, search query
test_extension_capability_check -- verify required capabilities before activation

`src/extensions/discovery.rs` -- 27.8% -> 95% (+125 lines)

Extension discovery from filesystem and registry.

Tests to write:

test_discover_local_extensions -- scan directory, parse manifests
test_discover_skip_invalid -- gracefully skip malformed extension dirs
test_discover_dedup -- handle duplicate extensions across paths

`src/tools/builder/core.rs` -- 13% -> 95% (+429 lines)

BuildRequirement, SoftwareType, Language types and project scaffolding.

Tests to write:

test_build_requirement_parsing -- deserialize from JSON
test_scaffold_project_structure -- verify generated file tree
test_language_detection -- detect language from file extensions
test_software_type_constraints -- validate type-specific requirements

`src/tools/builder/testing.rs` -- 43.5% -> 95% (+158 lines)

Test harness integration for built tools.

Tests to write:

test_harness_setup_teardown -- lifecycle of test environment
test_harness_run_tests -- execute tests and capture results
test_harness_failure_reporting -- verify error details on test failure

`src/tools/mcp/auth.rs` -- 19.9% -> 95% (+354 lines)

OAuth token management for MCP servers.

Tests to write:

test_token_refresh_on_expiry -- auto-refresh when token expires
test_token_header_injection -- correct Authorization header format
test_token_persistence -- save/load tokens across restarts
test_oauth_pkce_flow -- code verifier/challenge generation
test_auth_config_parsing -- parse various auth config formats
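
The expiry and header tests are easiest to write when the clock is passed in as a parameter. The sketch below assumes a hypothetical `Token` struct with an epoch-seconds expiry; the field names and the refresh skew are illustrative and not taken from auth.rs.

```rust
/// Hypothetical token record; field names are illustrative.
pub struct Token {
    pub access_token: String,
    /// Expiry as seconds since the Unix epoch.
    pub expires_at: u64,
}

impl Token {
    /// True when the token has expired or will expire within `skew_secs`.
    /// `now` is injected so tests do not depend on the system clock.
    pub fn needs_refresh(&self, now: u64, skew_secs: u64) -> bool {
        now.saturating_add(skew_secs) >= self.expires_at
    }

    /// Value for the HTTP `Authorization` header using the Bearer scheme.
    pub fn authorization_header(&self) -> String {
        format!("Bearer {}", self.access_token)
    }
}

#[test]
fn test_token_refresh_on_expiry() {
    let token = Token { access_token: "abc".to_owned(), expires_at: 1_000 };
    assert!(!token.needs_refresh(900, 30)); // still valid outside the skew window
    assert!(token.needs_refresh(980, 30)); // inside the skew window
    assert!(token.needs_refresh(1_000, 0)); // exactly at expiry
}

#[test]
fn test_token_header_injection() {
    let token = Token { access_token: "abc".to_owned(), expires_at: 1_000 };
    assert_eq!(token.authorization_header(), "Bearer abc");
}
```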

`src/tools/mcp/client.rs` -- 14.7% -> 95% (+273 lines)

JSON-RPC client for MCP protocol.

Tests to write:

test_jsonrpc_request_serialization -- correct JSON-RPC 2.0 format
test_jsonrpc_response_parsing -- handle success, error, and batch responses
test_jsonrpc_error_codes -- map MCP error codes to ToolError
test_tool_list_discovery -- parse tools/list response
test_tool_call_roundtrip -- serialize call, parse result
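
For test_jsonrpc_error_codes, the mapping can be tested as a pure function. The numeric codes below are the standard JSON-RPC 2.0 error codes; the `ToolError` variants are hypothetical and will not match IronClaw's actual enum.

```rust
/// Hypothetical error type; the real ToolError variants may differ.
#[derive(Debug, PartialEq, Eq)]
pub enum ToolError {
    InvalidRequest(String),
    NotFound(String),
    InvalidParameters(String),
    ExecutionFailed(String),
}

/// Map a JSON-RPC 2.0 error code to a tool-level error.
pub fn map_jsonrpc_error(code: i64, message: &str) -> ToolError {
    let msg = message.to_owned();
    match code {
        -32700 | -32600 => ToolError::InvalidRequest(msg), // parse error, invalid request
        -32601 => ToolError::NotFound(msg),                // method not found
        -32602 => ToolError::InvalidParameters(msg),       // invalid params
        _ => ToolError::ExecutionFailed(msg), // -32603 internal error and server-defined codes
    }
}

#[test]
fn test_jsonrpc_error_codes() {
    assert_eq!(
        map_jsonrpc_error(-32601, "no such tool"),
        ToolError::NotFound("no such tool".to_owned())
    );
    assert_eq!(
        map_jsonrpc_error(-32000, "server busy"),
        ToolError::ExecutionFailed("server busy".to_owned())
    );
}
```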

`src/tools/wasm/storage.rs` -- 17.9% -> 95% (+228 lines)

WASM tool persistence (store, load, delete, list).

Tests to write:

test_wasm_tool_store_roundtrip -- store and retrieve tool binary + metadata
test_wasm_tool_delete -- remove tool and verify gone
test_wasm_tool_list_filtering -- filter by name, capability
test_wasm_tool_update_metadata -- update without re-uploading binary

`src/tools/wasm/wrapper.rs` -- 56.6% -> 95% (+385 lines)

Tool trait wrapper for WASM modules.

Tests to write:

test_wasm_param_marshalling -- JSON params to WASM component model types
test_wasm_output_conversion -- WASM return values to ToolOutput
test_wasm_error_propagation -- WASM traps to ToolError
test_wasm_fuel_exhaustion -- verify fuel limit enforcement
test_wasm_memory_limit -- verify memory ceiling

`src/tools/wasm/loader.rs` -- 62.4% -> 95% (+156 lines)

WASM tool discovery from filesystem.

Tests to write:

test_loader_scan_directory -- find .wasm files with capabilities.json
test_loader_skip_invalid -- skip files without valid WIT exports
test_loader_cache_invalidation -- reload when file changes

`src/tools/builtin/job.rs` -- 64.6% -> 95% (+308 lines)

Job management tools (CreateJob, ListJobs, JobStatus, CancelJob).

Tests to write:

test_create_job_params -- validate required/optional parameters
test_list_jobs_formatting -- verify output structure
test_job_status_transitions -- query status at each state
test_cancel_job_running -- cancel an in-progress job
test_cancel_job_completed -- error on already-completed job
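
The two cancel tests reduce to a check on terminal states. A minimal sketch, assuming a hypothetical `JobStatus` enum (the plan mentions pending, running, and completed; the `Failed` and `Cancelled` variants are assumptions):

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum JobStatus {
    Pending,
    Running,
    Completed,
    Failed,
    Cancelled,
}

impl JobStatus {
    /// Terminal states cannot transition any further.
    pub fn is_terminal(self) -> bool {
        matches!(self, JobStatus::Completed | JobStatus::Failed | JobStatus::Cancelled)
    }
}

/// Cancel a job, rejecting jobs that have already finished.
pub fn cancel(status: JobStatus) -> Result<JobStatus, String> {
    if status.is_terminal() {
        Err(format!("cannot cancel job in state {status:?}"))
    } else {
        Ok(JobStatus::Cancelled)
    }
}

#[test]
fn test_cancel_job_running() {
    assert_eq!(cancel(JobStatus::Running), Ok(JobStatus::Cancelled));
}

#[test]
fn test_cancel_job_completed() {
    assert!(cancel(JobStatus::Completed).is_err());
}
```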

`src/secrets/store.rs` -- 48.1% -> 95% (+145 lines)

Encrypted secret storage.

Tests to write:

test_secret_store_roundtrip -- store encrypted, retrieve decrypted
test_secret_update -- overwrite existing secret
test_secret_delete -- remove and verify inaccessible
test_secret_list_redacted -- list shows names but not values

`src/llm/session.rs` -- 41.4% -> 95% (+221 lines)

Session token management with auto-renewal.

Tests to write:

test_session_token_parsing -- parse sess_xxx format
test_session_expiry_detection -- detect expired tokens
test_session_auto_renewal -- trigger renewal before expiry
test_session_concurrent_renewal -- only one renewal in flight
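
Two of these tests can be sketched with the standard library alone: parsing the `sess_` prefix and guarding renewal so that only one caller proceeds. The names `parse_session_token` and `RenewalGate` are hypothetical, and the rule that the part after the prefix must be non-empty is an assumption.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Return the identifier after the `sess_` prefix, or None if malformed.
pub fn parse_session_token(raw: &str) -> Option<&str> {
    let id = raw.trim().strip_prefix("sess_")?;
    if id.is_empty() {
        None
    } else {
        Some(id)
    }
}

/// Lets exactly one caller start a renewal until `finish` is called.
pub struct RenewalGate {
    in_flight: AtomicBool,
}

impl RenewalGate {
    pub const fn new() -> Self {
        Self { in_flight: AtomicBool::new(false) }
    }

    /// Returns true only for the caller that flips the flag from false to true.
    pub fn try_begin(&self) -> bool {
        self.in_flight
            .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
    }

    /// Mark the renewal as done so a later renewal can start.
    pub fn finish(&self) {
        self.in_flight.store(false, Ordering::Release);
    }
}

#[test]
fn test_session_concurrent_renewal() {
    let gate = RenewalGate::new();
    assert!(gate.try_begin());
    assert!(!gate.try_begin()); // second caller is turned away
    gate.finish();
    assert!(gate.try_begin());
}
```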

`src/llm/nearai_chat.rs` -- 60.2% -> 95% (+297 lines)

NEAR AI Chat Completions provider.

Tests to write:

test_nearai_request_building -- correct endpoint, headers, body
test_nearai_response_parsing -- parse streaming and non-streaming responses
test_nearai_tool_message_flattening -- tool messages flattened to text
test_nearai_auth_modes -- session token vs API key auth
test_nearai_error_handling -- rate limits, auth failures, server errors

`src/llm/mod.rs` -- 53.7% -> 95% (+112 lines)

Provider factory and backend selection.

Tests to write:

test_provider_factory_nearai -- select NEAR AI from config
test_provider_factory_openai -- select OpenAI from config
test_provider_factory_ollama -- select Ollama from config
test_provider_factory_invalid -- error on unknown backend
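
Backend selection can be tested without constructing any provider if the config string is first parsed into an enum. The sketch below assumes the config values are `nearai`, `openai`, and `ollama`; those strings and the `Backend` enum are illustrative.

```rust
use std::str::FromStr;

/// Hypothetical backend selector parsed from configuration.
#[derive(Debug, PartialEq, Eq)]
pub enum Backend {
    NearAi,
    OpenAi,
    Ollama,
}

impl FromStr for Backend {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_lowercase().as_str() {
            "nearai" => Ok(Backend::NearAi),
            "openai" => Ok(Backend::OpenAi),
            "ollama" => Ok(Backend::Ollama),
            other => Err(format!("unknown LLM backend: {other}")),
        }
    }
}

#[test]
fn test_provider_factory_invalid() {
    assert!("gpt".parse::<Backend>().is_err());
}

#[test]
fn test_provider_factory_ollama() {
    assert_eq!("Ollama".parse::<Backend>(), Ok(Backend::Ollama));
}
```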

`src/llm/reasoning.rs` -- 81.2% -> 95% (+160 lines)

Planning, tool selection, evaluation logic.

Tests to write:

test_reasoning_step_parsing -- parse planning steps from LLM output
test_tool_selection_scoring -- rank tools by relevance
test_evaluation_rubric -- score completions against criteria
test_reasoning_with_no_tools -- handle tool-less responses

`src/db/postgres.rs` -- 0% -> 95% (+157 lines)

PostgreSQL backend delegation to Store + Repository.

Tests to write:

test_postgres_backend_delegates -- verify delegation pattern (trait-level)
test_postgres_connection_config -- TLS, pool size, timeout parsing

`src/workspace/mod.rs` -- 75.9% -> 95% (+109 lines)

Memory operations (write, read, search, tree).

Tests to write:

test_workspace_write_read -- write document, read it back
test_workspace_search_hybrid -- FTS + vector search via RRF
test_workspace_tree -- directory listing of memory filesystem
test_workspace_overwrite -- update existing document
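
For test_workspace_search_hybrid, it helps to have the fusion step available as a pure function. Reciprocal Rank Fusion (RRF) scores each document as the sum of 1 / (k + rank) over the input rankings. The sketch below is a generic RRF implementation, not IronClaw's; the function name and the choice of k = 60 (a commonly used default) are assumptions.

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion: score(d) = sum over rankings of 1 / (k + rank),
/// where rank starts at 1 for the best hit in each ranking.
pub fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<&str, f64> = HashMap::new();
    for ranking in rankings {
        for (idx, doc) in ranking.iter().enumerate() {
            *scores.entry(*doc).or_insert(0.0) += 1.0 / (k + (idx as f64 + 1.0));
        }
    }
    let mut fused: Vec<(String, f64)> = scores
        .into_iter()
        .map(|(doc, score)| (doc.to_owned(), score))
        .collect();
    // Highest score first; ties are broken by document id for determinism.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}

#[test]
fn test_rrf_prefers_documents_in_both_lists() {
    let fts = vec!["a", "b", "c"];
    let vector = vec!["b", "c", "d"];
    let fused = rrf_fuse(&[fts, vector], 60.0);
    let ids: Vec<&str> = fused.iter().map(|(doc, _)| doc.as_str()).collect();
    // "b" and "c" appear in both rankings, so they outrank "a" and "d".
    assert_eq!(ids, vec!["b", "c", "a", "d"]);
}
```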

`src/workspace/embeddings.rs` -- 35.1% -> 95% (~100 lines)

Embedding provider abstraction.

Tests to write:

test_embedding_dimension_handling -- verify dimension config
test_embedding_batch_processing -- batch multiple chunks
test_embedding_provider_fallback -- graceful degradation when unavailable

Tier 2 -- Trace Tests (~7,000 lines)

End-to-end tests that exercise the agent loop, worker, scheduler, and dispatcher by replaying LLM traces through TestRig (see tests/support/test_rig.rs). Each trace test covers multiple modules simultaneously, making them high-leverage.

Each trace test needs:

A JSON fixture in tests/fixtures/llm_traces/
A test file in tests/ using TestRigBuilder

Trace: Thread Operations

Covers: agent/thread_ops.rs (+710 lines)

Test thread creation, listing, switching, and deletion via trace replay.

Fixture: thread_operations.json

Tests:

test_thread_create_and_switch -- create thread, switch to it, verify context
test_thread_list -- list all threads, verify metadata
test_thread_delete -- delete thread, verify removal
test_thread_switch_nonexistent -- error handling for missing thread

Trace: Agent Commands

Covers: agent/commands.rs (+557 lines)

Test slash commands through the agent loop.

Fixture: agent_commands.json

Tests:

test_command_help -- /help returns command list
test_command_clear -- /clear resets conversation
test_command_compact -- /compact triggers summarization
test_command_undo_redo -- /undo then /redo restores state
test_command_status -- /status shows agent state

Trace: Worker Multi-Turn Execution

Covers: worker/job.rs (+413 lines), agent/agent_loop.rs (+207 lines)

Test multi-turn tool calling, error recovery, and completion flows.

Fixture: worker_multi_turn.json

Tests:

test_worker_sequential_tools -- call tool A, then tool B based on A's result
test_worker_tool_error_recovery -- tool fails, agent retries or adapts
test_worker_max_turns -- verify turn limit enforcement

Trace: Scheduler Parallel Jobs

Covers: agent/scheduler.rs (+235 lines)

Test parallel job dispatch and completion tracking.

Fixture: scheduler_parallel.json

Tests:

test_scheduler_parallel_dispatch -- dispatch 3 jobs, all complete
test_scheduler_job_dependency -- job B waits for job A
test_scheduler_stuck_detection -- detect and recover stuck job

Trace: Dispatcher Skill Selection

Covers: agent/dispatcher.rs (+153 lines)

Test skill-aware routing and tool attenuation.

Fixture: dispatcher_skills.json

Tests:

test_dispatcher_skill_match -- match message to skill, inject prompt
test_dispatcher_tool_attenuation -- installed skill loses dangerous tools
test_dispatcher_no_skill -- fallback when no skill matches

Trace: Routine Execution

Covers: agent/routine_engine.rs (~80 lines), agent/routine.rs (~40 lines)

Test cron tick and event-triggered routine execution.

Fixture: routine_execution.json

Tests:

test_routine_cron_trigger -- routine fires on schedule
test_routine_event_trigger -- routine fires on matching event
test_routine_guardrails -- routine respects policy constraints

Trace: Compaction and Context Pressure

Covers: agent/compaction.rs (~50 lines), agent/context_monitor.rs (~30 lines)

Test turn summarization and memory pressure detection.

Fixture: compaction_flow.json

Tests:

test_compaction_triggers_at_threshold -- summarize when context exceeds limit
test_compaction_preserves_recent -- keep recent turns intact
test_context_pressure_warning -- emit warning at high usage
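
Alongside the trace fixture, the threshold and "keep recent turns" rules can be checked directly if they are exposed as pure functions. The sketch below is illustrative only: the function names, the percentage-based threshold, and the split strategy are assumptions, not the logic in agent/compaction.rs.

```rust
/// Compaction is due once usage reaches `threshold_pct` percent of the limit.
pub fn should_compact(used_tokens: usize, limit: usize, threshold_pct: usize) -> bool {
    limit > 0 && used_tokens.saturating_mul(100) >= limit.saturating_mul(threshold_pct)
}

/// Split turns into (older turns to summarize, recent turns to keep verbatim).
pub fn split_for_compaction<T>(turns: &[T], keep_recent: usize) -> (&[T], &[T]) {
    let cut = turns.len().saturating_sub(keep_recent);
    turns.split_at(cut)
}

#[test]
fn test_compaction_triggers_at_threshold() {
    assert!(!should_compact(799, 1_000, 80));
    assert!(should_compact(800, 1_000, 80));
}

#[test]
fn test_compaction_preserves_recent() {
    let turns = ["t1", "t2", "t3", "t4", "t5"];
    let (older, recent) = split_for_compaction(&turns, 2);
    assert_eq!(older.to_vec(), vec!["t1", "t2", "t3"]);
    assert_eq!(recent.to_vec(), vec!["t4", "t5"]);
}
```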

Trace: Job Tool Coverage

Covers: tools/builtin/job.rs (+308 lines), tools/builtin/skill_tools.rs (+110 lines)

Test job and skill management tools through agent execution.

Fixture: job_and_skill_tools.json

Tests:

test_create_and_list_jobs -- create job, list shows it
test_job_status_query -- query status of running job
test_skill_list_and_search -- list local skills, search registry

Trace: Memory Tools

Covers: tools/builtin/memory.rs (~20 lines), workspace/ (+109 lines)

Test memory operations through agent tool calls.

Fixture: memory_tools.json

Tests:

test_memory_write_and_search -- write doc, search finds it
test_memory_read_by_path -- read specific document
test_memory_tree -- list memory filesystem structure

Trace: Extension Management

Covers: tools/builtin/extension_tools.rs (~40 lines)

Test extension lifecycle via agent tool calls.

Fixture: extension_management.json

Tests:

test_extension_install_via_tool -- agent installs an extension
test_extension_auth_via_tool -- agent configures auth
test_extension_activate_via_tool -- agent activates extension

Trace: Self-Repair

Covers: agent/self_repair.rs (~40 lines)

Test stuck job detection and recovery.

Fixture: self_repair.json

Tests:

test_stuck_job_detected -- job stuck for > threshold triggers repair
test_stuck_job_recovered -- recovery restarts job successfully
test_stuck_job_fails_permanently -- recovery fails, job marked failed
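
The three outcomes above (not stuck, restart, mark failed) can be expressed as one decision function with the idle time passed in, which keeps the tests free of real timers. This is a sketch under assumptions: `RepairAction`, `repair_action`, and the idea of a maximum attempt count are hypothetical and may not match agent/self_repair.rs.

```rust
use std::time::Duration;

#[derive(Debug, PartialEq, Eq)]
pub enum RepairAction {
    /// Job is not stuck; leave it alone.
    Wait,
    /// Job is stuck and recovery attempts remain.
    Restart,
    /// Job is stuck and recovery attempts are exhausted.
    MarkFailed,
}

/// Decide what to do with a job that has been idle for `idle`.
pub fn repair_action(
    idle: Duration,
    threshold: Duration,
    attempts: u32,
    max_attempts: u32,
) -> RepairAction {
    if idle <= threshold {
        RepairAction::Wait
    } else if attempts < max_attempts {
        RepairAction::Restart
    } else {
        RepairAction::MarkFailed
    }
}

#[test]
fn test_stuck_job_detected() {
    let threshold = Duration::from_secs(300);
    assert_eq!(repair_action(Duration::from_secs(60), threshold, 0, 3), RepairAction::Wait);
    assert_eq!(repair_action(Duration::from_secs(600), threshold, 0, 3), RepairAction::Restart);
}

#[test]
fn test_stuck_job_fails_permanently() {
    let threshold = Duration::from_secs(300);
    assert_eq!(repair_action(Duration::from_secs(600), threshold, 3, 3), RepairAction::MarkFailed);
}
```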

Trace: Heartbeat

Covers: agent/heartbeat.rs (+80 lines)

Test periodic proactive execution.

Fixture: heartbeat.json

Tests:

test_heartbeat_periodic_fire -- heartbeat triggers at interval
test_heartbeat_reads_checklist -- reads HEARTBEAT.md, processes items
test_heartbeat_notification -- sends notification on findings

Tier 3 -- Web/Channel Handler Tests (~4,500 lines)

Test HTTP handlers and SSE/WS endpoints using axum_test or tower::ServiceExt::oneshot with a real router and in-memory database.

`src/channels/web/server.rs` -- 50% -> 95% (+893 lines)

The single biggest web gap. 40+ API endpoints.

Tests to write:

test_api_health -- GET /health returns 200
test_api_chat_submit -- POST /api/chat sends message
test_api_jobs_list -- GET /api/jobs returns job list
test_api_jobs_create -- POST /api/jobs creates job
test_api_routines_crud -- full CRUD cycle for routines
test_api_settings_get_set -- GET/PUT settings
test_api_memory_search -- POST /api/memory/search
test_api_extensions_list -- GET /api/extensions
test_api_skills_list -- GET /api/skills
test_api_sse_connect -- SSE stream connects and receives events
test_api_auth_required -- endpoints reject missing/bad tokens
test_api_cors_headers -- verify CORS configuration

`src/channels/web/handlers/chat.rs` -- 26.1% -> 95% (+388 lines)

Chat message submission and SSE streaming.

Tests to write:

test_chat_submit_message -- submit message, receive response
test_chat_sse_stream -- verify SSE event format
test_chat_thread_context -- messages scoped to thread
test_chat_invalid_payload -- reject malformed requests

`src/channels/web/handlers/jobs.rs` -- 11.1% -> 95% (+430 lines)

Job CRUD endpoints.

Tests to write:

test_jobs_list_empty -- empty list returns []
test_jobs_create_and_get -- create, then GET by ID
test_jobs_cancel -- cancel running job
test_jobs_filter_by_status -- filter by pending/running/completed
test_jobs_pagination -- limit/offset parameters
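
The limit/offset behaviour in test_jobs_pagination is worth pinning down at the edges (offset past the end, limit larger than the remainder). A minimal sketch of slice-based pagination follows; the helper name `paginate` is hypothetical, and the real handler may paginate in SQL instead.

```rust
/// Return at most `limit` items starting at `offset`, never panicking
/// when the offset or limit runs past the end of the slice.
pub fn paginate<T>(items: &[T], offset: usize, limit: usize) -> &[T] {
    let start = offset.min(items.len());
    let end = start.saturating_add(limit).min(items.len());
    &items[start..end]
}

#[test]
fn test_jobs_pagination() {
    let jobs = [1, 2, 3, 4, 5];
    assert_eq!(paginate(&jobs, 1, 2).to_vec(), vec![2, 3]);
    assert_eq!(paginate(&jobs, 4, 10).to_vec(), vec![5]); // limit clipped
    assert!(paginate(&jobs, 10, 2).is_empty()); // offset past the end
}
```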

`src/channels/web/handlers/routines.rs` -- 0% -> 95% (+236 lines)

Routine CRUD endpoints.

Tests to write:

test_routines_create -- POST creates routine
test_routines_list -- GET lists all routines
test_routines_update -- PUT updates routine config
test_routines_delete -- DELETE removes routine
test_routines_history -- GET history for a routine

`src/channels/web/handlers/extensions.rs` -- 0% -> 95% (+129 lines)

Extension management endpoints.

Tests to write:

test_extensions_list -- list installed extensions
test_extensions_install -- install from manifest URL
test_extensions_activate -- activate/deactivate toggle
test_extensions_remove -- remove installed extension

`src/channels/web/handlers/memory.rs` -- 0% -> 95% (+110 lines)

Memory/workspace endpoints.

Tests to write:

test_memory_search -- search returns ranked results
test_memory_write -- write a document
test_memory_read -- read by path
test_memory_tree -- tree returns filesystem structure

`src/channels/web/handlers/settings.rs` -- 0% -> 95% (+103 lines)

Settings endpoints.

Tests to write:

test_settings_get -- retrieve current settings
test_settings_update -- update individual setting
test_settings_validation -- reject invalid setting values

`src/channels/web/handlers/static_files.rs` -- 0% -> 95% (+97 lines)

Static file serving.

Tests to write:

test_static_index_html -- GET / serves index.html
test_static_css_js -- serve CSS/JS with correct content types
test_static_404 -- missing file returns 404

`src/channels/wasm/wrapper.rs` -- 58.2% -> 95% (+822 lines)

WASM channel wrapper (message routing, lifecycle).

Tests to write:

test_wasm_channel_start -- initialize WASM channel module
test_wasm_channel_message_routing -- route incoming message to WASM
test_wasm_channel_response -- return WASM response to caller
test_wasm_channel_error_handling -- handle WASM trap gracefully
test_wasm_channel_lifecycle -- start, process, shutdown

`src/channels/wasm/loader.rs` -- 38.1% -> 95% (+141 lines)

WASM channel discovery.

Tests to write:

test_channel_loader_scan -- find channel WASM modules
test_channel_loader_validation -- reject invalid modules
test_channel_loader_manifest -- parse channel capabilities

`src/channels/wasm/storage.rs` -- 0% -> 95% (+172 lines)

WASM channel state persistence.

Tests to write:

test_channel_storage_save_load -- persist and restore channel state
test_channel_storage_isolation -- per-channel state isolation
test_channel_storage_cleanup -- remove state on channel uninstall

`src/channels/signal.rs` -- 74% -> 95% (+381 lines)

Signal protocol channel.

Tests to write:

test_signal_message_send -- send encrypted message
test_signal_message_receive -- decrypt incoming message
test_signal_attachment_handling -- handle media attachments
test_signal_group_message -- group chat routing
test_signal_error_handling -- handle connection failures

`src/channels/repl.rs` -- 0% -> 95% (+221 lines)

Simple REPL channel.

Tests to write:

test_repl_input_parsing -- parse user input lines
test_repl_output_formatting -- format agent responses
test_repl_multiline -- handle multi-line input
test_repl_special_commands -- handle /quit, /help
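
Input parsing is the part of the REPL that needs no terminal. The sketch below assumes the REPL recognises `/quit` and `/help` and treats everything else as a message; the `ReplInput` enum and `parse_repl_line` function are hypothetical names.

```rust
/// Hypothetical classification of one line of REPL input.
#[derive(Debug, PartialEq, Eq)]
pub enum ReplInput<'a> {
    Empty,
    Quit,
    Help,
    Message(&'a str),
}

/// Classify a raw input line after trimming surrounding whitespace.
pub fn parse_repl_line(line: &str) -> ReplInput<'_> {
    match line.trim() {
        "" => ReplInput::Empty,
        "/quit" => ReplInput::Quit,
        "/help" => ReplInput::Help,
        text => ReplInput::Message(text),
    }
}

#[test]
fn test_repl_special_commands() {
    assert_eq!(parse_repl_line("/quit\n"), ReplInput::Quit);
    assert_eq!(parse_repl_line("  /help  "), ReplInput::Help);
}

#[test]
fn test_repl_input_parsing() {
    assert_eq!(parse_repl_line("\n"), ReplInput::Empty);
    assert_eq!(parse_repl_line("hello there\n"), ReplInput::Message("hello there"));
}
```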

Tier 4 -- CLI Tests (~2,100 lines)

CLI subcommands can be tested by invoking clap-parsed command structs directly or by calling the handler functions with constructed arguments.

`src/cli/tool.rs` -- 2.9% -> 95% (+697 lines)

Tool CLI (install, list, remove, build).

Tests to write:

test_cli_tool_list -- list installed tools
test_cli_tool_install_local -- install from local .wasm file
test_cli_tool_install_registry -- install from registry
test_cli_tool_remove -- remove installed tool
test_cli_tool_build -- scaffold and build tool project
test_cli_tool_info -- display tool details

`src/cli/mcp.rs` -- 0.9% -> 95% (+302 lines)

MCP server management CLI.

Tests to write:

test_cli_mcp_list -- list configured MCP servers
test_cli_mcp_add -- add MCP server config
test_cli_mcp_remove -- remove MCP server config
test_cli_mcp_tools -- list tools from MCP server
test_cli_mcp_test_connection -- verify MCP server reachable

`src/cli/oauth_defaults.rs` -- 54.1% -> 95% (+298 lines)

OAuth default configurations.

Tests to write:

test_oauth_defaults_loading -- load default OAuth configs
test_oauth_url_construction -- build auth/token URLs
test_oauth_scope_merging -- merge requested scopes with defaults
test_oauth_provider_lookup -- lookup by provider name

`src/cli/registry.rs` -- 0% -> 95% (+168 lines)

Registry CLI commands.

Tests to write:

test_cli_registry_search -- search for packages
test_cli_registry_install -- install package from registry
test_cli_registry_info -- display package details

`src/cli/status.rs` -- 0% -> 95% (+142 lines)

Status display commands.

Tests to write:

test_cli_status_gathering -- collect system status info
test_cli_status_formatting -- render status output
test_cli_status_components -- check individual components

`src/cli/memory.rs` -- 15.5% -> 95% (+138 lines)

Memory CLI subcommands.

Tests to write:

test_cli_memory_search -- search workspace from CLI
test_cli_memory_write -- write document from CLI
test_cli_memory_read -- read document from CLI
test_cli_memory_tree -- display memory tree

`src/cli/doctor.rs` -- 28.7% -> 95% (+115 lines)

Diagnostic checks.

Tests to write:

test_doctor_check_database -- verify DB connectivity check
test_doctor_check_llm -- verify LLM provider check
test_doctor_check_tools -- verify tool availability check
test_doctor_report_format -- verify output format

`src/cli/config.rs` -- 36.5% -> 95% (~100 lines)

Config CLI subcommands.

Tests to write:

test_cli_config_get -- read config value
test_cli_config_set -- write config value
test_cli_config_list -- list all config keys
test_cli_config_reset -- reset to defaults

Tier 5 -- Setup/Infra Tests (~2,400 lines)

Hardest to test: interactive wizards, Docker, and process spawning. Strategy: extract pure logic into testable functions, and test the interactive parts by injecting mock input.

`src/setup/wizard.rs` -- 16.8% -> 95% (+1,681 lines)

7-step interactive onboarding wizard. Refactor to extract validation functions, step logic, and config generation into testable units.

Tests to write:

test_wizard_step_validation -- each step validates input correctly
test_wizard_config_generation -- generate config from wizard answers
test_wizard_default_values -- verify sensible defaults
test_wizard_skip_completed -- skip already-configured steps
test_wizard_llm_backend_selection -- provider-specific config paths
test_wizard_channel_setup -- channel configuration logic

`src/setup/channels.rs` -- 7.6% -> 95% (+563 lines)

Channel setup helpers.

Tests to write:

test_channel_setup_defaults -- default channel configuration
test_channel_setup_validation -- reject invalid channel configs
test_channel_setup_telegram -- Telegram-specific setup logic
test_channel_setup_signal -- Signal-specific setup logic
test_channel_setup_webhook -- webhook URL validation

`src/setup/prompts.rs` -- 24.8% -> 95% (+147 lines)

Terminal prompt utilities.

Tests to write:

test_prompt_select -- selection from list
test_prompt_confirm -- yes/no confirmation
test_prompt_secret -- masked input
test_prompt_validation -- input validation rules
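
Injecting mock input is straightforward when a prompt function is generic over its reader and writer instead of using stdin and stdout directly. The sketch below shows the pattern with a hypothetical `prompt_confirm`; the signature and the `[Y/n]` hint format are assumptions, not the existing API in prompts.rs.

```rust
use std::io::{self, BufRead, Write};

/// Ask a yes/no question. Empty or unrecognized input returns `default`.
pub fn prompt_confirm<R: BufRead, W: Write>(
    input: &mut R,
    output: &mut W,
    question: &str,
    default: bool,
) -> io::Result<bool> {
    let hint = if default { "Y/n" } else { "y/N" };
    write!(output, "{question} [{hint}] ")?;
    output.flush()?;

    let mut line = String::new();
    input.read_line(&mut line)?;
    Ok(match line.trim().to_ascii_lowercase().as_str() {
        "y" | "yes" => true,
        "n" | "no" => false,
        _ => default,
    })
}

#[test]
fn test_prompt_confirm() {
    // A Cursor over a string stands in for stdin; a Vec<u8> captures output.
    let mut input = io::Cursor::new("yes\n");
    let mut output: Vec<u8> = Vec::new();
    let answer = prompt_confirm(&mut input, &mut output, "Enable Signal?", false).unwrap();
    assert!(answer);
    assert_eq!(String::from_utf8(output).unwrap(), "Enable Signal? [y/N] ");
}
```

In production code the same function would be called with a locked stdin and stdout, so the tested path and the interactive path share one implementation.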

`src/sandbox/container.rs` -- 22.1% -> 95% (+296 lines)

Docker container lifecycle. Test command construction without actual Docker.

Tests to write:

test_container_config_to_docker_args -- generate correct docker run args
test_container_volume_mounts -- workspace mount configuration
test_container_env_scrubbing -- sensitive env vars removed
test_container_resource_limits -- CPU/memory limit args
test_container_network_config -- proxy network setup
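
Command construction can be verified by having one function return the argument vector and asserting on it, with no Docker daemon involved. The flags used below (`--rm`, `--memory`, `--cpus`, `--network`, `-v`, `-e`) are standard `docker run` options; the `ContainerConfig` fields, the `/workspace` mount target, and the name-based scrubbing rule are assumptions for illustration.

```rust
/// Hypothetical container settings; field names are illustrative.
pub struct ContainerConfig {
    pub image: String,
    pub memory_mb: u64,
    pub cpus: u32,
    pub workspace_host_path: String,
    pub network: String,
    pub env: Vec<(String, String)>,
}

/// Substrings that mark an environment variable as sensitive (assumed rule).
const SENSITIVE_MARKERS: [&str; 3] = ["TOKEN", "SECRET", "KEY"];

/// Build the argument list for `docker run`, scrubbing sensitive env vars.
pub fn docker_run_args(cfg: &ContainerConfig) -> Vec<String> {
    let mut args = vec![
        "run".to_owned(),
        "--rm".to_owned(),
        "--memory".to_owned(),
        format!("{}m", cfg.memory_mb),
        "--cpus".to_owned(),
        cfg.cpus.to_string(),
        "--network".to_owned(),
        cfg.network.clone(),
        "-v".to_owned(),
        format!("{}:/workspace", cfg.workspace_host_path),
    ];
    for (key, value) in &cfg.env {
        let upper = key.to_ascii_uppercase();
        if SENSITIVE_MARKERS.iter().any(|marker| upper.contains(*marker)) {
            continue; // never forward credentials into the container
        }
        args.push("-e".to_owned());
        args.push(format!("{key}={value}"));
    }
    args.push(cfg.image.clone()); // image goes last, after all options
    args
}

#[test]
fn test_container_env_scrubbing() {
    let cfg = ContainerConfig {
        image: "worker:latest".to_owned(),
        memory_mb: 512,
        cpus: 2,
        workspace_host_path: "/tmp/ws".to_owned(),
        network: "sandbox-net".to_owned(),
        env: vec![
            ("RUST_LOG".to_owned(), "info".to_owned()),
            ("API_TOKEN".to_owned(), "s3cr3t".to_owned()),
        ],
    };
    let args = docker_run_args(&cfg);
    assert!(args.contains(&"RUST_LOG=info".to_owned()));
    assert!(!args.iter().any(|arg| arg.contains("API_TOKEN")));
}
```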

`src/sandbox/manager.rs` -- 59% -> 95% (+114 lines)

Sandbox orchestration.

Tests to write:

test_sandbox_policy_enforcement -- policy to container config mapping
test_sandbox_cleanup -- cleanup on job completion
test_sandbox_concurrent_limit -- enforce max concurrent containers

`src/sandbox/proxy/http.rs` -- 37.5% -> 95% (+176 lines)

HTTP proxy for container network access.

Tests to write:

test_proxy_allowlist_enforcement -- block disallowed domains
test_proxy_credential_injection -- inject auth headers
test_proxy_connect_tunnel -- HTTPS CONNECT method handling
test_proxy_logging -- request/response logging
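
Allowlist enforcement has a classic pitfall: a plain suffix check would let `evilexample.com` match `example.com`. The sketch below matches either the exact host or a true subdomain. The function name and the matching policy are assumptions about what proxy/http.rs intends, not a copy of it.

```rust
/// True if `host` equals an allowlist entry or is a subdomain of one.
/// Comparison is case-insensitive and ignores a trailing dot on the host.
pub fn host_allowed(host: &str, allowlist: &[&str]) -> bool {
    let host = host.trim_end_matches('.').to_ascii_lowercase();
    allowlist.iter().any(|entry| {
        let entry = entry.to_ascii_lowercase();
        // Require a dot before the entry so only real subdomains match.
        host == entry || host.ends_with(format!(".{entry}").as_str())
    })
}

#[test]
fn test_proxy_allowlist_enforcement() {
    let allow = ["example.com"];
    assert!(host_allowed("example.com", &allow));
    assert!(host_allowed("API.Example.com", &allow));
    assert!(!host_allowed("evilexample.com", &allow));
    assert!(!host_allowed("example.com.evil.org", &allow));
}
```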

`src/worker/container.rs` -- 5.7% -> 95% (+312 lines)

Worker execution loop (runs inside containers).

Tests to write:

test_worker_tool_dispatch -- dispatch tool call, return result
test_worker_llm_interaction -- send prompt, receive response
test_worker_turn_limit -- enforce max turns
test_worker_error_propagation -- tool error surfaces to agent

`src/worker/claude_bridge.rs` -- 60.7% -> 95% (+215 lines)

Claude CLI bridge.

Tests to write:

test_claude_command_construction -- build claude CLI command
test_claude_output_parsing -- parse claude CLI JSON output
test_claude_error_handling -- handle CLI crashes gracefully
test_claude_config_injection -- inject config dir and model

`src/worker/api.rs` -- 19.8% -> 95% (+194 lines)

Worker HTTP client to orchestrator.

Tests to write:

test_worker_api_request_building -- correct endpoint URLs and headers
test_worker_api_response_parsing -- parse orchestrator responses
test_worker_api_auth_token -- bearer token injection
test_worker_api_retry -- retry on transient failures
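
Retry logic is easiest to test when it is separated from the HTTP call. The sketch below is a synchronous, generic retry helper with no backoff or sleeping, so the test runs instantly; the real client is presumably async and would add delays. The `retry` function and its parameters are hypothetical.

```rust
/// Run `op` until it succeeds, a non-transient error occurs, or
/// `max_attempts` attempts have been made. `op` always runs at least once.
pub fn retry<T, E>(
    max_attempts: u32,
    mut is_transient: impl FnMut(&E) -> bool,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Retry only transient errors while attempts remain.
            Err(err) if attempt < max_attempts && is_transient(&err) => attempt += 1,
            Err(err) => return Err(err),
        }
    }
}

#[test]
fn test_worker_api_retry() {
    let mut calls = 0;
    let result: Result<u32, &str> = retry(
        3,
        |err: &&str| *err == "timeout",
        || {
            calls += 1;
            if calls < 3 { Err("timeout") } else { Ok(42) }
        },
    );
    assert_eq!(result, Ok(42));
    assert_eq!(calls, 3);
}
```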

`src/main.rs` -- 29.4% -> 95% (+485 lines)

Entry point and startup. Extract startup logic into testable functions.

Tests to write:

test_cli_arg_parsing -- verify clap argument parsing
test_startup_config_loading -- config from env + file
test_startup_channel_selection -- select channels from config
test_startup_feature_flags -- feature-gated code paths

Tier 6 -- Remaining Files to 95% (~2,000 lines)

Smaller files that each need a handful of additional tests.

File	Lines Needed	Test Focus
`src/tools/builtin/skill_tools.rs`	110	skill_list, skill_search, skill_install, skill_remove
`src/hooks/bundled.rs`	115	bundled hook execution, hook discovery
`src/registry/installer.rs`	272	package download, verification, installation
`src/registry/artifacts.rs`	72	artifact packaging, checksums
`src/orchestrator/job_manager.rs`	249	container lifecycle, job routing
`src/orchestrator/api.rs`	125	LLM proxy, event dispatch endpoints
`src/app.rs`	137	AppBuilder configuration, startup sequence
`src/service.rs`	120	service lifecycle, signal handling
`src/config/channels.rs`	55	channel config parsing
`src/config/sandbox.rs`	61	sandbox config parsing
`src/config/tunnel.rs`	43	tunnel config parsing
`src/config/mod.rs`	63	config merging, env override
`src/config/database.rs`	38	database URL parsing
`src/evaluation/success.rs`	34	success evaluator logic
`src/evaluation/metrics.rs`	40	metrics collection
`src/context/manager.rs`	57	concurrent job context isolation
`src/context/memory.rs`	36	action recording, conversation memory

Execution Priority

Maximize coverage gain per unit of effort:

Order	Category	Lines Gained	Effort
1	Trace tests (Tier 2)	~7,000	Medium (high leverage, each test covers many modules)
2	Unit tests for 0% files (Tier 1 subset)	~3,500	Low (pure logic, no infrastructure)
3	Web handler tests (Tier 3)	~4,500	Medium (axum_test + in-memory DB)
4	Extension/MCP/WASM unit tests (Tier 1 remainder)	~3,500	Medium
5	CLI subcommand tests (Tier 4)	~2,100	Low-Medium
6	Setup wizard extraction + tests (Tier 5)	~2,400	High (requires refactoring)
7	LLM provider tests (Tier 1 subset)	~800	Medium
8	Remaining small files (Tier 6)	~2,000	Low

Notes

All trace tests require --features libsql and use TestRigBuilder from tests/support/
Web handler tests can use axum_test helpers or build the router directly and drive it with tower::ServiceExt::oneshot
CLI tests should call handler functions directly, not shell out to the binary
Setup wizard tests require extracting pure logic from interactive prompts first
Sandbox/container tests should verify command construction, not run Docker
Worker tests can use TraceLlm for the LLM provider, same as trace tests

IronClaw Coverage Plan: 63.3% to 95%

IronClaw Coverage Plan: 63.3% to 95%

Current State

Module Summary

Top 40 Files by Uncovered Lines

Tier 1 -- High-Impact Unit Tests (~8,500 lines)

src/history/store.rs -- 0% -> 95% (+1,411 lines)

src/history/analytics.rs -- 0% -> 95% (+133 lines)

src/extensions/manager.rs -- 13.3% -> 95% (+1,962 lines)

src/extensions/discovery.rs -- 27.8% -> 95% (+125 lines)

src/tools/builder/core.rs -- 13% -> 95% (+429 lines)

src/tools/builder/testing.rs -- 43.5% -> 95% (+158 lines)

src/tools/mcp/auth.rs -- 19.9% -> 95% (+354 lines)

src/tools/mcp/client.rs -- 14.7% -> 95% (+273 lines)

src/tools/wasm/storage.rs -- 17.9% -> 95% (+228 lines)

src/tools/wasm/wrapper.rs -- 56.6% -> 95% (+385 lines)

src/tools/wasm/loader.rs -- 62.4% -> 95% (+156 lines)

src/tools/builtin/job.rs -- 64.6% -> 95% (+308 lines)

src/secrets/store.rs -- 48.1% -> 95% (+145 lines)

src/llm/session.rs -- 41.4% -> 95% (+221 lines)

src/llm/nearai_chat.rs -- 60.2% -> 95% (+297 lines)

src/llm/mod.rs -- 53.7% -> 95% (+112 lines)

src/llm/reasoning.rs -- 81.2% -> 95% (+160 lines)

src/db/postgres.rs -- 0% -> 95% (+157 lines)

src/workspace/mod.rs -- 75.9% -> 95% (+109 lines)

src/workspace/embeddings.rs -- 35.1% -> 95% (~100 lines)

Tier 2 -- Trace Tests (~7,000 lines)

Trace: Thread Operations

Trace: Agent Commands

Trace: Worker Multi-Turn Execution

Trace: Scheduler Parallel Jobs

Trace: Dispatcher Skill Selection

Trace: Routine Execution

Trace: Compaction and Context Pressure

Trace: Job Tool Coverage

Trace: Memory Tools

Trace: Extension Management

Trace: Self-Repair

Trace: Heartbeat

Tier 3 -- Web/Channel Handler Tests (~4,500 lines)

src/channels/web/server.rs -- 50% -> 95% (+893 lines)

src/channels/web/handlers/chat.rs -- 26.1% -> 95% (+388 lines)

src/channels/web/handlers/jobs.rs -- 11.1% -> 95% (+430 lines)

src/channels/web/handlers/routines.rs -- 0% -> 95% (+236 lines)

src/channels/web/handlers/extensions.rs -- 0% -> 95% (+129 lines)

src/channels/web/handlers/memory.rs -- 0% -> 95% (+110 lines)

src/channels/web/handlers/settings.rs -- 0% -> 95% (+103 lines)

src/channels/web/handlers/static_files.rs -- 0% -> 95% (+97 lines)

src/channels/wasm/wrapper.rs -- 58.2% -> 95% (+822 lines)

src/channels/wasm/loader.rs -- 38.1% -> 95% (+141 lines)

src/channels/wasm/storage.rs -- 0% -> 95% (+172 lines)

src/channels/signal.rs -- 74% -> 95% (+381 lines)

src/channels/repl.rs -- 0% -> 95% (+221 lines)

Tier 4 -- CLI Tests (~2,100 lines)

src/cli/tool.rs -- 2.9% -> 95% (+697 lines)

src/cli/mcp.rs -- 0.9% -> 95% (+302 lines)

src/cli/oauth_defaults.rs -- 54.1% -> 95% (+298 lines)

src/cli/registry.rs -- 0% -> 95% (+168 lines)

src/cli/status.rs -- 0% -> 95% (+142 lines)

src/cli/memory.rs -- 15.5% -> 95% (+138 lines)

src/cli/doctor.rs -- 28.7% -> 95% (+115 lines)

src/cli/config.rs -- 36.5% -> 95% (+~100 lines)

Tier 5 -- Setup/Infra Tests (~2,400 lines)

src/setup/wizard.rs -- 16.8% -> 95% (+1,681 lines)

src/setup/channels.rs -- 7.6% -> 95% (+563 lines)

src/setup/prompts.rs -- 24.8% -> 95% (+147 lines)

src/sandbox/container.rs -- 22.1% -> 95% (+296 lines)

src/sandbox/manager.rs -- 59% -> 95% (+114 lines)

src/sandbox/proxy/http.rs -- 37.5% -> 95% (+176 lines)

src/worker/container.rs -- 5.7% -> 95% (+312 lines)

src/worker/claude_bridge.rs -- 60.7% -> 95% (+215 lines)

src/worker/api.rs -- 19.8% -> 95% (+194 lines)

src/main.rs -- 29.4% -> 95% (+485 lines)

Tier 6 -- Remaining Files to 95% (~2,000 lines)

Execution Priority

Notes
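
The `(+N lines)` figures in the per-file entries appear to follow from each file's line and miss counts in the Top 40 table: the misses that may remain at 95% are 5% of the file's lines, rounded up, and every miss beyond that has to be covered. That rounding rule is inferred from the table rows rather than stated by the plan, and `lines_to_target` is a hypothetical helper name. A minimal Rust sketch under those assumptions:

```rust
/// Number of currently missed lines that must gain coverage for a file to
/// reach `target_pct` percent line coverage.
///
/// Assumption: the allowed misses are rounded up, which matches the
/// "Lines to 95%" column for the rows checked in `main`.
fn lines_to_target(total_lines: u64, missed_lines: u64, target_pct: u64) -> u64 {
    assert!(target_pct <= 100, "target must be a percentage");
    // Ceiling of total_lines * (100 - target_pct) / 100, in integer arithmetic.
    let allowed_misses = (total_lines * (100 - target_pct) + 99) / 100;
    // A file already at or above the target needs nothing more.
    missed_lines.saturating_sub(allowed_misses)
}

fn main() {
    // (file, lines, misses) taken from the Top 40 table.
    let rows = [
        ("src/extensions/manager.rs", 2_404, 2_083), // table says 1,962
        ("src/channels/web/server.rs", 1_985, 993),  // table says 893
        ("src/main.rs", 740, 522),                   // table says 485
    ];
    for (file, lines, miss) in rows {
        println!("{}: +{} lines", file, lines_to_target(lines, miss, 95));
    }
}
```

Applied to a module that is already past the target, such as `observability/` (316 lines, 9 misses), the same function returns 0.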
