packages/cua-driver/rust/docs/dedup-audit.md
Date: 2026-05-23 evening (overnight session)
Branch: cross-platform-dedup-audit (off main @ 7436ba89)
Approach: scan crates/platform-{macos,windows,linux} for function signatures + struct definitions that appear in 2+ platforms with the same shape, rank by ROI (lines saved vs refactoring risk).
Two cross-platform crates already in the workspace:
| Crate | What it owns | Notes |
|---|---|---|
mcp-server | Tool trait, ToolDef, page tool + PageBackend trait, shared CDP helper, MCP protocol structs | PR #1666 added the cross-platform page tool. Right home for tool-side shared logic. |
cursor-overlay | CursorConfig / CursorInstanceConfig / CursorRegistry / Palette / MotionConfig / Shape / PathPlanner / Bezier / rotate_toward / crop_png_to_jpeg | PR #1662 already extracted rotate_toward. Right home for cursor-overlay shared primitives. |
The duplication:
| Function | macOS | Windows | Linux |
|---|---|---|---|
png_bytes_to_jpeg(png_bytes, quality) | ✓ 24 lines | ✓ 13 lines | ✓ 13 lines |
resize_png_if_needed(png_bytes, max_dim) | ✓ 45 lines | ✓ 22 lines | ✓ 20 lines |
crosshair_png_bytes(png_bytes, cx, cy) | ✓ 10 lines | ✓ 34 lines | ✓ varies |
write_crosshair_png(...) (macOS only — wraps crosshair_png_bytes + writes to disk) | ✓ | — | — |
png_dimensions(data) | ✓ 12 lines | ✓ 8 lines | ✓ 9 lines |
write_uncompressed_png(rgba, w, h) | (uses CGImage) | ✓ 44 lines | ✓ 20 lines |
write_png_chunk(out, name, data) | (CGImage) | ✓ 8 lines | ✓ 6 lines |
zlib_store(data) | (CGImage) | ✓ 23 lines | ✓ 18 lines |
adler32(data) | (CGImage) | ✓ 9 lines | ✓ 5 lines |
crc32_ieee(name, data) | (CGImage) | ✓ ~25 lines | ? |
What stays per-platform:
screenshot_window_bytes(window_id_or_hwnd_or_xid) -> Vec<u8> — calls CGImage on macOS / BitBlt+PrintWindow on Windows / XGetImage on Linuxscreenshot_display_bytes() -> Vec<u8> — sameEstimated savings: ~400 lines across 3 platforms. Pure refactor with zero behavioural change. The image crate is already a workspace dependency (used by resize_png_if_needed's lanczos resampler).
Extraction target: new module mcp_server::image_utils (in crates/mcp-server/src/image_utils.rs). All callers already depend on mcp-server.
Risk: very low — pure functions, no platform-specific deps, no FFI.
The duplication:
| Item | macOS overlay.rs | Windows overlay.rs | Linux overlay.rs |
|---|---|---|---|
struct RenderState { ... } | 42 fields | 24 fields | 22 fields |
struct Spring { ox, oy, vx, vy } | identical | identical | identical |
impl RenderState::new(cfg) | ~25 lines | ~25 lines | ~25 lines |
impl RenderState::tick(dt) | ~50 lines (path planner + spring physics) | ~50 lines (same) | ~50 lines (same) |
impl RenderState::apply_command(cmd) | ~80 lines (OverlayCommand match arms) | ~80 lines (same) | ~80 lines (same) |
fn render_frame(rs) -> Pixmap | ~135 lines tiny_skia | ~95 lines (same logic, fewer features) | ~78 lines (same) |
fn draw_default_arrow(...) | ~175 lines tiny_skia path | ~63 lines (same path, simpler) | ~52 lines (same) |
fn rotate_toward | ✓ already extracted to cursor-overlay | ✓ already extracted | ✓ already extracted |
What stays per-platform:
dispatch_set_layer_contents (macOS) / update_layered_window (Windows) / paint_x11 (Linux) — platform-specific paint plumbingEstimated savings: ~1800 lines across 3 platforms (~600 per platform). This is the biggest absolute win in the codebase.
Extraction target: cursor-overlay crate. Add RenderState + Spring + tick + apply_command + render_frame + draw_default_arrow to cursor-overlay::lib. Each platform's overlay.rs becomes a thin "platform paint loop" wrapper that:
cursor_overlay::RenderState::tick(dt) → cursor_overlay::render_frame(rs) → platform paintRisk: medium. The RenderState field sets differ slightly per platform (virt_x/y/w/h on Windows are Win32-DIP-specific; macOS uses CGFloat in NSScreen coords). Need to either:
RenderState with platform-agnostic virt_bounds: (f64, f64, f64, f64) and let each platform convertcore: RenderStateCore (shared) + platform: WindowsRenderState { core, virt_w_dip } (specific)Suggested approach: the split — RenderStateCore shared, per-platform wrapper structs that hold the platform-specific extras.
The duplication:
| Module | Size | Pattern |
|---|---|---|
platform-macos/src/ax/cache.rs | 82 lines | Per-pid LRU keyed on (pid, window_id) holding Vec<AXElement> |
platform-windows/src/uia/cache.rs | 78 lines | Same pattern, holds Vec<IUIAutomationElement> (refcounted) |
platform-linux/src/atspi/cache.rs | 45 lines | Same pattern, holds AT-SPI accessibles |
Estimated savings: ~150 lines across 3 platforms.
Extraction target: mcp-server::element_cache — generic ElementCache<T> over the platform-specific element type. Each platform inserts its own T.
Risk: medium — the cache's lifetime semantics differ (macOS AXUIElementRef is CFType-refcounted; Windows IUIAutomationElement is COM-refcounted; Linux uses raw pointers). Generic-T cache handles this cleanly with Drop on T.
Every tool in every platform repeats:
let pid = match args.get("pid").and_then(|v| v.as_i64()) {
Some(p) => p as i32,
None => return ToolResult::error("Missing required integer field pid."),
};
let window_id = ...;
let element_index = ...;
The duplication: ~30 tools × 3 platforms × ~5 fields per tool = ~450 small repetitions, often differing by 1-2 chars in the error message.
Estimated savings: ~600 lines, but each one is 2-4 lines so individually small.
Extraction target: mcp-server::tool_args helper:
pub trait ArgsExt {
fn pid(&self) -> Result<i32, ToolResult>;
fn window_id(&self) -> Result<u32, ToolResult>;
fn element_index(&self) -> Result<Option<u64>, ToolResult>;
fn javascript(&self) -> Result<String, ToolResult>;
// etc.
}
Risk: very low — pure refactor of argument parsing. Per-platform tools opt into the helper. The CodeRabbit fix on PR #1666 (i32::try_from instead of as i32) becomes free for every caller.
Each platform implements its own list_apps / list_windows. The RESPONSE shape (the JSON fields cua-driver returns) is normalized via the WindowInfo / AppInfo structs — but those structs themselves are duplicated per platform crate.
Estimated savings: ~50 lines for the struct definitions; doesn't dedupe the actual list_* functions.
Risk: high — WindowInfo on Windows carries an hwnd: u64, on macOS a window_id: u32, on Linux an xid: u64. The field semantics overlap but the types differ; a unified struct needs a typed wrapper.
Recommendation: skip this one unless the response shape changes warrant it. The boilerplate is small relative to the win.
Each platform's tool has its own description string (already verified in PR #1666). These are intentionally per-platform — they describe per-platform behaviour (Windows uses UIA, macOS uses AX, Linux uses AT-SPI). Don't try to dedupe.
All five candidates landed on cross-platform-dedup-audit (PR #1670).
| PR | What | Estimated | Actual | Status |
|---|---|---|---|---|
| #A | mcp-server::image_utils — capture.rs PNG/JPEG/crosshair/resize helpers | ~400 | −399 | ✓ shipped (9c998916 + 921dcdc9) |
| #1b | cursor-overlay::Spring — physics struct | small | small | ✓ shipped (f10a3258) |
| #B | cursor-overlay::render_state — RenderStateCore + tick + apply_command + render_frame + draw_default_arrow | ~1800 | −947 across overlays / −211 net | ✓ shipped (e0336893) |
| #C | mcp-server::tool_args — ArgsExt helper trait | ~600 | −125 consumers + trait/tests | ✓ shipped (f1c0cca1 + 3824b0d7) |
| #2 | mcp-server::element_cache — generic ElementCacheCore<K, S> | ~150 | +14 net | ✓ shipped (f798e3ad) |
Worth recording for future audits — three places the original estimates were off:
MotionConfig. macOS's MoveTo and ClickPulse have sentinel-snap behaviour the others don't. The shared core resolves this with two tick variants (tick_motion vs tick_swift_constants) and a parameterised apply_command_base(snap_mt, click_only). So RenderStateCore is real, but it's a family of behaviours, not one.start_t: Instant was dead code on Windows + Linux (assigned, never read) — not a real shared field, dropped during extraction.mcp-server, no duplication.installed_apps.rs (Linux) / apps/ (macOS) / win32/apps.rs (Windows) — same shape but completely different impls (XDG desktop files / NSWorkspace / Start Menu shortcuts). Sharing the response struct only.list_apps / list_windows response shape — recommended skip (high risk for ~50 lines).