Back to Fresh

Windows Mouse Input: Research and Implementation

docs/internal/windows-mouse-input.md

0.3.223.2 KB
Original Source

Windows Mouse Input: Research and Implementation

Problem Statement

Mouse input on Windows dumps garbled VT escape sequence fragments (like [<35;50;21M) into the editor buffer as literal text instead of being parsed as mouse events.

Root Cause (Updated March 2026)

The corruption under mode 1003 was originally attributed to a conhost.exe 4KB buffer boundary bug (PR #17738). Further investigation by reading Neovim's and libuv's source code revealed that the 4KB bug applies to a different code path (ConPTY pipe input, not ENABLE_VIRTUAL_TERMINAL_INPUT direct reads). The actual root causes are:

  1. wRepeatCount is ignored in vt_input.rs. When the console input buffer fills under heavy mouse movement, Windows coalesces repeated KEY_EVENT records by setting wRepeatCount > 1. Fresh treats each record as a single byte instead of N bytes, causing the VT byte stream to be shorter than expected. Subsequent sequences start at wrong offsets, and the parser sees garbage. libuv handles this correctly (src/win/tty.c:949: if (--KEV.wRepeatCount > 0) { ... continue; }).

  2. Mode 1003 generates extreme event volume. Each mouse pixel movement produces an SGR sequence like \x1b[<35;120;45M — approximately 15 KEY_EVENT records. Rapid mouse movement generates thousands of records per second. Between read calls (while processing, allocating, sending through mpsc channel), the console input buffer can overflow, causing the terminal emulator to drop events or partially write sequences.

Previous (incorrect) root cause theory

The conhost 4KB buffer boundary bug (PR #17738) was initially blamed. That bug is real but applies to ConPTY pipe input (VtInputThreadStateMachine::ProcessString), not to ENABLE_VIRTUAL_TERMINAL_INPUT direct reads (TerminalInput::HandleKey → per-event VT generation). See "Research: Bundling conpty.dll" section below for details.

References

Investigation Timeline

1. Initial hypothesis: double delivery (WRONG)

We first suspected that setting both ENABLE_MOUSE_INPUT and ENABLE_VIRTUAL_TERMINAL_INPUT caused the console to deliver mouse events twice — as MOUSE_EVENT records AND as VT sequences in KEY_EVENT records, interleaving and corrupting the byte stream.

Disproved by logs: with both flags set, the console generated ZERO MOUSE_EVENT records. All mouse input arrived as VT sequences through KEY_EVENT. The garbling was not from interleaving.

2. Hypothesis: flush_timeout dumping partial sequences (PARTIALLY CORRECT)

The InputParser::flush_timeout() was designed to emit a standalone ESC after 50ms of no input. When a VT mouse sequence was split across two ReadConsoleInputW calls, the \x1b sat in the parser buffer and got flushed as standalone ESC before the [<35;...M continuation arrived.

Fix applied: flush_timeout() now only flushes when buffer is exactly [0x1b] (standalone ESC). Partial CSI sequences (\x1b[...) are never flushed. Additionally, adopted Microsoft Edit's pattern: read_timeout() returns 100ms when ESC is buffered, Duration::MAX otherwise. Only flush ESC after poll confirms no more input pending.

Result: reduced garbage but did not eliminate it.

3. Hypothesis: ENABLE_MOUSE_INPUT conflicts with VT tracking (PARTIALLY CORRECT)

With ENABLE_MOUSE_INPUT set alongside ENABLE_VIRTUAL_TERMINAL_INPUT and VT mouse tracking sequences, the console sometimes corrupted VT mouse sequences by dropping their \x1b[ prefix.

Fix applied: removed ENABLE_MOUSE_INPUT from console mode (matching Microsoft Edit). Mouse events arrive solely as VT sequences through KEY_EVENT records once VT mouse tracking (\x1b[?1003;1006h) is enabled.

Result: reduced but did not eliminate corruption.

4. Hypothesis: crossterm's EnableMouseCapture clobbers console mode (CORRECT)

crossterm's EnableMouseCapture on Windows calls SetConsoleMode(ENABLE_MOUSE_MODE) which replaces the entire console mode with ENABLE_MOUSE_INPUT | ENABLE_EXTENDED_FLAGS | ENABLE_WINDOW_INPUT, removing ENABLE_VIRTUAL_TERMINAL_INPUT. It also does NOT write VT tracking sequences (is_ansi_code_supported returns false on Windows).

Fix applied: skip crossterm's EnableMouseCapture/DisableMouseCapture on Windows (#[cfg(not(windows))]). Mouse tracking is handled entirely by win_vt_input::enable_vt_input() + enable_mouse_tracking().

5. Hypothesis: slow drain causes buffer overflow in console (PARTIALLY CORRECT)

Between ReadConsoleInputExW calls, the editor processes events (rendering, plugins, LSP, async messages). During this time, the console's input buffer accumulates VT mouse events. Under high event rates, the 4KB internal buffer overflows, triggering the split/drop bug.

Fix applied: dedicated reader thread (VtInputReader) continuously drains the console buffer via WaitForSingleObject + ReadConsoleInputExW(CONSOLE_READ_NOWAIT) in a tight loop, sending events through an std::sync::mpsc channel. A Windows Event object is signaled on new data for efficient WaitForMultipleObjects integration.

Result: reduced frequency of corruption significantly but did not fully eliminate it.

6. Root cause suspected: conhost 4KB buffer boundary bug (WRONG)

We initially attributed the remaining corruption to the conhost 4KB buffer boundary bug (PR #17738). This was incorrect — see "Root Cause (Updated)" section above. The 4KB bug applies to ConPTY pipe input, not to ENABLE_VIRTUAL_TERMINAL_INPUT direct reads.

7. Actual root cause: wRepeatCount ignored + mode 1003 event flooding (CORRECT)

Analysis of Neovim's source code and libuv's uv_tty Windows implementation revealed:

  • libuv handles wRepeatCount (libuv/src/win/tty.c:949). When Windows coalesces repeated KEY_EVENT records, libuv re-emits the byte output for each repeat count. Fresh's vt_input.rs ignores wRepeatCount entirely, losing bytes from the VT stream.

  • Neovim defaults to mode 1002, not 1003 (src/nvim/tui/tui.c:1346). Mode 1003 is only enabled when mousemoveevent is explicitly set. This reduces the event rate by orders of magnitude, keeping the console input buffer from overflowing.

  • Neovim does NOT use ConPTY self-hosting for its own TUI (see next section). It uses a single process with ENABLE_VIRTUAL_TERMINAL_INPUT, exactly like Fresh's vt_input.rs.

Fix required: handle wRepeatCount in vt_input.rs and add corrupt sequence detection.

Final Solution

Console mode (matching Microsoft Edit)

ENABLE_VIRTUAL_TERMINAL_INPUT | ENABLE_WINDOW_INPUT

NO ENABLE_MOUSE_INPUT, NO ENABLE_EXTENDED_FLAGS (the latter disables Quick Edit mode and stays disabled after exit if cleanup doesn't restore it). Mouse events arrive as VT sequences in KEY_EVENT records.

Mouse tracking mode

\x1b[?1003;1006h — All Motion Mouse Tracking (1003) + SGR Mouse Mode (1006).

Mode 1003 generates high event volume under rapid mouse movement. The dedicated reader thread and corrupt sequence detection (strip_corrupt_mouse) mitigate this. Neovim and Microsoft Edit default to mode 1002 (cell-motion) to avoid the volume, but Fresh uses 1003 for full hover/mousemove support.

ReadConsoleInputW

Standard ReadConsoleInputW after WaitForSingleObject on the stdin handle. 16384-entry INPUT_RECORD buffer.

Dedicated reader thread

VtInputReader spawns a background thread that continuously drains the console buffer. While the conhost bug isn't fully preventable, fast draining reduces its frequency. The thread signals a Windows Event object for efficient multiplexing.

InputParser changes

  • flush_timeout() removed — ESC is never flushed by timeout
  • A lone ESC stays in the buffer until the next byte disambiguates it: [ → CSI sequence, O → SS3, another \x1b → standalone ESC, anything else → Alt+key
  • This prevents mouse sequences split across ReadConsoleInputW batches from having their ESC consumed as a standalone keypress

crossterm bypass

On Windows, crossterm's EnableMouseCapture/DisableMouseCapture are skipped entirely. They clobber the console mode and don't write VT sequences. Mouse handling is done by win_vt_input module.

Architecture

Both direct mode and client/server relay mode share the same code:

win_vt_input::enable_vt_input()     → set console mode (no ENABLE_MOUSE_INPUT)
win_vt_input::enable_mouse_tracking() → write \x1b[?1003;1006h to stdout
VtInputReader::spawn()               → background thread drains console buffer
InputParser                          → parse VT bytes into crossterm Events

How Microsoft Edit Handles This

Microsoft Edit (github.com/microsoft/edit) uses the same approach:

  • Console mode: ENABLE_VIRTUAL_TERMINAL_INPUT | ENABLE_WINDOW_INPUT | ENABLE_EXTENDED_FLAGS
  • Mouse tracking: \x1b[?1002;1006;2004h (1002 cell-motion; Fresh uses 1003 instead for hover support)
  • Reads via ReadConsoleInputExW with CONSOLE_READ_NOWAIT
  • VT parser with 100ms ESC timeout, state-machine based
  • No ENABLE_MOUSE_INPUT
  • Explicitly rejects legacy console: "This application does not support the legacy console."
  • Supports bracketed paste via \x1b[?2004h (works through ENABLE_VIRTUAL_TERMINAL_INPUT)

How crossterm Handles This (and why we bypass it)

On Windows, crossterm:

  • enable_raw_mode(): masks off ENABLE_LINE_INPUT | ENABLE_ECHO_INPUT | ENABLE_PROCESSED_INPUT
  • EnableMouseCapture: calls SetConsoleMode(ENABLE_MOUSE_INPUT | ENABLE_EXTENDED_FLAGS | ENABLE_WINDOW_INPUT)replaces entire mode, removes VT input. Returns is_ansi_code_supported() = false so no VT tracking sequences are written.
  • event_read(): uses ReadConsoleInputW to read structured events including MOUSE_EVENT records

This approach is incompatible with VT mouse tracking. crossterm on Windows uses the legacy Win32 mouse API, not VT sequences.

How Neovim Actually Handles Windows Input (Source Code Analysis)

Critical correction: Neovim does NOT use ConPTY self-hosting for its own TUI input. This section was previously titled "Alternative Approach: ConPTY Self-Hosting (Neovim)" based on incorrect assumptions. Analysis of Neovim's actual source code (cloned and read in March 2026) reveals a much simpler architecture that is essentially identical to Fresh's vt_input.rs approach.

Neovim's architecture (single process, no ConPTY for TUI)

Neovim uses ConPTY only for :terminal buffers (hosting child shells). For its own TUI input, Neovim is a single process that reads directly from the console:

  1. ENABLE_VIRTUAL_TERMINAL_INPUT on console stdin (src/nvim/event/stream.c:65-72):

    c
    } else if (type == UV_TTY) {
        uv_tty_init(&loop->uv, &stream->uv.tty, fd, 0);
        uv_tty_set_mode(&stream->uv.tty, UV_TTY_MODE_RAW);
        DWORD dwMode;
        if (GetConsoleMode(stream->uv.tty.handle, &dwMode)) {
            dwMode |= ENABLE_VIRTUAL_TERMINAL_INPUT;
            SetConsoleMode(stream->uv.tty.handle, dwMode);
        }
    
  2. libuv's uv_tty reads one INPUT_RECORD at a time (libuv/src/win/tty.c:768). It stays in a tight loop: ReadConsoleInputW(handle, &record, 1, &count). No allocation, no channel sends between records. The console buffer drains as fast as possible.

  3. libuv handles wRepeatCount (libuv/src/win/tty.c:949):

    c
    if (--KEV.wRepeatCount > 0) {
        handle->tty.rd.last_key_offset = 0;
        continue;
    }
    

    If Windows coalesces events, libuv re-emits the byte output for each repeat.

  4. libuv ignores non-KEY_EVENT records (libuv/src/win/tty.c:788). Mouse events, focus events, etc. are silently dropped. With ENABLE_VIRTUAL_TERMINAL_INPUT, mouse SGR sequences and bracketed paste markers arrive as VT bytes in KEY_EVENT UnicodeChar fields, so they pass through automatically.

  5. Mode 1002 by default (src/nvim/tui/tui.c:1346). Neovim enables kTermModeMouseButtonEvent (mode 1002) + kTermModeMouseSGRExt (mode 1006). Mode 1003 (all-motion) is only enabled when mousemoveevent is explicitly set.

  6. Bracketed paste in the VT parser (src/nvim/tui/input.c:532-571). handle_bracketed_paste() detects \x1b[200~ / \x1b[201~ markers inline in the byte stream. Split sequences (marker split across reads) are handled by returning incomplete=true and waiting for the next read, with a timer fallback (ttimeoutlen). Paste content is sent via nvim_paste RPC with phase tracking (1=first, 2=continue, 3=last).

Why Neovim doesn't have the corruption bug

  • wRepeatCount is handled — no lost bytes from event coalescing
  • Mode 1002 by default — far fewer events than mode 1003
  • Tight read loop — libuv reads one record at a time without leaving the loop, minimizing the window for buffer overflow

The 4KB conhost bug is irrelevant to this approach

The 4KB buffer boundary bug (PR #17738) occurs in conhost's VtInputThreadStateMachine::ProcessString path, which handles ConPTY pipe input. With ENABLE_VIRTUAL_TERMINAL_INPUT, the terminal emulator generates VT sequences directly and delivers them as KEY_EVENT records. The conhost output-side VT generation code is not involved, so the 4KB bug does not apply.

Previous ConPTY self-hosting attempt (abandoned)

Fresh previously implemented a ConPTY self-hosting parent shim (conpty_host.rs) that used ENABLE_MOUSE_INPUT to read structured MOUSE_EVENT records and synthesized SGR VT sequences. This fixed mouse but broke bracketed paste entirely — the parent intercepts input before the terminal emulator's VT layer, so paste markers are never generated. This approach is being removed in favor of fixing the direct VT input path.

References

Crate Extraction: fresh-winterm

All Windows-specific terminal I/O code has been extracted into the fresh-winterm crate (crates/fresh-winterm/). This isolates the console hacks from the editor core and provides a clean boundary for future ConPTY work.

What moved

ModuleFromTo
VT input (console mode, ReadConsoleInputW, VtInputReader)client/win_vt_input.rsfresh-winterm/src/vt_input.rs
Client relay loopclient/relay_windows.rsfresh-winterm/src/relay.rs
Terminal size queryclient/mod.rs (Windows cfg block)fresh-winterm/src/terminal_size.rs

Decoupling via RelayConnection trait

The relay loop previously depended on ClientConnection, ClientControl, and ServerControl from the editor crate. These are replaced by a RelayConnection trait in fresh-winterm:

rust
pub trait RelayConnection {
    fn try_read_data(&mut self, buf: &mut [u8]) -> io::Result<usize>;
    fn try_read_control_byte(&mut self, buf: &mut [u8; 1]) -> io::Result<usize>;
    fn write_data(&mut self, buf: &[u8]) -> io::Result<()>;
    fn send_resize(&mut self, cols: u16, rows: u16) -> io::Result<()>;
    fn handle_server_control(&mut self, msg: &str) -> Option<RelayExitReason>;
}

The editor crate implements this trait on ClientConnection (~20 lines). All protocol serialization (serde_json, ClientControl/ServerControl) stays in the editor crate. fresh-winterm depends only on windows-sys and tracing.

Why this matters

All Windows console complexity is isolated in fresh-winterm. The editor crate calls high-level functions (enable_vt_input, VtInputReader::spawn) without knowing about console modes, INPUT_RECORD processing, or Windows API details. The ConPTY self-hosting code (conpty_host.rs, mouse_sgr.rs) can be removed from fresh-winterm once the direct VT input path is fixed.

Research: Bundling conpty.dll Does NOT Fix Direct VT Input

Investigation revealed that PR #17738's fix applies to a different code path than what Fresh uses for direct console input:

ConPTY pipe input (what PR #17738 fixes)ENABLE_VIRTUAL_TERMINAL_INPUT (what Fresh uses)
Code pathVtInputThread -> StateMachine::ProcessStringTerminalInput::HandleKey -> per-event VT generation
Data sourcePipe from hosting terminal (4096-byte ReadFile buffer)Keyboard events from OS (no buffer, per-event conversion)
4KB boundary bugYes — fixed by PR #17738No — different code path entirely
Fixed by bundled DLLYesNo — conhost attached at process creation, can't be replaced

Bundling conpty.dll / OpenConsole.exe (as VS Code and WezTerm do) only helps applications that host children through ConPTY. It cannot change the conhost.exe that was attached to Fresh's process at creation time. The corruption Fresh sees under mode 1003 with direct console reads is a different bug (likely console input buffer overflow under high event rates), not the pipe-boundary bug that PR #17738 fixes.

Implication

The 4KB bug is not relevant to Fresh's direct VT input approach. The corruption under mode 1003 is caused by wRepeatCount being ignored and console input buffer overflow under extreme event rates — both fixable without ConPTY.

Appendix: Origin of the ConPTY Self-Hosting Pattern (historical context)

The self-hosting pattern (application spawning itself behind ConPTY) was not invented by Neovim. The pattern evolved from several predecessors. Note: Neovim does NOT use this pattern for its own TUI — it uses ConPTY only for :terminal buffers. Fresh previously attempted this pattern but is removing it in favor of fixing direct VT input.

  1. winpty (2011-2018): The earliest form of self-hosting on Windows. winpty spawned a hidden conhost.exe and used screen-scraping (ReadConsoleOutputW polling) to reverse- engineer VT output. Applications like Git Bash, MinTTY, and early Neovim used winpty to host interactive console programs. The approach was fragile (CPU-intensive polling, font- dependent character width calculations, orphaned agent processes).

  2. ConPTY API (Windows 10 1809, October 2018): Microsoft introduced CreatePseudoConsole as a native replacement for winpty. ConPTY intercepts Win32 Console API calls and translates them to VT sequences on pipes, eliminating the screen-scraping approach.

  3. Neovim PR #11390 (2019): Neovim adopted ConPTY for its embedded terminal emulator (:terminal). The child process runs behind ConPTY, and Neovim's libvterm parses the VT output. This was a direct replacement for the winpty dependency.

  4. VS Code (2019-2026): Used winpty initially, migrated to ConPTY, and in v1.90 (2026) dropped winpty entirely. VS Code also pioneered bundling conpty.dll for application- local updates independent of the OS.

  5. WezTerm, Alacritty, Warp: All modern GPU-accelerated terminal emulators use ConPTY exclusively on Windows.

The key distinction: terminal emulators use ConPTY to host other programs (standard use case). The self-hosting variant (application hosting itself) is used when a TUI application needs clean VT I/O but must also render its own UI to the real console. Neovim's --embed flag is the canonical example.

ConPTY engineering challenges

  • VT sequence pollution: ConPTY injects DECSET/DECRST, color resets, and title sequences into the output stream. A single keystroke can trigger 200+ bytes of state-management sequences. TUI apps that render their own output must tolerate this overhead.
  • Cursor inheritance deadlock: PSEUDOCONSOLE_INHERIT_CURSOR can cause permanent hangs during ClosePseudoConsole if the DSR response race is lost. WezTerm and VS Code disable this flag entirely.
  • Antivirus false positives: Headless conhost.exe spawning triggers heuristic AV/EDR detections (process hollowing indicators). Requires code signing and whitelisting.
  • conpty.dll decoupling: Microsoft now distributes conpty.dll separately from the OS. Apps can bundle it for fixes independent of Windows Update. VS Code exposes terminal.integrated.windowsUseConptyDll for this.

Plan: Drop ConPTY Self-Hosting, Fix Direct VT Input

What to do

Remove the ConPTY self-hosting parent shim and fix the direct VT input path instead. The ConPTY approach was built on incorrect assumptions about the root cause of mouse corruption. Neovim proves the direct ENABLE_VIRTUAL_TERMINAL_INPUT approach works.

Step 1: Fix wRepeatCount in vt_input.rs (critical)

For each KEY_EVENT with wRepeatCount > 1, emit the character byte N times. This matches libuv's behavior and prevents byte loss under event coalescing.

rust
// Current (broken):
if key_event.bKeyDown != 0 && ch != 0 {
    // ... encode ch once ...
}

// Fixed:
if key_event.bKeyDown != 0 && ch != 0 {
    let repeat = key_event.wRepeatCount.max(1) as usize;
    // ... encode ch ...
    for _ in 1..repeat {
        vt_bytes.extend_from_slice(&encoded_bytes);
    }
}

Step 2: Keep mode 1003 with corrupt sequence detection

Fresh uses mode 1003 (all-motion) for full hover/mousemove support. The higher event volume is mitigated by the dedicated reader thread (fast buffer drain) and strip_corrupt_mouse() which detects and discards sequences where the Windows console dropped the leading ESC byte.

Step 3: Remove ConPTY self-hosting code

Remove from fresh-winterm:

  • conpty_host.rs — the parent shim
  • mouse_sgr.rs — MOUSE_EVENT to SGR synthesis (only used by conpty_host)
  • Related integration tests
  • --conpty-child argument handling in editor startup

Keep fresh-winterm for:

  • vt_input.rs — console mode, ReadConsoleInputW, VtInputReader
  • relay.rs — client relay loop
  • terminal_size.rs — terminal size query

Step 4: Periodic console mode heartbeat

Re-assert ENABLE_VIRTUAL_TERMINAL_INPUT | ENABLE_WINDOW_INPUT every ~30 seconds. The ConPTY layer silently corrupts console mode flags after prolonged use (microsoft/terminal#19674). This is a defensive measure, not a fix for the mouse corruption.

Step 5: Verify bracketed paste works

With ConPTY removed, the terminal emulator's \x1b[200~ / \x1b[201~ markers flow directly through KEY_EVENT records to vt_input.rs and into the existing InputParser. No synthesis needed. Verify with manual testing.

Benefits

  • Bracketed paste works natively — terminal emulator handles it
  • Simpler architecture — single process, no parent/child relay
  • Mouse works reliably — with wRepeatCount fix, corrupt sequence detection, and dedicated reader thread
  • Cross-platform consistency — same VT byte stream as Unix
  • Less code — remove ~700 lines of ConPTY shim + SGR synthesis

Future improvements

  1. Mode 1002 fallback: config flag to switch to cell-motion mode for users experiencing issues under extreme event rates on older Windows builds.
  2. Runtime conhost version detection: if the OS conhost ships the PR #17738 fix, the corrupt sequence detection may become unnecessary.