docs/internal/project-search-replace.md
Status: Feature-complete; needs tests and polish
Date: March 2026
## Background

The original search_replace plugin shelled out to git grep for search and used raw std::fs for replacement. This bypassed the FileSystem trait (broken over SSH), bypassed the buffer model (no undo, stale results, encoding mangling), and had no large-file support.
The editor already has all the machinery to solve these problems: piece tree chunked search, lazy loading through the FileSystem trait, and incremental non-blocking scanning.
## Architecture

Three existing layers carry the feature: the FileSystem trait for I/O, TextBuffer for search and edit, and the plugin API for UI.

The same TextBuffer code path handles small and large files. load_large_file creates lazy piece tree nodes; get_text_range_mut loads chunks on demand; chunks are searched and can be discarded.

### Search flow

Plugin: editor.grepProjectStreaming(pattern, opts, progressCallback)
→ Rust: snapshot dirty buffers on main thread
→ Rust: spawn tokio task
→ Walker: ignore::WalkBuilder respects .gitignore
→ Per file (8 parallel via semaphore):
- If dirty snapshot exists → wrap in TextBuffer, search_scan_all()
- Else → fs.read_file(), wrap in TextBuffer, search_scan_all()
- SearchMatch results (byte_offset, length, line, column, context)
→ convert to GrepMatch JSON → send via AsyncBridge
→ Plugin receives streaming progress callbacks + final resolution
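In Rust terms, the pipeline condenses to roughly the sketch below. Everything here is illustrative: grep_project_streaming and MAX_PARALLEL are hypothetical names, the mpsc channel stands in for the AsyncBridge, tokio::fs stands in for the FileSystem trait, and a plain find_iter stands in for wrapping content in a TextBuffer and calling search_scan_all().

```rust
use std::{collections::HashMap, path::PathBuf, sync::Arc};
use ignore::WalkBuilder;
use regex::Regex;
use tokio::sync::{mpsc, Semaphore};

const MAX_PARALLEL: usize = 8; // per-file search concurrency (hypothetical constant)

/// Sketch only: `tx` stands in for the AsyncBridge, `tokio::fs` for the
/// FileSystem trait, and `find_iter` for TextBuffer::search_scan_all().
async fn grep_project_streaming(
    root: PathBuf,
    regex: Regex,
    dirty: Arc<HashMap<PathBuf, String>>, // buffer snapshots taken on the main thread
    tx: mpsc::Sender<(PathBuf, usize, usize)>, // (file, byte_offset, length)
) {
    let sem = Arc::new(Semaphore::new(MAX_PARALLEL));
    // WalkBuilder respects .gitignore (and .ignore files) by default.
    for entry in WalkBuilder::new(&root).build().flatten() {
        if !entry.file_type().is_some_and(|t| t.is_file()) {
            continue;
        }
        let permit = sem.clone().acquire_owned().await.expect("semaphore not closed");
        let (path, regex, dirty, tx) =
            (entry.into_path(), regex.clone(), dirty.clone(), tx.clone());
        tokio::spawn(async move {
            let _permit = permit; // held for the lifetime of this file's scan
            // Dirty buffer snapshots win over on-disk content.
            let text = match dirty.get(&path) {
                Some(snapshot) => snapshot.clone(),
                None => match tokio::fs::read_to_string(&path).await {
                    Ok(s) => s,
                    Err(_) => return, // unreadable or non-UTF-8: skip the file
                },
            };
            for m in regex.find_iter(&text) {
                if tx.send((path.clone(), m.start(), m.end() - m.start())).await.is_err() {
                    return; // receiver dropped: search was cancelled
                }
            }
        });
    }
}
```

The semaphore bounds concurrency at the file level, so one huge file cannot starve the other seven workers, and dropping the receiver cancels the whole scan naturally.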
### Replace flow

Plugin: editor.replaceInFile(filePath, matches, replacement)
→ Rust: open file as buffer if needed (hidden_from_tabs)
→ Sort matches descending by byte_offset
→ Apply all edits as single bulk operation (single undo)
→ Save via FileSystem trait
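The descending sort is what makes a single bulk pass safe: applying at larger byte offsets first means the earlier offsets never shift. A minimal sketch over a plain String (the real path goes through the buffer's bulk-edit API so the whole operation is one undo step):

```rust
/// Apply all matches back to front so earlier byte offsets stay valid
/// as the text shifts. Each match is a (byte_offset, length) pair.
fn apply_replacements(text: &mut String, mut matches: Vec<(usize, usize)>, replacement: &str) {
    matches.sort_by(|a, b| b.0.cmp(&a.0)); // descending by byte_offset
    for (offset, len) in matches {
        text.replace_range(offset..offset + len, replacement);
    }
}

fn main() {
    let mut text = String::from("foo bar foo");
    apply_replacements(&mut text, vec![(0, 3), (8, 3)], "baz");
    assert_eq!(text, "baz bar baz");
}
```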
## Unified search path

All search (built-in single-buffer, synchronous project grep, and streaming project grep) now goes through the same code path:
- SearchMatch struct: byte_offset, length, line, column, context
- ChunkedSearchState: mutable scan state with an incremental running_line counter
- search_scan_init(regex, max_matches, query_len): creates state from prepare_line_scan()
- search_scan_next_chunk(&mut state): processes one chunk, computing line/column/context on the fly; line counting is O(chunk_size) via an incremental cursor
- search_scan_all(regex, max_matches, query_len): synchronous variant for spawn_blocking

The editor's SearchScanState wraps ChunkedSearchState plus editor-specific fields; process_search_scan_batch delegates to buffer.search_scan_next_chunk().
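The incremental-cursor idea in isolation (a sketch, not the actual ChunkedSearchState): line and column bookkeeping advances once per byte, left to right, so per-chunk cost stays O(chunk_size) no matter how many matches land in the chunk. Matches spanning a chunk boundary, which the real code catches via the overlap window, are ignored here.

```rust
use regex::Regex;

/// Running position state carried between chunks.
struct ScanCursor {
    consumed: usize,   // absolute bytes folded into the counters so far
    line: usize,       // 0-based line number at offset `consumed`
    line_start: usize, // absolute byte offset where the current line begins
}

/// Scan one chunk, emitting (byte_offset, line, column) per match.
fn scan_chunk(c: &mut ScanCursor, chunk: &str, re: &Regex, out: &mut Vec<(usize, usize, usize)>) {
    let bytes = chunk.as_bytes();
    // Fold newlines into the counters up to `end`, never revisiting a byte.
    let fold_to = |c: &mut ScanCursor, folded: &mut usize, end: usize| {
        for i in *folded..end {
            if bytes[i] == b'\n' {
                c.line += 1;
                c.line_start = c.consumed + i + 1;
            }
        }
        *folded = end;
    };
    let mut folded = 0; // bytes of this chunk already counted
    for m in re.find_iter(chunk) {
        fold_to(c, &mut folded, m.start()); // count newlines up to the match, once
        let abs = c.consumed + m.start();
        out.push((abs, c.line, abs - c.line_start));
    }
    fold_to(c, &mut folded, chunk.len()); // count the remainder of the chunk
    c.consumed += chunk.len();
}
```

A natural unit test for this shape feeds the same text through different chunk sizes and asserts identical output, which is essentially the missing search_scan_next_chunk coverage listed under remaining work.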
handle_grep_project and handle_grep_project_streaming create temporary TextBuffer instances and call search_scan_all(). The old collect_matches_from_bytes (with its duplicated regex/case-folding/whole-word logic) is deleted. Whole-word matching uses \b...\b in the regex, same as built-in search.
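For concreteness, a sketch of how that shared pattern construction can look; build_search_regex and its flags are illustrative, not the actual editor API. A literal user query is escaped, then wrapped in \b anchors when whole-word is on:

```rust
use regex::{escape, Regex, RegexBuilder};

/// Illustrative sketch: build the one regex all search paths share.
fn build_search_regex(query: &str, whole_word: bool, case_sensitive: bool) -> Regex {
    let mut pattern = escape(query); // treat the query as a literal
    if whole_word {
        pattern = format!(r"\b{pattern}\b");
    }
    RegexBuilder::new(&pattern)
        .case_insensitive(!case_sensitive)
        .build()
        .expect("an escaped literal is always a valid pattern")
}
```

Keeping this in one place is what let the duplicated case-folding and whole-word logic in collect_matches_from_bytes be deleted.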
## Plugin side (search_replace.ts)

- Debounced input via editor.delay()
- Streams results through the grepProjectStreaming progress callback
- replaceInFile: groups matches by file, applies the edits, saves
- All UI strings via editor.t() (~140 messages in search_replace.i18n.json)

## Bindings (quickjs_backend.rs)

- grepProjectStreaming is exposed with a custom JS wrapper; its d.ts is auto-generated via the ts_raw proc macro attribute
- replaceInFile returns ReplaceResult { replacements, buffer_id }
- GrepMatch type with file, buffer_id, byte_offset, length, line, column, context

## Known limitations

### Long lines truncate context at the chunk overlap

The overlap window between chunks is max(query_len, 256) bytes. If a match sits on a line longer than the overlap, the reported column is relative to the overlap start (not the true line start) and the context is truncated. This affects < 1% of real code; increasing the overlap would mean more redundant scanning.
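The mechanics of the caveat, as an assumed shape rather than the real helper: context extraction can only look back as far as the overlap window, so when no newline falls inside it, the window edge is reported as the line start.

```rust
/// Find the reported line start for a match: the nearest preceding newline,
/// but never further back than the overlap window (max(query_len, 256)).
fn reported_line_start(haystack: &[u8], match_at: usize, query_len: usize) -> usize {
    let overlap = query_len.max(256);
    let window_start = match_at.saturating_sub(overlap);
    haystack[window_start..match_at]
        .iter()
        .rposition(|&b| b == b'\n')
        .map(|i| window_start + i + 1) // true line start found inside the window
        .unwrap_or(window_start)       // long line: clipped to the window edge
}
```

The reported column is match_at minus this value, which is why it drifts exactly when no newline lands inside the window.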
### max_results is a soft cap

The 8 parallel searchers check match_count with relaxed atomics, so the total can slightly exceed max_results. This is acceptable: it is a UI responsiveness limit, not a hard contract.
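The soft cap in miniature: the load and the increment are separate Relaxed operations, so two workers near the limit can both pass the check, a deliberate trade as noted above.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Best-effort cap shared by the parallel workers. Because the load and the
/// increment are separate Relaxed operations, workers can race past the
/// limit, so the final total may slightly exceed `max_results`.
fn try_record_match(match_count: &AtomicUsize, max_results: usize) -> bool {
    if match_count.load(Ordering::Relaxed) >= max_results {
        return false; // tell the caller to stop scanning this file
    }
    match_count.fetch_add(1, Ordering::Relaxed);
    true
}
```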
### \b word boundaries are ASCII-centric

As compiled here, \b matches ASCII word boundaries. For non-ASCII identifiers (e.g., café), whole-word matches can be missed. Same limitation as built-in search.
## Remaining work

Unit tests:
- search_scan_next_chunk
- Project grep handlers: handle_grep_project with open dirty buffers vs unopened files

E2E test gaps:
- editor.debug() calls (search_replace_enter, search_replace_tab)
- load_large_file for files above a project-grep-specific threshold