docs/internal/remote-filesystem-optimization.md
This document outlines strategies to improve memory efficiency and performance when handling large files over remote SSH connections.
The current implementation of `RemoteFileSystem` and the Python `agent.py` suffers from $O(\text{FileSize})$ memory consumption during several key operations:
- `write` and `sudo_write` commands require the entire file content to be sent as a single JSON message. This causes RAM spikes in both the Rust editor and the Python agent.
- `RemoteFileWriter` buffers all data locally in a `Vec<u8>` and only transmits it upon `sync_all()`.
- `open_file_for_append` and `set_file_length` are implemented by downloading the entire file, modifying it in memory, and re-uploading it.
- The `PieceTree` is flattened into a contiguous buffer, nullifying the benefits of lazy loading during the save process.

Instead of replacing the entire file, the protocol should support structured updates.
- `Copy(src_offset, len)` and `Insert(data)`.

Support multi-part uploads to avoid massive single allocations.
- Add `open_write_session(path)` and `write_chunk(session_id, data)` commands.
- `RemoteFileWriter` flushes its internal buffer to the agent whenever a threshold (e.g., 64 KB) is reached.

Move logic that doesn't require editor intervention to the agent.
- Add `append(path, data)` and `truncate(path, length)` to the Python agent.

Directly integrate the `PieceTree` logic with the remote protocol.
- `BufferLocation::Stored`: The agent reads directly from the local file on the remote disk.
- `BufferLocation::Added`: The editor sends the modified content.

While the proposed optimizations focus on custom protocol extensions, several alternative architectures could achieve similar goals:
Rather than implementing custom "Bake" or "Patch" logic, the system could use a block-level delta algorithm (similar to rsync).
- `PieceTree` already tracks exact change locations.

Adopt a Git-like approach where file content is managed as a collection of hashed blobs.
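
As a rough illustration of the hashed-blob idea, the sketch below splits content into chunks keyed by their SHA-256 digest and works out which chunks the remote side is missing. The fixed 64 KB chunk size and the names `split_into_blobs` and `blobs_to_send` are assumptions made for this example, not part of the existing protocol.

```python
import hashlib

CHUNK_SIZE = 64 * 1024  # assumed chunk size, chosen only for illustration


def split_into_blobs(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[tuple[str, bytes]]:
    """Split data into fixed-size chunks, each paired with its SHA-256 hex digest."""
    blobs = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        blobs.append((hashlib.sha256(chunk).hexdigest(), chunk))
    return blobs


def blobs_to_send(manifest: list[str], remote_has: set[str]) -> list[str]:
    """Return the digests the remote lacks, in first-seen order, without duplicates."""
    seen: set[str] = set()
    missing = []
    for digest in manifest:
        if digest not in remote_has and digest not in seen:
            seen.add(digest)
            missing.append(digest)
    return missing
```

Only the missing blobs would need to cross the network. Note that with fixed-size chunks, an insertion near the start of a file shifts every later chunk boundary, so most digests change.
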
Utilize filesystem-level sparse file support or random-access writes if the protocol can be extended to support "Write at Offset".
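
A "Write at Offset" command could be handled on the agent side roughly as follows. This is a minimal sketch: the function name `write_at_offset` and its signature are hypothetical, and it relies only on standard Python file operations.

```python
def write_at_offset(path: str, offset: int, data: bytes) -> int:
    """Overwrite bytes in an existing file starting at `offset`.

    Only `data` is held in memory; the rest of the file is never read.
    """
    # "r+b" opens for reading and writing without truncating the file.
    with open(path, "r+b") as f:
        f.seek(offset)
        return f.write(data)
```

If `offset` lies past the current end of the file, the gap reads back as zero bytes, and filesystems with sparse file support may avoid allocating disk blocks for it.
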
| Approach | Implementation Complexity | RAM Efficiency | Network Efficiency |
|---|---|---|---|
| Chunked Streaming | Medium | High | Medium |
| Remote Bake | High | High | High |
| Delta Algorithm | High | High | High |
| Sparse Writes | Low | Medium | Medium |
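
To make the Remote Bake row more concrete, here is a minimal sketch of how an agent might rebuild a file from a recipe of `Copy(src_offset, len)` and `Insert(data)` operations. The class layout, the `bake` and `bake_file` names, the block size, and the temporary-file suffix are all assumptions for illustration; `length` stands in for the `len` field to avoid shadowing the Python builtin.

```python
import os
from dataclasses import dataclass
from typing import BinaryIO, Iterable, Union


@dataclass
class Copy:
    src_offset: int
    length: int  # called `len` in the recipe description


@dataclass
class Insert:
    data: bytes


Op = Union[Copy, Insert]
BLOCK = 64 * 1024  # assumed copy block size


def bake(src: BinaryIO, dst: BinaryIO, recipe: Iterable[Op]) -> None:
    """Write the new file to dst, copying unchanged ranges from src in bounded blocks."""
    for op in recipe:
        if isinstance(op, Insert):
            dst.write(op.data)
            continue
        src.seek(op.src_offset)
        remaining = op.length
        while remaining > 0:
            block = src.read(min(BLOCK, remaining))
            if not block:
                raise EOFError("Copy range extends past end of source")
            dst.write(block)
            remaining -= len(block)


def bake_file(path: str, recipe: Iterable[Op]) -> None:
    """Bake into a temporary file, then replace the original with it."""
    tmp = path + ".bake.tmp"  # hypothetical temporary name
    with open(path, "rb") as src, open(tmp, "wb") as dst:
        bake(src, dst, recipe)
    os.replace(tmp, path)
```

With this shape, agent memory use is bounded by the block size plus the largest single `Insert` payload, and only inserted data has to cross the network.
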