eden/docs/Engineering/CASC/content_addressed_source_control.md
CASC stands for Content-Addressed-Source-Control, the project aiming to utilise CAS in Source Control to solve a class of problems, initially for the fbsource megarepo.
Sapling Cache Issues:
Caches Hierarchy and Engineering Time:
EdenFS Kernel caches => EdenFS In-Memory caches => EdenFS RocksDB caches => Sapling Cache (hgcache) => Mononoke In-Memory cache => Memcache => Hedwig => Manifold
Integrating with CAS enables us to shift our focus towards other Source Control challenges.
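To make the cost of the hierarchy above concrete, here is a minimal sketch of a read-through lookup across several cache tiers. It is not the EdenFS or Sapling implementation: the `Tier` class, the `fetch` helper, and the `origin` callback are hypothetical names introduced only for illustration, and the tier labels are borrowed from the chain above.

```python
from typing import Callable, Dict, List, Tuple


class Tier:
    """Hypothetical stand-in for one cache layer in the chain."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, bytes] = {}


def fetch(
    key: str, tiers: List[Tier], origin: Callable[[str], bytes]
) -> Tuple[bytes, str]:
    """Return (blob, name of the layer that served it)."""
    missed: List[Tier] = []
    data, source = None, "origin"
    # Walk the tiers from fastest to slowest until one has the blob.
    for tier in tiers:
        if key in tier.store:
            data, source = tier.store[key], tier.name
            break
        missed.append(tier)
    # Every tier missed: fall through to the backend (assumed callback).
    if data is None:
        data = origin(key)
    # Backfill each tier that missed so the next lookup is served earlier.
    for tier in missed:
        tier.store[key] = data
    return data, source
```

Even in this toy form, every tier adds a miss path and a backfill path, each with its own behavior to maintain and debug.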
Mononoke Overloads:
eden prefetch has been gaining popularity as a way to mitigate the accumulated remote fetch latency of sequential FUSE fetches, which causes poor performance for tools (especially hack-based ones) and a longer TTS (time to signal) for users' DIFFs.
Any spike in the amount of well-batched traffic, coming typically from prefetching, is a common cause of Mononoke SEVs.
Furthermore, the ongoing Crawling Prediction Project is expected to lead to an increase in unbatched traffic.
Eden Light:
By utilizing CASC, the memory footprint of individual EdenFS daemons on SCM on RE is significantly reduced, thereby enabling the allocation of multiple workers on a single host and ultimately enhancing platform efficiency.
The Local Cache Hit Rate is anticipated to surpass that of the Sapling Cache on OnDemand and Sandcastle, resulting in lower end-to-end latency for EdenFS.
To learn more about the caching flow for CASC and prior to CASC, please visit the EdenFS Caching Flow page.
Why are Persistent Caches on On Demand important?
The lifetime of a repository on On Demand comprises the preparation cycle and the user session, and cannot exceed 18 hours. The same limit applies to the lifetime of the Sapling Cache COW mount data, the EdenFS daemon, and the repo checkout. Without a prefetched Sapling Cache, and with resource-intensive tools like meerkat in use, most of the repository's data (such as www) would be refetched from scratch at least daily on every host. This would put an unsustainable load on Mononoke, our Source Control backend.
The Persistent Caches provided by CASC would allow us to deprecate the expensive full (www) repo prefetching and significantly simplify the repo cloning mechanisms for Developer Environments.
Why are Persistent Caches on Sandcastle important?
Local caching in Sandcastle reduces the load on backend services by handling a significant amount of traffic, thereby shortening job durations. However, in Sandcastle, the hgcache is read/write mounted by the TW agent and subsequently read/write mounted into a job container. This setup effectively grants every job write access to the hgcache on the physical host, and the cache is not cleaned up for the entire lifespan of the host. The Sapling Cache lacks the integrity checks needed to prevent corruption by malicious users, who could alter the cache to modify generated builds.
Unlike the Sapling Cache, the CASd local cache performs hash validation checks and restricts write access exclusively to the cas-daemon process on a physical host, while still allowing direct read-only access to all clients of the cache. Its LRU (Least Recently Used) eviction strategy also ensures that unused blobs are removed in a timely manner.
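The two properties described above, hash validation on write and LRU eviction, can be illustrated with a small in-memory sketch. This is not the CASd implementation: the `ValidatingLruCache` class is hypothetical, the choice of SHA-256 as the digest is an assumption, and the real cache is on disk and enforces write access at the process level, which this toy does not model.

```python
import hashlib
from collections import OrderedDict
from typing import Optional


class ValidatingLruCache:
    """Toy content-addressed cache; keys are hex digests of the blob."""

    def __init__(self, max_blobs: int) -> None:
        self._max_blobs = max_blobs
        self._blobs: "OrderedDict[str, bytes]" = OrderedDict()

    @staticmethod
    def digest(data: bytes) -> str:
        # Assumption: SHA-256 stands in for whatever digest the real cache uses.
        return hashlib.sha256(data).hexdigest()

    def put(self, key: str, data: bytes) -> None:
        # Reject any blob whose content does not hash to its claimed key.
        if self.digest(data) != key:
            raise ValueError("digest mismatch, refusing to cache blob")
        self._blobs[key] = data
        self._blobs.move_to_end(key)  # mark as most recently used
        # Evict least recently used blobs once over capacity.
        while len(self._blobs) > self._max_blobs:
            self._blobs.popitem(last=False)

    def get(self, key: str) -> Optional[bytes]:
        data = self._blobs.get(key)
        if data is None:
            return None
        self._blobs.move_to_end(key)  # reads also refresh recency
        return data
```

Because the key is derived from the content, a tampered blob cannot be stored under an existing key without failing the digest check, which is the property the Sapling Cache lacks.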
As worker types tend to be sticky and affinities are expected to improve, the persistent local caches should serve the majority of the EdenFS fetches required for build operations (buck/meerkat).