skills/cache-expert/references/mutablecache.md
This doc covers a small set of specialized cache-backed objects whose internal state is meaningfully mutable:
- HTTPState
- RemoteGitMirror
- ClientFilesyncMirror
- CacheVolume

These are all a little different, but they share one important theme:
If these mutable snapshots lived somewhere ambiently outside the dagql cache model, they could consume disk space and mutate over time without the cache knowing they existed. By keeping them attached to real cache objects, we can:
All four of these objects follow the same broad shape:
- OnRelease
- PersistedSnapshotRefLinks, CacheUsageIdentities, CacheUsageSize, and CacheUsageMayChange

The most important difference between them is what they expose to callers:
HTTPState is the most controlled version of this pattern.
HTTPState stores:
The snapshot is stored as an immutable ref plus a persisted snapshot ID for reload.
The public Query.http API has two paths:
- HTTPState and does a direct fetch with FetchHTTPFile.
- _httpState object and then _resolve.

So the mutable-backed state object is specifically the normal unauthenticated/no-service-host path.
HTTPState.Resolve does conditional requests:
- If-None-Match when it has an ETag
- If-Modified-Since when it has a Last-Modified value

Then:
- On 304 Not Modified, it keeps the existing canonical snapshot and reuses it.
- On 200 OK, it downloads into a new canonical snapshot and compares the resulting content digest.

If the content digest changed, it replaces the stored snapshot. If the digest did not change, it releases the newly downloaded snapshot and keeps the existing one, while still updating validators like ETag and Last-Modified.
That is the mildly clever part: content identity, not just validator churn, decides whether the owned snapshot actually changes.
The canonical internal snapshot always stores the downloaded payload at a fixed path, contents, with canonical permissions.
When a caller resolves it to a File, HTTPState.fileResult:
- contents to the requested filename

So the internal mutable-ish state stays hidden in HTTPState, while callers still get ordinary immutable File results they can build on top of.
This is a good pattern to keep in mind: mutable internal backing state, immutable outward-facing result.
RemoteGitMirror is the backing store for remote git fetches.
It owns a mutable snapshot containing a bare git repository for a specific remote URL.
That snapshot is:
The schema creates or loads one through the internal _remoteGitMirror query field, and RemoteGitRepository stores it as a dependency.
When git needs to operate on the remote, it does not fetch directly into a throwaway checkout. Instead:
So the mirror itself is mutated in place over time as fetches happen.
Callers do not get the mutable bare mirror snapshot itself.
For a specific RemoteGitRef.Tree(...), the flow is:
So the mirror is mutable backing state for fetch acceleration and reuse, but the returned Directory snapshots are still immutable point-in-time checkouts.
That split is the key thing to understand.
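As a rough illustration of that split, the Go sketch below models the mirror as an in-memory map that fetches mutate in place, while each tree lookup hands back an independent copy. The `mirror` type and its `fetch` and `tree` methods are hypothetical and stand in for a bare repository and real checkouts.

```go
package main

import "sync"

// mirror is a hypothetical in-memory stand-in for the bare repository
// snapshot: it maps a ref name to the files at that ref.
type mirror struct {
	mu   sync.Mutex
	refs map[string]map[string]string // ref -> path -> content
}

func newMirror() *mirror {
	return &mirror{refs: map[string]map[string]string{}}
}

// fetch mutates the mirror in place, like a git fetch into the bare repo.
func (m *mirror) fetch(ref string, files map[string]string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	stored := make(map[string]string, len(files))
	for path, content := range files {
		stored[path] = content
	}
	m.refs[ref] = stored
}

// tree returns a point-in-time copy of a ref. Later fetches do not change
// copies that were already handed out, and edits to a copy do not reach
// the mirror.
func (m *mirror) tree(ref string) (map[string]string, bool) {
	m.mu.Lock()
	defer m.mu.Unlock()
	files, ok := m.refs[ref]
	if !ok {
		return nil, false
	}
	checkout := make(map[string]string, len(files))
	for path, content := range files {
		checkout[path] = content
	}
	return checkout, true
}
```

A checkout taken before a later `fetch` keeps its original contents, which is the property the returned Directory snapshots rely on.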
Remote git also uses GetOrInitArbitrary to cache ls-remote metadata as plain JSON strings scoped by session plus auth configuration.
That is related operationally, but it is not the mutable-snapshot part. The mutable-snapshot part is specifically the bare mirror object.
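For contrast with the mirror, here is a small Go sketch of a get-or-init cache for metadata strings scoped by session and auth configuration. It does not reproduce the real GetOrInitArbitrary signature: `metadataCache`, `scopeKey`, and `getOrInit` are hypothetical, and hashing the scope fields into the key is an assumption.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"sync"
)

// metadataCache is a hypothetical cache of plain string values, such as
// JSON-encoded ls-remote output.
type metadataCache struct {
	mu      sync.Mutex
	entries map[string]string
}

// scopeKey derives a key from the session, the remote, and the auth
// configuration, so different credentials never share an entry.
func scopeKey(sessionID, remoteURL, authConfig string) string {
	sum := sha256.Sum256([]byte(sessionID + "\x00" + remoteURL + "\x00" + authConfig))
	return hex.EncodeToString(sum[:])
}

// getOrInit returns the cached value for key, calling init only on a miss.
// The lock is held during init to keep the sketch simple.
func (c *metadataCache) getOrInit(key string, init func() (string, error)) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if v, ok := c.entries[key]; ok {
		return v, nil
	}
	v, err := init()
	if err != nil {
		return "", err
	}
	if c.entries == nil {
		c.entries = map[string]string{}
	}
	c.entries[key] = v
	return v, nil
}
```

Nothing here owns a snapshot: the cached value is an immutable string, which is why this part is operationally related but separate from the mutable bare mirror.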
ClientFilesyncMirror is the mutable backing store for host file imports.
It stores:
- filesync.MirrorSharedState

The mutable snapshot is persistable for stable clients.
Host imports use two modes:
- _clientFilesyncMirror keyed by stable client ID plus drive

So for stable clients, the mutable mirror survives beyond a single import and can be reused across reconnects.
EnsureCreated only ensures the mutable snapshot exists.
The mounted runtime state is created lazily in ensureRuntimeLocked when a sync actually needs it. acquire reference-counts active users, and when the last user releases, the runtime mount is torn down again.
That means:
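A minimal Go sketch of the lazy, reference-counted runtime lifecycle described above follows. It is a simplification under stated assumptions: `lazyMount` is a hypothetical type, the mount is created inside `acquire` rather than at sync time, and the `mounts` and `unmounts` counters exist only to make the behavior observable.

```go
package main

import "sync"

// lazyMount is a hypothetical model of runtime state that exists only
// while at least one user holds it.
type lazyMount struct {
	mu       sync.Mutex
	users    int
	mounted  bool
	mounts   int // how many times the runtime was created
	unmounts int // how many times it was torn down
}

// ensureRuntimeLocked creates the runtime state if it is missing.
// The caller must hold l.mu.
func (l *lazyMount) ensureRuntimeLocked() {
	if !l.mounted {
		l.mounted = true
		l.mounts++
	}
}

// acquire registers an active user and returns a release function.
// Releasing the last user tears the runtime down; calling release more
// than once is a no-op.
func (l *lazyMount) acquire() (release func()) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.ensureRuntimeLocked()
	l.users++

	var once sync.Once
	return func() {
		once.Do(func() {
			l.mu.Lock()
			defer l.mu.Unlock()
			l.users--
			if l.users == 0 {
				l.mounted = false
				l.unmounts++
			}
		})
	}
}
```

In this model two overlapping users share one runtime, and a user arriving after the last release triggers a fresh mount.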
The mirror snapshot is not returned directly to the caller.
Instead, ClientFilesyncMirror.Snapshot(...) hands its MirrorSharedState to engine/filesync.FileSyncer, which:
So again, the mutable object is backing state, while the outward-facing result is an ordinary immutable directory/file snapshot.
noCache

host.directory / host.file do not bypass the mirror for noCache.
Instead, they set a filesync cache-buster in SnapshotOpts. That forces a fresh snapshot result while still using the existing mutable mirror as the synchronization base.
CacheVolume is the most direct mutable object in this group.
It stores:
The subtle but important point here is that we do not try to have one underlying ambient mutable volume and then vary parameters like source/owner/sharing outside the cache identity.
Instead, those parameters all contribute to the cache object identity upstream:
- cacheVolume(...) includes them directly
- PRIVATE sharing injects a nonce so it becomes unique

So different parameter combinations become different cache-volume objects with different owned mutable snapshots.
That makes the behavior much easier to reason about.
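The identity rule can be sketched in Go as below. This is an illustrative model only: `volumeParams` and `volumeIdentity` are hypothetical names, SHA-256 over the fields is an assumed derivation, and "SHARED" is used here as an assumed non-private sharing mode.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
)

// volumeParams is a hypothetical bundle of the inputs that feed the
// cache-volume identity.
type volumeParams struct {
	Key     string
	Source  string
	Owner   string
	Sharing string
}

// volumeIdentity derives an identity from every parameter. PRIVATE sharing
// mixes in a random nonce so each call yields a unique identity.
func volumeIdentity(p volumeParams) (string, error) {
	h := sha256.New()
	for _, field := range []string{p.Key, p.Source, p.Owner, p.Sharing} {
		h.Write([]byte(field))
		h.Write([]byte{0}) // separator so adjacent fields cannot run together
	}
	if p.Sharing == "PRIVATE" {
		nonce := make([]byte, 16)
		if _, err := rand.Read(nonce); err != nil {
			return "", err
		}
		h.Write(nonce)
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}
```

Because the owner and source are part of the identity, changing either one selects a different object with its own mutable snapshot instead of reconfiguring a shared one.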
InitializeSnapshot lazily creates the mutable ref when first needed.
If a source directory is configured, it:
If an owner is configured, it may first synthesize or chown a source directory before creating the mutable ref.
Unlike HTTP, remote git, and filesync mirrors, cache volumes are often consumed as the mutable thing itself.
Container exec paths call:
- InitializeSnapshot if needed
- getSnapshot()
- getSnapshotSelector()

and mount that mutable ref directly into the container as a cache mount.
So cache volume is the case where the mutable backing object is not merely internal machinery. It is the actual user-facing semantic object.
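To make the contrast with the other three objects concrete, this Go sketch models a volume whose mutable state is handed straight to each exec. The `cacheVolume` type, its methods, and the `runExec` helper are hypothetical, and a map stands in for the mounted ref; the sketch assumes execs run one at a time.

```go
package main

import "sync"

// cacheVolume is a hypothetical model of a volume with an optional seed
// source and a lazily created mutable snapshot.
type cacheVolume struct {
	mu       sync.Mutex
	source   map[string]string // optional seed content, never modified
	snapshot map[string]string // mutable state, nil until initialized
}

// initializeSnapshot creates the mutable snapshot on first use, seeding it
// from the source if one is configured. Later calls do nothing.
func (v *cacheVolume) initializeSnapshot() {
	v.mu.Lock()
	defer v.mu.Unlock()
	if v.snapshot != nil {
		return
	}
	v.snapshot = make(map[string]string, len(v.source))
	for path, content := range v.source {
		v.snapshot[path] = content
	}
}

// getSnapshot returns the mutable snapshot itself, not a copy.
func (v *cacheVolume) getSnapshot() map[string]string {
	v.mu.Lock()
	defer v.mu.Unlock()
	return v.snapshot
}

// runExec simulates a container exec that mounts the volume and lets the
// command read and write it directly.
func runExec(v *cacheVolume, command func(mount map[string]string)) {
	v.initializeSnapshot()
	command(v.getSnapshot())
}
```

Writes made by one exec are visible to the next, while the seed source stays untouched, which is the opposite of the copy-on-read behavior callers see from the mirror-backed objects.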
There are a few important differences between these cases:
- HTTPState
- RemoteGitMirror
- ClientFilesyncMirror
- CacheVolume

This is the main design point of the whole doc:
these objects are mutable enough that they would be awkward or dangerous if they lived outside the dagql cache model, but they are still regular dagql cache objects, so:
That is the reason we accept the extra complexity of modeling them explicitly instead of hiding them as ambient engine state.