skills/cache-expert/references/writingcoreapis.md
This is a practical guide for writing core, core/schema, and core/sdk
APIs in the current dagql cache model.
Almost everything here is covered in more depth elsewhere:
- cachebasics.md
- egraph.md
- cache_persistence.md
- cache_pruning.md
- session_resources.md
- lazy_evaluation.md
- dynamicinputs.md
- dagqltypes.md

The point of this doc is not to replace those. The point is to give you a workflow for the questions you usually need to answer when adding a new API.
Start from this assumption:
If that is true, dagql caching mostly just works.
The rest of this guide is about the cases where you need to opt into something more.
When writing a core API, these are the cache questions to walk through:
If you consciously answer those questions, you usually end up in a good place.
For most fields, the right answer is:
Only opt out or scope it differently when you have a real reason.
DoNotCache

If a field is client-, session-, schema-, or call-scoped, usually what you want
is not DoNotCache, but one of:

- WithInput(dagql.PerClientInput)
- WithInput(dagql.PerSessionInput)
- WithInput(dagql.PerCallInput)
- WithInput(dagql.CurrentSchemaInput)
- WithInput(dagql.RequestedCacheInput("noCache"))

That keeps the field in the cache model while making the identity as specific as it needs to be.
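To make the difference between "not cached" and "cached with a narrower identity" concrete, here is a minimal, self-contained sketch. It does not use dagql at all: cacheKey and the "client:A" scope strings are hypothetical stand-ins that only illustrate how an extra scoping input changes which calls share a cache entry.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// cacheKey is a toy stand-in for a call identity: the field name, its
// explicit arguments, and any extra scoping inputs, hashed together.
// Hypothetical helper; this is not how dagql computes digests.
func cacheKey(field string, args []string, scope ...string) string {
	parts := append([]string{field}, args...)
	parts = append(parts, scope...)
	sum := sha256.Sum256([]byte(strings.Join(parts, "\x00")))
	return hex.EncodeToString(sum[:])
}

func main() {
	args := []string{"name=foo"}

	// Unscoped: every caller shares one cache entry.
	fmt.Println(cacheKey("thing", args) == cacheKey("thing", args))

	// Client-scoped: the same client still hits, a different client misses.
	a1 := cacheKey("thing", args, "client:A")
	a2 := cacheKey("thing", args, "client:A")
	b := cacheKey("thing", args, "client:B")
	fmt.Println(a1 == a2, a1 == b)
}
```

The field stays cacheable in both cases; only the set of callers that can share a hit changes.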
DoNotCache

DoNotCache still exists, but it is a much narrower tool now than people often
assume.
Practical guidance:
- DoNotCache fields are usually a bad fit in practice because chaining and downstream caching get much harder to reason about
- Before using DoNotCache on an object field, first ask whether you really want scoped identity instead

There are also hard implementation limits:

- OnReleaser

So if the result needs cache-managed lifecycle or deferred materialization, it
is already a bad candidate for DoNotCache.
When a resolver creates a result with:

- dagql.NewResultForCurrentCall
- dagql.NewResultForCall
- dagql.NewObjectResultForCurrentCall
- dagql.NewObjectResultForCall

it is creating a detached result.
That is the normal starting point.
Detached results are good because:
You do not need to attach everything immediately.
Reach for cache.AttachResult(...) only when you genuinely need a cache-backed
result with a real result ID and normal lifecycle tracking.
Common reasons:
If you are just assembling an ordinary return value inside one resolver, detached is usually right.
The default identity is the structured ResultCall:
Often that is enough.
When it is not, these are the normal tools.
Use these when the call needs extra cache scoping or canonicalization.
Examples:
- currentTypeDefs

Read dynamicinputs.md if you need anything beyond the very simplest cases.
WithContentDigest

Use WithContentDigest when the result has a stable semantic content identity
that should override or augment recipe identity.
This is extremely common and extremely useful.
Examples:
The usual question is:
"If two different recipes produce the same semantic thing, do I want the cache to know they are equivalent by content?"
If yes, this is probably the tool.
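The question above can be illustrated with a toy model. The sketch below is not dagql code: result, identity, and digest are hypothetical names, and the model simply lets a content digest override the recipe identity (the real mechanism may augment rather than replace it).

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// result is a toy cached value with a recipe identity (how it was made)
// and an optional content identity (what it is). Hypothetical type.
type result struct {
	recipe        string
	contentDigest string
}

// identity prefers the content digest when one is set, else the recipe.
func (r result) identity() string {
	if r.contentDigest != "" {
		return r.contentDigest
	}
	return r.recipe
}

// digest hashes content into a printable identity string.
func digest(content string) string {
	sum := sha256.Sum256([]byte(content))
	return "sha256:" + hex.EncodeToString(sum[:])
}

func main() {
	// Two different recipes that happen to produce the same bytes.
	a := result{recipe: "recipe-one"}
	b := result{recipe: "recipe-two"}
	fmt.Println(a.identity() == b.identity()) // recipes differ

	a.contentDigest = digest("hello")
	b.contentDigest = digest("hello")
	fmt.Println(a.identity() == b.identity()) // content matches
}
```

Once both results carry the same content digest, anything keyed on their identity can be shared, even though the recipes never matched.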
WithSessionResourceHandle

Most APIs do not need to know about this.
This is only for building new session-resource-style objects, like secrets or sockets, where cache hits must become conditional on the caller having loaded the matching resource handle.
If you think you need this, stop and read session_resources.md.
If your object stores another cached object, store it as a result wrapper:
- dagql.ObjectResult[*Foo]
- dagql.Result[*Bar]

Do not casually unwrap everything to raw *Foo pointers if what you really
mean is "this object depends on that cache-backed object."
Why this matters:
This is especially important for graph-shaped metadata like typedefs, but it applies more broadly too.
This is one of the most important parts of writing correct APIs.
If your returned object refers to other results outside the normal call structure, the cache must be told about that.
Otherwise:
HasDependencyResults

This is the normal mechanism.
Implement AttachDependencyResults on your object if it embeds child results
that should be normalized onto attached/cache-backed results before lifecycle
bookkeeping and persistence.
Typical examples:
- Directory
- File
- Container
- Module
- GitRepository

Rule of thumb:

AddExplicitDependency

Use cache.AddExplicitDependency(...) when you need an extra retained edge after
attachment that is not naturally represented by your object's fields.
This is rarer.
Examples today include some SDK generation paths retaining loaded/generated module results.
Lazy evaluation is for cases where returning the object shell immediately is cheap, but fully materializing it right away is expensive or unnecessary.
If you need it:
- Lazy[...] implementation
- LazyState
- LazyAccessor for fields that should not be read directly before evaluation

Practical warning:

- cache.Evaluate(...) or go through the relevant LazyAccessor

See lazy_evaluation.md for the real shape.
This is the part people most often forget when writing more complex objects.
If the object owns snapshots or other cache-meaningful external state, you probably need some combination of:
- OnReleaser
- PersistedSnapshotRefLinks
- CacheUsageIdentities
- CacheUsageSize
- CacheUsageMayChange

OnReleaser

Implement this when the object must release owned resources when the cache drops the result.
Very common for snapshot-backed objects.
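As a rough illustration of the release-hook idea, the toy cache below calls a hook when it drops an entry. The releaser interface, its OnRelease(ctx) error signature, toyCache, and snapshotThing are all assumptions made for this sketch; check the real OnReleaser interface before copying the shape.

```go
package main

import (
	"context"
	"fmt"
)

// releaser is a toy release-hook interface (assumed signature).
type releaser interface {
	OnRelease(ctx context.Context) error
}

// snapshotThing is a hypothetical object that owns an external resource.
type snapshotThing struct {
	snapshotID string
	released   bool
}

// OnRelease frees the owned resource; here it only records that it ran.
func (s *snapshotThing) OnRelease(ctx context.Context) error {
	s.released = true
	return nil
}

// toyCache holds results by key and runs the hook when an entry is dropped.
type toyCache struct {
	entries map[string]any
}

// drop removes the entry and, if the value has a release hook, invokes it.
// A real cache would pass its own context; the toy uses Background.
func (c *toyCache) drop(key string) error {
	v, ok := c.entries[key]
	if !ok {
		return nil
	}
	delete(c.entries, key)
	if r, ok := v.(releaser); ok {
		return r.OnRelease(context.Background())
	}
	return nil
}

func main() {
	thing := &snapshotThing{snapshotID: "snap-1"}
	c := &toyCache{entries: map[string]any{"k": thing}}

	fmt.Println("released before drop:", thing.released)
	if err := c.drop("k"); err != nil {
		panic(err)
	}
	fmt.Println("released after drop:", thing.released)
}
```

The point of the pattern is that the object, not the caller, knows what it owns, so the cache only needs one uniform hook to clean up.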
PersistedSnapshotRefLinks

Implement this when persisted results need to record which snapshots they own.
Without this, persistence can encode the object payload but still miss the snapshot linkage.
CacheUsageIdentities / CacheUsageSize / CacheUsageMayChange

Implement these when the object's snapshot usage should participate correctly in cache accounting and pruning.
These matter for:
Rule of thumb:
Good reference patterns:
- Directory
- File
- Container
- HTTPState
- RemoteGitMirror
- ClientFilesyncMirror
- CacheVolume

This is a schema-field decision first.
On the field spec, IsPersistable() means the result is eligible for persistent
cache retention across engine restarts.
The decision is usually:
Do not persist tiny cheap things just because you can.
If a persistable field returns an object, that object usually needs:
- dagql.PersistedObject
- dagql.PersistedObjectDecoder

In practice that means implementing:

- EncodePersistedObject
- DecodePersistedObject

The basic pattern is usually JSON, with special handling where needed for:
If the object owns snapshots, persistence usually also needs
PersistedSnapshotRefLinks, as noted above.
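A minimal sketch of the "basic pattern is usually JSON" idea follows. It deliberately avoids the real interfaces: thing, persistedThing, encodeThing, and decodeThing are hypothetical, and the actual EncodePersistedObject / DecodePersistedObject signatures are not reproduced here. Snapshot linkage is also left out, since it is recorded separately.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// thing is a hypothetical in-memory object returned by a persistable field.
type thing struct {
	Name string
	Tags []string
}

// persistedThing is the stored payload. Keeping it separate from thing lets
// the in-memory shape change without silently changing the encoding.
type persistedThing struct {
	Name string   `json:"name"`
	Tags []string `json:"tags,omitempty"`
}

// encodeThing stands in for the encode half of the pair.
func encodeThing(t *thing) ([]byte, error) {
	return json.Marshal(persistedThing{Name: t.Name, Tags: t.Tags})
}

// decodeThing stands in for the decode half; it rebuilds the object from
// the payload and reports malformed input as an error.
func decodeThing(data []byte) (*thing, error) {
	var p persistedThing
	if err := json.Unmarshal(data, &p); err != nil {
		return nil, fmt.Errorf("decode persisted thing: %w", err)
	}
	return &thing{Name: p.Name, Tags: p.Tags}, nil
}

func main() {
	data, err := encodeThing(&thing{Name: "foo", Tags: []string{"a"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(data))

	back, err := decodeThing(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(back.Name, back.Tags)
}
```

Because persisted cache entries can always be dropped, a decode failure in this model is treated as an ordinary error rather than something to recover from.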
Persistable cache is still cache persistence, not robust application-state storage.
So:
If you need to look at how something was called, you can inspect call metadata.
Useful entry points:
- res.ResultCall()
- cache.ResultCallByResultID(...)
- cache.WalkResultCall(...)

This is useful for:
Do not overuse it when simpler structured fields on your object would do, but it is there when you need it.
Most APIs should stop before this point.
These are specialist tools.
TeachCallEquivalentToResult

Use this when you discover after execution that a call is semantically equivalent to an existing result and you want to publish that equivalence into the cache/e-graph.
Good example: teaching that a no-op Directory.without(...) is equivalent to
its parent directory.
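The no-op example can be modeled with a very small toy. The sketch below is an assumption-heavy illustration, not the e-graph: equivCache, lookup, and teach are hypothetical, and call identities are plain strings.

```go
package main

import "fmt"

// equivCache maps call identities to result IDs. Several calls may share
// one result once they are known to be equivalent. Hypothetical type.
type equivCache struct {
	resultByCall map[string]int
}

// lookup reports the result ID for a call, if the cache knows one.
func (c *equivCache) lookup(call string) (int, bool) {
	id, ok := c.resultByCall[call]
	return id, ok
}

// teach records that call is equivalent to the existing result id, so
// later lookups of that call hit without re-executing anything.
func (c *equivCache) teach(call string, id int) {
	c.resultByCall[call] = id
}

func main() {
	c := &equivCache{resultByCall: map[string]int{"dir": 1}}
	noop := "dir.without(missing)"

	_, hit := c.lookup(noop)
	fmt.Println("hit before teaching:", hit)

	// After running the call we notice nothing was removed, so the output
	// is the parent directory itself. Publish that fact.
	parentID, _ := c.lookup("dir")
	c.teach(noop, parentID)

	id, hit := c.lookup(noop)
	fmt.Println("hit after teaching:", hit, "result:", id)
}
```

The equivalence is only learned after execution, which is why this is a separate step rather than part of the call's initial identity.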
MakeResultUnpruneable

Use this only when a result truly should live for the life of the engine.
The main example today is core typedef retention.
GetOrInitArbitrary

Use this when you need in-memory cached arbitrary values that are not dagql DAG results at all.
This is not the normal path for graph objects.
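For intuition, a get-or-init cache for arbitrary in-memory values can look like the sketch below. This is a generic pattern, not the real GetOrInitArbitrary: arbitraryCache, newArbitraryCache, and getOrInit are hypothetical, and the toy holds one lock across initialization for simplicity.

```go
package main

import (
	"fmt"
	"sync"
)

// arbitraryCache stores plain in-memory values by key. Hypothetical type.
type arbitraryCache struct {
	mu     sync.Mutex
	values map[string]any
}

func newArbitraryCache() *arbitraryCache {
	return &arbitraryCache{values: map[string]any{}}
}

// getOrInit returns the cached value for key, or runs initFn once to
// create it. Failed initializations are not cached, so they can be retried.
func (c *arbitraryCache) getOrInit(key string, initFn func() (any, error)) (any, error) {
	c.mu.Lock()
	defer c.mu.Unlock()

	if v, ok := c.values[key]; ok {
		return v, nil
	}
	v, err := initFn()
	if err != nil {
		return nil, err
	}
	c.values[key] = v
	return v, nil
}

func main() {
	c := newArbitraryCache()
	calls := 0
	initFn := func() (any, error) {
		calls++
		return "expensive value", nil
	}

	v1, _ := c.getOrInit("k", initFn)
	v2, _ := c.getOrInit("k", initFn)
	fmt.Println(v1, v2, "init calls:", calls)
}
```

Values stored this way have no call identity or dependency edges, which is why this path does not suit graph objects.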
This is the 80% case.
You have a deterministic field that returns an object and maybe wants a content digest.
Sketch:
```go
func (s *thingSchema) thing(
	ctx context.Context,
	parent dagql.ObjectResult[*core.Query],
	args thingArgs,
) (inst dagql.ObjectResult[*core.Thing], err error) {
	srv, err := core.CurrentDagqlServer(ctx)
	if err != nil {
		return inst, err
	}

	obj := &core.Thing{
		Name: args.Name,
	}

	// Create a detached result for the current call.
	inst, err = dagql.NewObjectResultForCurrentCall(ctx, srv, obj)
	if err != nil {
		return inst, err
	}

	// Attach a content digest derived from the semantic inputs.
	dgst := hashutil.HashStrings(args.Name)
	return inst.WithContentDigest(ctx, dgst)
}
```
What is happening here:
- GetOrInitCall

That is the happy path.
This is the richer pattern.
Suppose your object owns a snapshot and is worth persisting.
You probably need:
- dagql.NodeFunc(...).IsPersistable()
- OnRelease
- EncodePersistedObject
- DecodePersistedObject
- PersistedSnapshotRefLinks
- CacheUsageIdentities
- CacheUsageSize
- CacheUsageMayChange if appropriate
- HasDependencyResults if the object embeds child results too

In practice, the best reference patterns are not synthetic examples, but real objects:

- Directory / File
  - good for immutable snapshot-backed object patterns
- HTTPState
  - good for "mutable internal backing state, immutable outward-facing result"
- CacheVolume
  - good for "user-facing mutable snapshot object"
- ClientFilesyncMirror / RemoteGitMirror
  - good for "mutable backing object that powers separate immutable outputs"

When in doubt, copy one of those shapes rather than inventing a new pattern.
- DoNotCache.
- HasDependencyResults.

When you need more detail, jump out from here like this:

- cachebasics.md
- dynamicinputs.md
- lazy_evaluation.md
- cache_persistence.md
- cache_pruning.md
- session_resources.md
- egraph.md

When writing a core API, try to keep this sentence in your head:
"What is the semantic identity of this result, what other results/resources does it own, and what lifecycle hooks does the cache need in order to manage it correctly?"
If you answer that directly in the code, the rest usually falls into place.