docs/rfcs/2026-03-16-flow-inc-query.md
This RFC proposes a correctness-first incremental query mode for Flow batching.
Flow queries can read only seq > checkpoint and advance checkpoints using per-region correctness watermarks.
When incremental reads are stale or correctness cannot be proven, Flow falls back to full recomputation.
Flow batching currently recomputes old data in the same time window on every run, so an incremental query that reads only newer rows can improve Flow performance.
seq > given_seq) for Flow.

Introduce three QueryContext extension keys:

- `flow.incremental_after_seqs`
- `flow.incremental_mode`
- `flow.return_region_seq`

These options are opt-in and only affect Flow incremental execution paths.
When incremental mode is enabled:
- `after_seq` to `memtable_min_sequence` (exclusive lower bound)
- `memtable_max_sequence`)

Important limitation in v1:

- `seq > checkpoint` must not assume precise incremental pruning across memtable->SST flush boundaries

If required incremental parameters are missing or invalid, return an argument error.
Add a dedicated stale error:

- `IncrementalQueryStale { region_id, given_seq, min_readable_seq }`

Behavior:

- If `given_seq < min_readable_seq`, return the stale error.
- If `given_seq == min_readable_seq`, the query is valid and reads `seq > given_seq`.
- If `given_seq > min_readable_seq`, the query is also valid and reads `seq > given_seq`.

`IncrementalQueryStale` also covers the case where rows newer than the checkpoint have crossed a memtable->SST flush boundary and sequence-precise incremental exclusion can no longer be proven.
In other words, the flush-boundary case is not a separate fallback category in v1; it is one concrete way an incremental cursor becomes stale.
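The three comparison cases above can be sketched in a few lines of Rust. This is only an illustration under stated assumptions: the struct mirrors the field names given in this RFC, while `check_cursor` and `row_is_visible` are hypothetical helper names, not existing functions in the codebase.

```rust
/// Illustrative mirror of the stale error described in this RFC.
/// Field names follow the RFC text; the concrete type is an assumption.
#[derive(Debug, PartialEq, Eq)]
pub struct IncrementalQueryStale {
    pub region_id: u64,
    pub given_seq: u64,
    pub min_readable_seq: u64,
}

/// Hypothetical helper: validates an incremental cursor for one region.
/// On success it returns the exclusive lower bound to scan from.
pub fn check_cursor(
    region_id: u64,
    given_seq: u64,
    min_readable_seq: u64,
) -> Result<u64, IncrementalQueryStale> {
    // Only a cursor strictly older than the readable minimum is stale.
    if given_seq < min_readable_seq {
        return Err(IncrementalQueryStale {
            region_id,
            given_seq,
            min_readable_seq,
        });
    }
    // given_seq == min_readable_seq and given_seq > min_readable_seq
    // are both valid; the scan reads seq > given_seq.
    Ok(given_seq)
}

/// Hypothetical helper: the lower bound is exclusive, so a row whose
/// sequence equals the cursor is NOT returned again.
pub fn row_is_visible(row_seq: u64, given_seq: u64) -> bool {
    row_seq > given_seq
}
```

Keeping the bound check and the row filter as two separate functions makes the `<` versus `<=` choice explicit in exactly one place each, which is where off-by-one mistakes would otherwise hide.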
Extend query metrics with an optional per-region watermark map:

- `region_latest_sequences: Vec<(region_id: u64, latest_sequence: u64)>`

Rules:
Checkpoint and watermark state are kept only in flownode memory in v1; they are not persisted as durable flow metadata. Cold start or flownode restart therefore always re-enters through a full snapshot read. Only after that full query succeeds with a complete correctness watermark may Flow switch back to incremental mode.
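A minimal sketch of how an in-memory checkpoint might be advanced only from a complete watermark is shown below. It assumes that "complete" means every region the flow reads appears in `region_latest_sequences`; the `Checkpoint` alias, the `try_advance` function, and the rule that a checkpoint never moves backwards are all illustrative assumptions rather than part of this RFC.

```rust
use std::collections::HashMap;

/// Assumed in-memory checkpoint: region_id -> last sequence already consumed.
pub type Checkpoint = HashMap<u64, u64>;

/// Hypothetical helper: advances the checkpoint only when the returned
/// watermark covers every expected region. Returns true if it advanced.
pub fn try_advance(
    checkpoint: &mut Checkpoint,
    expected_regions: &[u64],
    region_latest_sequences: Option<&[(u64, u64)]>,
) -> bool {
    // No watermark returned: correctness cannot be proven, do not advance.
    let Some(pairs) = region_latest_sequences else {
        return false;
    };
    let watermark: HashMap<u64, u64> = pairs.iter().copied().collect();

    // A partial watermark is treated as unprovable; leave the state untouched.
    if !expected_regions.iter().all(|r| watermark.contains_key(r)) {
        return false;
    }

    for region in expected_regions {
        let latest = watermark[region];
        let entry = checkpoint.entry(*region).or_insert(latest);
        // Assumption: a checkpoint never moves backwards.
        *entry = (*entry).max(latest);
    }
    true
}
```

Because the map lives only in memory, a restarted flownode begins with an empty `Checkpoint`, which matches the requirement that a cold start re-enters through a full snapshot read.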
Flow starts in full mode, then transitions:
```mermaid
stateDiagram-v2
    [*] --> FullSnapshot: Flow starts
    state FullSnapshot {
        [*] --> RunFull
        RunFull --> RunFull: Full query succeeds but watermark is unprovable (no region_latest_sequences returned)
    }
    FullSnapshot --> Incremental: Full query succeeds and correctness watermark is returned (checkpoint updated)
    state Incremental {
        [*] --> RunInc
        RunInc --> RunInc: Incremental succeeds (checkpoint advances)
    }
    Incremental --> FullSnapshot: IncrementalQueryStale (cursor too old, fallback required)
    Incremental --> FullSnapshot: Incremental fails and fallback policy is triggered
    FullSnapshot --> [*]: Flow stops
    Incremental --> [*]: Flow stops
```
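The transitions in the diagram can also be written as a small pure function, which may be easier to unit-test than the diagram is to read. The sketch below is an assumption-laden illustration: `Mode`, `Outcome`, and `next_mode` are hypothetical names, and the "Flow stops" edges are omitted because stopping is driven from outside the query loop.

```rust
/// The two execution modes from the state diagram.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    FullSnapshot,
    Incremental,
}

/// Hypothetical summary of what one query run produced.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    /// Full query succeeded and returned a correctness watermark.
    FullOkWithWatermark,
    /// Full query succeeded but no region_latest_sequences came back.
    FullOkNoWatermark,
    /// Incremental query succeeded; the checkpoint advances.
    IncrementalOk,
    /// IncrementalQueryStale was returned.
    IncrementalStale,
    /// Incremental query failed and the fallback policy was triggered.
    IncrementalFailedFallback,
}

/// Computes the mode for the next run from the current mode and outcome.
pub fn next_mode(mode: Mode, outcome: Outcome) -> Mode {
    match (mode, outcome) {
        // Only a proven watermark unlocks incremental mode.
        (Mode::FullSnapshot, Outcome::FullOkWithWatermark) => Mode::Incremental,
        // Anything else keeps Flow in full mode.
        (Mode::FullSnapshot, _) => Mode::FullSnapshot,
        // Incremental mode continues only while runs succeed.
        (Mode::Incremental, Outcome::IncrementalOk) => Mode::Incremental,
        // Stale cursor, triggered fallback, or an unexpected outcome:
        // conservatively return to a full snapshot.
        (Mode::Incremental, _) => Mode::FullSnapshot,
    }
}
```

Defaulting every unrecognized combination to `FullSnapshot` reflects the correctness-first stance of this RFC: when in doubt, recompute.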
Fallback to full mode is deterministic and is triggered by any of the following:
- `IncrementalQueryStale` is returned.

Policy behavior:
The v1 design is intentionally correctness-first and keeps the progress cursor lightweight:
- `seq > checkpoint` alone to prove precise incremental exclusion, because SST lacks detailed row-level sequence metadata.
- `snapshot_seqs` transport and `flow.*` options must both be carried correctly.
- `snapshot_seqs` means the per-region snapshot upper-bound map: `region_id -> sequence`.
- `<` vs `<=`) may cause correctness bugs.

This plan enables a practical, correctness-first incremental path for Flow batching. It reuses existing sequence scan capability, adds strict stale handling, and advances checkpoints only from correctness-proven per-region watermarks.