docs/rfcs/remote-dyn-filter-rfc.md
This RFC proposes a remote dynamic filter propagation mechanism for distributed queries. It lets frontend-produced dynamic filters reach remote datanode scans through a lightweight control plane, while preserving one rule: remote dynamic filters are an optimization only, never a correctness dependency.
Today, dynamic filters can improve local execution, but they do not automatically propagate to remote datanode scans in distributed queries. As a result, the frontend may already know that a probe-side scan can be narrowed, while the remote scan still runs with a weaker predicate and loses pruning opportunities.
We want a minimal design that:
The high-level flow is:
MergeScanExec identifies the remote subscribers, generates a stable filter_id, and registers the alive filter into a query-scoped registry.wait_update or generation changes.The logical identity of a remote dynamic filter is query_id + filter_id.
Region and scan metadata are routing information, not part of filter-state identity. filter_id only needs to be stable and unique within a single query.
The current recommendation is to derive filter_id from:
region_idproducer-local ordinalcanonicalized children fingerprintThe following should not be included:
partitionThis design reuses the existing region unary gRPC path:
RegionRequest.body.remote_dyn_filterRemoteDynFilterRequest.oneof action
updateunregisterThe initial remote read is responsible for register and scan setup. The unary RPC path is only for later update and unregister messages.
The frontend uses a query-engine runtime map:
src/query/src/dist_plan/remote_dyn_filter_registry.rsquery_id -> Arc<RemoteDynFilterRegistry>This registry should not live on a single MergeScanExec, and it should not be stored in QueryContext.mutable_session_data. It is a query execution runtime object that owns watcher tasks, cleanup tail, and fanout state.
The registry lifecycle has three states:
Active: accepts registrations and sends updatesClosing: query ended; stop new registrations, send final cleanup messages, drain in-flight RPCsClosed: watchers stopped, state removable from the runtime mapThe registry may outlive the main query execution briefly for cleanup, but it is not intended to be a long-lived global object.
Remote dynamic filters should remain a selective optimization, not an automatic fanout for every filter update.
The frontend may skip remote propagation when the encoded filter becomes too large, fanout cost is too high, or the expected pruning benefit is too small. In those cases, execution should continue with local-only dynamic filtering semantics.
On the frontend:
MergeScanExec bridges producers to remote subscribers,On the datanode:
update and unregister,query_id + filter_id,All failures must degrade safely:
MergeScanExecRejected because lifecycle and cleanup would become fragmented across multiple bridge or exec instances in the same query.
QueryContext.mutable_session_dataRejected because this is the wrong ownership model. The registry is not session metadata; it is a query runtime object with watcher tasks and cleanup behavior.
Rejected for now because it is heavier than necessary. A query-engine runtime map is sufficient for the current design.
is_complete and final unregister?IN in later work?