roadmap/implementers-guide/src/node/backing/statement-distribution.md
This subsystem is responsible for distributing signed statements that we have generated and for forwarding statements generated by our peers. Received candidate receipts and statements are passed to the Candidate Backing subsystem to handle producing local statements. On receiving `StatementDistributionMessage::Share`, this subsystem distributes the message across the network with redundancy to ensure a fast backing process.
Goal: every well-connected node is aware of every next potential parachain block.
Validators must have statements, candidates, and persisted validation data from all other validators. This is because we need to store statements from validators who've checked the candidate on the relay chain, so we know who to hold accountable in case of disputes. Any validator can be selected as the next relay-chain block author, and this is not revealed in advance for security reasons. As a result, all validators must have an up-to-date view of all possible parachain candidates, plus backing statements, that could be placed on-chain in the next block.
This blog post puts it another way: "Validators who aren't assigned to the parachain still listen for the attestations [statements] because whichever validator ends up being the author of the relay-chain block needs to bundle up attested parachain blocks for several parachains and place them into the relay-chain block."
Backing-group quorum (that is, enough backing group votes) must be reached before the block author will consider the candidate. Therefore, validators need to consider all seconded candidates within their own group, because that's what they're assigned to work on. Validators only need to consider backable candidates from other groups. This informs the design of the statement distribution protocol to have separate phases for in-group and out-group distribution, respectively called "cluster" and "grid" mode (see below).
Asynchronous backing changes the runtime to accept parachain candidates from a certain allowed range of historic relay-parents. These candidates must be backed by the group assigned to the parachain as-of their corresponding relay parents.
To address the concern of dealing with large numbers of spam candidates or
statements, the overall design approach is to combine a focused "clustering"
protocol for legitimate fresh candidates with a broad-distribution "grid"
protocol to quickly get backed candidates into the hands of many validators.
Validators do not eagerly send each other heavy `CommittedCandidateReceipt`s, but instead request these lazily through request/response protocols.
A high-level description of the protocol follows:
- Nodes can send each other a few kinds of messages: `Statement`, `BackedCandidateManifest`, `BackedCandidateAcknowledgement`.
  - `Statement` messages contain only a signed compact statement, without full candidate info.
  - `BackedCandidateManifest` messages advertise a description of a backed candidate and stored statements.
  - `BackedCandidateAcknowledgement` messages acknowledge that a backed candidate is fully known.
- Nodes can request the full `CommittedCandidateReceipt` and `PersistedValidationData`, along with statements, over a request/response protocol. This is the `AttestedCandidateRequest`; the response is `AttestedCandidateResponse`.
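These message kinds can be pictured, very roughly, as a Rust enum. This is a sketch with placeholder fields and a hypothetical `WireMessage` name, not the real `polkadot-node-network-protocol` definitions:

```rust
// Illustrative sketch only: the real protocol types carry signed compact
// statements, statement filters, and group/session metadata.
type CandidateHash = [u8; 32];

#[allow(dead_code)]
enum WireMessage {
    // A signed compact statement, without full candidate info.
    Statement { candidate_hash: CandidateHash, signature: Vec<u8> },
    // Advertises a description of a backed candidate and stored statements.
    BackedCandidateManifest { candidate_hash: CandidateHash, statement_knowledge: Vec<bool> },
    // Acknowledges that a backed candidate is fully known.
    BackedCandidateAcknowledgement { candidate_hash: CandidateHash, statement_knowledge: Vec<bool> },
}

fn main() {
    let msg = WireMessage::Statement { candidate_hash: [0u8; 32], signature: vec![] };
    // Statements are the only kind that carries a signature directly.
    let is_statement = matches!(msg, WireMessage::Statement { .. });
    println!("is_statement = {is_statement}");
}
```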
The prospective parachains subsystem maintains prospective "fragment trees" which can be used to determine whether a particular parachain candidate could possibly be included in the future. Candidates which either are within a fragment tree or would be part of a fragment tree if accepted are said to be in the "hypothetical frontier".
The statement-distribution subsystem keeps track of all candidates, and updates its knowledge of the hypothetical frontier based on events such as new relay parents, new confirmed candidates, and newly backed candidates.
We only consider statements as "importable" when the corresponding candidate is part of the hypothetical frontier, and only send "importable" statements to the backing subsystem itself.
In cluster mode:

- Validators send each other `Statement` messages only for candidates within their own group and based on the relevant relay-parent.
- `Seconded` statements must be sent before `Valid` statements.
- `Seconded` statements may only be sent to other members of the group when the candidate is fully known by the local validator. "Fully known" means the validator possesses the underlying `CommittedCandidateReceipt` and `PersistedValidationData`, which it receives on request from other validators or from a collator.
- Sending a statement (a `CompactStatement` carrying nothing but a hash and signature) to the cluster is also a signal that the sending node is available to request the candidate from.
- There is a limit on the number of `Seconded` statements originating from a validator V which each validator in a cluster may send to others. This bounds the number of candidates.

In grid mode:

- Nodes send a `BackedCandidateManifest` to their "receiving" nodes.
- Receiving nodes reply with a `BackedCandidateAcknowledgement`.
- Afterwards, nodes can send `Statement` messages directly to each other for any new statements they might need.
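The in-cluster acceptance rules can be sketched as a toy tracker. The `ClusterTracker` and `Statement` types below are hypothetical and assume a fixed per-originator seconding limit; the real checks live in the cluster module:

```rust
use std::collections::{HashMap, HashSet};

type ValidatorIndex = u16;
type CandidateHash = u64;

enum Statement {
    Seconded(ValidatorIndex, CandidateHash),
    Valid(ValidatorIndex, CandidateHash),
}

// Toy cluster-side acceptance: `Seconded` must precede `Valid` for a
// candidate, and each originating validator may introduce at most
// `seconding_limit` distinct candidates.
struct ClusterTracker {
    seconding_limit: usize,
    seconded_by: HashMap<ValidatorIndex, HashSet<CandidateHash>>,
    seconded_candidates: HashSet<CandidateHash>,
}

impl ClusterTracker {
    fn new(seconding_limit: usize) -> Self {
        Self { seconding_limit, seconded_by: HashMap::new(), seconded_candidates: HashSet::new() }
    }

    fn accept(&mut self, statement: Statement) -> bool {
        match statement {
            Statement::Seconded(origin, hash) => {
                let seconded = self.seconded_by.entry(origin).or_default();
                // Reject new candidates once the originator hits the limit.
                if seconded.len() >= self.seconding_limit && !seconded.contains(&hash) {
                    return false;
                }
                seconded.insert(hash);
                self.seconded_candidates.insert(hash);
                true
            }
            // `Valid` is only accepted after some `Seconded` for the candidate.
            Statement::Valid(_, hash) => self.seconded_candidates.contains(&hash),
        }
    }
}

fn main() {
    let mut tracker = ClusterTracker::new(1);
    assert!(!tracker.accept(Statement::Valid(1, 7))); // Valid before Seconded
    assert!(tracker.accept(Statement::Seconded(0, 7)));
    assert!(tracker.accept(Statement::Valid(1, 7)));
    assert!(!tracker.accept(Statement::Seconded(0, 8))); // over the limit
    println!("ok");
}
```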
Incoming messages:

- `ActiveLeaves`
- `StatementDistributionMessage::Share`
- `StatementDistributionMessage::Backed`
- `StatementDistributionMessage::NetworkBridgeUpdate`
  - `Statement`
  - `BackedCandidateManifest`
  - `BackedCandidateKnown`

Outgoing messages:

- `NetworkBridgeTxMessage::SendValidationMessages`
- `NetworkBridgeTxMessage::SendValidationMessage`
- `NetworkBridgeTxMessage::ReportPeer`
- `CandidateBackingMessage::Statement`
- `ProspectiveParachainsMessage::GetHypotheticalFrontier`
- `NetworkBridgeTxMessage::SendRequests`
We also have a request/response protocol because validators do not eagerly send each other heavy `CommittedCandidateReceipt`s, but instead need to request these lazily.
The flow of requests is as follows:

1. Requesting Validator
   - Requests are queued up with `RequestManager::get_or_insert`.
   - `RequestManager::dispatch_requests` sends any queued-up requests, driving `RequestManager::next_request` to completion.
   - Each dispatched request creates an `OutgoingRequest` and saves the receiver in `RequestManager::pending_responses`.
2. Peer
   - Requests come in on the peer via the `IncomingRequestReceiver`.
   - Incoming requests are fed to `answer_request` through `MuxedMessage`.
   - `answer_request` on the peer takes the request and sends a response.
3. Requesting Validator
   - `receive_response` on the original validator yields a response.
   - Responses are awaited via `RequestManager::await_incoming`, which awaits pending responses in an unordered fashion, and arrive through the `MuxedMessage` receiver.
   - `handle_response` handles the response.

The functions involved:

- `dispatch_requests`
- `answer_request`: answers an incoming `AttestedCandidateRequest`.
- `receive_response`: yields an `UnhandledResponse`.
- `handle_response`: handles the `UnhandledResponse`.
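The queue/dispatch/response flow can be sketched with a toy manager. The types and signatures below are simplified placeholders, not the real `RequestManager` API:

```rust
use std::collections::{HashMap, HashSet, VecDeque};

// Stand-in type; the real requests are keyed by candidate identity.
type CandidateHash = u64;

// Toy sketch of the requesting side of the flow described above.
#[derive(Default)]
struct RequestManager {
    queued: VecDeque<CandidateHash>,
    known: HashSet<CandidateHash>,
    pending_responses: HashMap<CandidateHash, &'static str>,
}

impl RequestManager {
    // Mirrors `get_or_insert`: queue a request at most once per candidate.
    fn get_or_insert(&mut self, hash: CandidateHash) {
        if self.known.insert(hash) {
            self.queued.push_back(hash);
        }
    }

    // Mirrors `dispatch_requests`: send queued requests, saving a receiver
    // for each in `pending_responses`. Returns how many were sent.
    fn dispatch_requests(&mut self) -> usize {
        let mut sent = 0;
        while let Some(hash) = self.queued.pop_front() {
            self.pending_responses.insert(hash, "receiver");
            sent += 1;
        }
        sent
    }

    // Mirrors `handle_response`: resolve a pending response, if any.
    fn handle_response(&mut self, hash: CandidateHash) -> bool {
        self.pending_responses.remove(&hash).is_some()
    }
}

fn main() {
    let mut mgr = RequestManager::default();
    mgr.get_or_insert(0xaa);
    mgr.get_or_insert(0xaa); // deduplicated
    mgr.get_or_insert(0xbb);
    assert_eq!(mgr.dispatch_requests(), 2);
    assert!(mgr.handle_response(0xaa));
    assert!(!mgr.handle_response(0xcc)); // never requested
    println!("ok");
}
```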
A manifest is a message about a known backed candidate, along with a description of the statements backing it. It can be one of two kinds:

- `Full`: Contains information about the candidate and should be sent to peers who may not have the candidate yet. This is also called an `Announcement`.
- `Acknowledgement`: Omits information implicit in the candidate, and should be sent to peers which are guaranteed to have the candidate already.

Manifest exchange is when a receiving node receives a `Full` manifest and replies with an `Acknowledgement`. It indicates that both nodes know the candidate as valid and backed. This allows the nodes to send `Statement` messages directly to each other for any new statements.
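The reply rule for manifest exchange can be sketched as follows; the types and the `reply_to_manifest` helper are illustrative, not the real grid-module code:

```rust
#[derive(Debug, PartialEq)]
enum ManifestKind {
    Full,
    Acknowledgement,
}

// A `Full` manifest for a candidate we already know as backed is answered
// with an `Acknowledgement`; otherwise there is nothing to reply with yet
// (e.g. we must first request the candidate data).
fn reply_to_manifest(incoming: ManifestKind, candidate_known_backed: bool) -> Option<ManifestKind> {
    match incoming {
        ManifestKind::Full if candidate_known_backed => Some(ManifestKind::Acknowledgement),
        _ => None,
    }
}

fn main() {
    assert_eq!(
        reply_to_manifest(ManifestKind::Full, true),
        Some(ManifestKind::Acknowledgement)
    );
    // An acknowledgement itself needs no further answer.
    assert_eq!(reply_to_manifest(ManifestKind::Acknowledgement, true), None);
    println!("ok");
}
```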
Why? This limits the number of statements we would have to deal with w.r.t. candidates that don't really exist. Limiting out-of-group statement distribution between peers to only candidates that both peers agree are backed and exist ensures we only have to store statements about real candidates.

In practice, manifest exchange means that one of three things has happened:
Concerning the last case, note that it is possible for two nodes to have each other in their sending set. Consider:
```
1 2
3 4
```
If validators 2 and 4 are in group B, then there is a path 2->1->3 and
4->3->1. Therefore, 1 and 3 might send each other manifests for the same
candidate at the same time, without having seen the other's yet. This also
counts as a manifest exchange, but is only allowed to occur in this way.
After the exchange is complete, we update pending statements. Pending statements are those we know locally that the remote node does not.
Nodes should send a `BackedCandidateAcknowledgement(CandidateHash, StatementFilter)` notification to any peer which has sent a manifest when the candidate has been acquired by other means. This keeps alternative paths through the topology open, which allows nodes to receive additional statements that come later, but not after the candidate has been posted on-chain.
This is mostly about a runtime limitation: block authors have no way to post statements that arrive after the parablock is posted on-chain while still ensuring those validators get rewarded. Technically, we only need enough statements to back the candidate, and the manifest + request will provide that. But more statements might come shortly afterwards, and we want those to end up on-chain as well, to ensure all validators in the group are rewarded.
For clarity, here is the full timeline:
The cluster module provides direct distribution of unbacked candidates within a group. By utilizing this initial phase of propagating only within clusters/groups, we bound the number of `Seconded` messages per validator per relay-parent, helping us prevent spam. Validators can try to circumvent this, but they would only consume a few KB of memory and it is trivially slashable on chain.
The cluster module determines whether to accept/reject messages from other validators in the same group. It keeps track of what we have sent to other validators in the group, and pending statements. For the full protocol, see "Protocol".
The grid module provides distribution of backed candidates and late statements outside the backing group. For the full protocol, see the "Protocol" section.
For distributing outside our cluster (aka backing group) we use a 2D grid topology. This limits the number of peers we send messages to, and handles view updates.
The basic operation of the grid topology is that validators are arranged into a matrix, in row-major order, and each validator communicates directly with the peers that share its row or column. This grid approach defines 2 unique paths for every validator to reach every other validator in at most 2 hops, providing redundancy.
Propagation follows these rules:
For size 11, the matrix would be:

```
0 1 2
3 4 5
6 7 8
9 10
```
e.g. for index 10, the neighbors would be 1, 4, 7, 9 -- these are the nodes we could directly communicate with (e.g. either send to or receive from).
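The neighbor computation can be sketched as follows, assuming the row-major layout with width `floor(sqrt(n))` shown above; `grid_neighbors` is a hypothetical helper, not the real grid-module implementation:

```rust
// Sketch: the grid neighbors of a validator are all validators sharing its
// row or column in the row-major matrix. Assumes idx < n_validators and
// width = floor(sqrt(n_validators)), matching the size-11 example above.
fn grid_neighbors(idx: usize, n_validators: usize) -> Vec<usize> {
    let width = (n_validators as f64).sqrt().floor() as usize;
    let (row, col) = (idx / width, idx % width);
    (0..n_validators)
        .filter(|&other| other != idx)
        .filter(|&other| other / width == row || other % width == col)
        .collect()
}

fn main() {
    // Matches the worked example: index 10 can talk to 1, 4, 7, and 9.
    assert_eq!(grid_neighbors(10, 11), vec![1, 4, 7, 9]);
    println!("{:?}", grid_neighbors(10, 11));
}
```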
Now, which of these neighbors can 10 receive from? Recall that the sending/receiving sets for 10 would be different for different groups. Here are some hypothetical scenarios:
The seconding limit is a per-validator limit. Before asynchronous backing, we
had a rule that every validator was only allowed to second one candidate per
relay parent. With asynchronous backing, we have a 'maximum depth' which makes
it possible to second multiple candidates per relay parent. The seconding limit
is set to max depth + 1 to set an upper bound on candidates entering the
system.
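A minimal sketch of this bound, with hypothetical helper names:

```rust
// The seconding limit bounds how many candidates a single validator may
// introduce per relay parent: one at each depth 0..=max_depth.
fn seconding_limit(max_depth: usize) -> usize {
    max_depth + 1
}

// Would accepting one more seconded candidate from this validator stay
// within the limit?
fn can_second_another(already_seconded: usize, max_depth: usize) -> bool {
    already_seconded < seconding_limit(max_depth)
}

fn main() {
    // Before asynchronous backing, the effective max depth was 0:
    // one candidate per validator per relay parent.
    assert_eq!(seconding_limit(0), 1);
    assert!(can_second_another(3, 3));
    assert!(!can_second_another(4, 3));
    println!("ok");
}
```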
The candidates module provides a tracker for all known candidates in the view, whether they are confirmed or not, and how peers have advertised the candidates. What is a confirmed candidate? It is a candidate for which we have the full receipt and the persisted validation data. This module gets confirmed candidates from two sources:

- When a requested candidate's response validates successfully (see `UnhandledResponse::validate_response`), the candidate is marked as confirmed.

The requests module provides a manager for pending requests for candidate data, as well as pending responses. See "Request/Response Protocol" for a high-level description of the flow. See module-docs for full details.
Manifests come in two kinds, `Acknowledgement` and `Announcement`. See the "Manifests" section.