doc/chain/chain_sync.md
We describe here the different methods used by a new node when joining the network to catch up with the latest chain state. We start by reminding the reader of the following assumptions, which are all characteristics of Grin or Mimblewimble:
We purposefully focus only on major node types and high-level algorithms that may impact the security model. Detailed heuristics that can provide additional improvements (like header first), while useful, are not covered in this section.
This model is the one used by "full nodes" on most major public blockchains. The new node has prior knowledge of the genesis block. It connects to other peers in the network and starts asking for blocks until it reaches the latest block known to its peers.
The security model here is similar to Bitcoin's. We're able to verify the whole chain, the total work, the validity of each block, their full content, etc. In addition, with Mimblewimble and full UTXO set commitments, even more integrity validation can be performed.
We do not try to do any space or bandwidth optimization in this mode (for example, once validated, the range proofs could possibly be deleted). The point here is to provide history archival and allow later checks and verifications to be made.
Identical to other blockchains:
In this model we try to optimize for very fast syncing while sacrificing as few security assumptions as possible. As a matter of fact, the security model is almost identical to that of a full node, despite requiring orders of magnitude less data to download.
A new node is pre-configured with a horizon Z, which is a distance in number of
blocks from the head. For example, if horizon Z=5000 and the head is at height
H=23000, the block at horizon is the block at height h=18000 on the most
worked chain.
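
As a minimal sketch of the arithmetic above, the horizon height is simply h = H - Z. The function name and the clamping at genesis (height 0) are assumptions made for illustration, not something the text specifies.

```python
def horizon_height(head_height: int, horizon: int) -> int:
    """Height of the block at the horizon: h = H - Z.

    Assumption: if the chain is shorter than the horizon, fall back to
    genesis (height 0) rather than returning a negative height.
    """
    if head_height < 0 or horizon < 0:
        raise ValueError("head height and horizon must be non-negative")
    return max(head_height - horizon, 0)


# The example from the text: Z = 5000 and H = 23000 give h = 18000.
print(horizon_height(23000, 5000))  # 18000
```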
The new node also has prior knowledge of the genesis block. It connects to other
peers and learns about the head of the most worked chain. It asks for the block
header at the horizon block, requiring peer agreement. If consensus is not reached
at h = H - Z, the node gradually increases the horizon Z, moving h backward
until consensus is reached. Then it gets the full UTXO set at the horizon block.
With this information it can verify:
Once the validation is done, the node can download and validate the block contents from the horizon up to the head.
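
The horizon negotiation described above (ask peers for the header at h = H - Z, and push the horizon back until they agree) could be sketched as follows. Everything here is hypothetical: the `Peer` callable interface, the `quorum` parameter (unanimous agreement by default) and the fixed back-off `step` are assumptions made for illustration, since the text does not specify how agreement is measured or how fast Z grows.

```python
from collections import Counter
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical peer interface: maps a height to the hash of the header that
# peer reports at that height, or None if it has nothing to report.
Peer = Callable[[int], Optional[str]]


def agreed_header(peers: Sequence[Peer], height: int,
                  quorum: float = 1.0) -> Optional[str]:
    """Return the header hash reported by at least `quorum` of all peers, else None."""
    answers = [a for a in (peer(height) for peer in peers) if a is not None]
    if not answers:
        return None
    header, votes = Counter(answers).most_common(1)[0]
    return header if votes >= quorum * len(peers) else None


def negotiate_horizon(peers: Sequence[Peer], head_height: int, horizon: int,
                      step: int = 1000) -> Tuple[int, int, str]:
    """Increase Z by `step` until peers agree on the header at h = H - Z.

    Returns (effective Z, h, agreed header hash).
    """
    z = horizon
    while True:
        h = max(head_height - z, 0)
        header = agreed_header(peers, h)
        if header is not None:
            return head_height - h, h, header
        if h == 0:
            raise RuntimeError("peers disagree all the way back to genesis")
        z += step


# Two honest peers and one peer that reports fake headers above height 17000.
def honest(height: int) -> str:
    return f"hdr-{height}"


def rogue(height: int) -> str:
    return f"hdr-{height}" if height <= 17000 else f"fake-{height}"


print(negotiate_horizon([honest, honest, rogue], 23000, 5000))
# (6000, 17000, 'hdr-17000')
```

In this toy run, agreement fails at h = 18000, so the horizon grows from Z = 5000 to Z = 6000, where all three peers report the same header.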
While this algorithm still works for very low values of Z (or in the extreme case
where Z=1), low values may be problematic due to the normal forking activity that
can occur on any blockchain. To prevent those problems and to increase the amount
of locally validated work, we recommend values of Z of at least a few days' worth
of blocks, up to a few weeks.
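
To make "a few days' worth of blocks" concrete, the conversion below turns a duration into a horizon Z. It assumes a one-minute target block interval, which this section does not state; treat the constant and the function name as illustrative and adjust them to the chain's actual parameters.

```python
# Assumption: one block per minute on average (not stated in this section).
BLOCK_TIME_SECS = 60


def horizon_for_days(days: float, block_time_secs: int = BLOCK_TIME_SECS) -> int:
    """Number of blocks produced in `days` at the given target block interval."""
    if days < 0 or block_time_secs <= 0:
        raise ValueError("days must be >= 0 and block time must be > 0")
    return int(days * 24 * 60 * 60 // block_time_secs)


print(horizon_for_days(2))   # 2880
print(horizon_for_days(14))  # 20160
```

Under that assumption, the Z=5000 used in the earlier example corresponds to roughly three and a half days of blocks.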
While this sync mode is simple to describe, it may not be obvious how it can still be secure. We describe here some possible attacks, how they're defeated and other possible failure scenarios.
This range of attacks attempts to have a node believe it is properly synchronized with the network when it is actually in a forged state. Multiple strategies can be attempted:
Our node downloaded the full UTXO set at horizon height. If a fork occurs on a block
at an older horizon Z+delta, the UTXO set can't be validated. In this situation the
node has no choice but to put itself back in sync mode with a new horizon of
Z'=Z+delta.
Note that an alternate fork at Z+delta that has less work than our current head can safely be ignored; only a winning fork with total work greater than our head would force a resync. To make this resolution possible, every block header includes the total chain difficulty up to that block.
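
The fork handling described above can be sketched as a small decision function. The type and function names are hypothetical, and two behaviors are assumptions rather than statements from the text: a fork with equal total work is ignored, and a fork point at or above the horizon block is handled as an ordinary reorganization, since the node holds full blocks from the horizon onward.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class ForkAction(Enum):
    IGNORE = "ignore"   # fork does not have more work than our head
    REORG = "reorg"     # fork point is within locally validated blocks
    RESYNC = "resync"   # fork point is older than our horizon


@dataclass
class SyncState:
    head_height: int            # H
    head_total_difficulty: int  # total chain difficulty from our head's header
    horizon: int                # Z


def on_fork(state: SyncState, fork_point_height: int,
            fork_total_difficulty: int) -> Tuple[ForkAction, int]:
    """Decide how to react to a competing fork; returns (action, horizon to use)."""
    # Only a fork with strictly more total work than our head matters.
    if fork_total_difficulty <= state.head_total_difficulty:
        return ForkAction.IGNORE, state.horizon
    horizon_height = state.head_height - state.horizon
    if fork_point_height >= horizon_height:
        return ForkAction.REORG, state.horizon
    # The fork is delta blocks older than our horizon: resync with Z' = Z + delta.
    delta = horizon_height - fork_point_height
    return ForkAction.RESYNC, state.horizon + delta


state = SyncState(head_height=23000, head_total_difficulty=1_000_000, horizon=5000)
print(on_fork(state, 17500, 900_000))    # (<ForkAction.IGNORE: 'ignore'>, 5000)
print(on_fork(state, 17500, 1_100_000))  # (<ForkAction.RESYNC: 'resync'>, 5500)
```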
If a hard fork occurs, the network may become split, forcing new nodes to always push their horizon back to when the hard fork occurred. While this is not a problem for short-term hard forks, it may become an issue for long-term or permanent forks. To prevent this situation, peers should always be checked for hard-fork-related capabilities (a bitmask of features a peer exposes) on connection.
If a node can't reach consensus on the header at h, it gradually moves its horizon back. In the degenerate case, rogue peers could force all new nodes to always become full nodes (moving back until genesis) by systematically preventing consensus and feeding fake headers.
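
A capability check of this kind could look like the sketch below. The flag names and bit values are purely illustrative and do not correspond to any real protocol constants; the only idea taken from the text is that a peer advertises a bitmask of features that is checked on connection.

```python
from enum import IntFlag


class Capabilities(IntFlag):
    # Hypothetical feature bits, for illustration only.
    FULL_HISTORY = 0b001
    UTXO_SET_AT_HORIZON = 0b010
    HARD_FORK_V2 = 0b100


def is_compatible(peer_caps: int, required: Capabilities) -> bool:
    """True if the peer's advertised bitmask contains every required bit."""
    return (int(peer_caps) & int(required)) == int(required)


required = Capabilities.UTXO_SET_AT_HORIZON | Capabilities.HARD_FORK_V2
print(is_compatible(0b110, required))  # True
print(is_compatible(0b011, required))  # False
```

A node that sees `False` here would drop or deprioritize the peer before asking it for headers, so that peers on the other side of a hard fork do not influence horizon agreement.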
While this is a valid issue, several mitigation strategies exist:
Peers must still provide valid block headers at horizon Z. This includes the
proof of work.