specifications/crypto/README.md
The consensus specification describes the cryptographic operations used for hashing, signing and verifying throughout the Diem Payment Network, as well as on transactions submitted to this network.
We do not specify any of one of these algorithms from first principles, but leverage and refer to existing specifications, while laying out exhastively where our implementation choices differ from the existant standards. Our implementation choices as described are expected to undergo future improvement through Diem Improvement Processes (DIPs).
Diem uses EdDSA with Curve25519 as its workhorse algorithm, both internally for the consensus and for signing transactions.
The Ed25519 algorithm is standardized by the Internet Engineering Task Force (IETF) in RFC-8032 and is about to be, in FIPS 186-5. The corresponding OID is 1.3.101.112.
Diem's implementation of EdDSA makes choices and slight variations from these standards in ways that we make precise below:
m taken as input to the signature algorithm
is replaced by a pre-hashed image PH(m), with PH a customizable hash
function. We do not use this legacy interface.R∥S, with R and S 32-byte
bytestrings, and ∥ the concatenation, we reject signatures for which the
scalar S is not within the interval [0 ... L-1], with L the order of the
prime order group generated by the curve's base point.<sup>1</sup> The
original EdDSA paper<sup>2</sup> and older Ed25519 libraries do not
always perform this check, which is necessary to avoid malleability.[s]B = R + [k]A, rather
than [8][s]B = [8]R + [8][k]A, where s, B, R, k, A are defined as
in RFC 8032 §5.1.7. We discuss this choice in [the next section](#On
cofactored vs. cofactor-less verification), and may revisit it through a DIP
in the future. Ed25519 implementations overwhelmingly employ cofactor-less
verification.Diem a uses cofactor-less verification equation. We note that cofactor-less verification is stricter than cofactored verification, and that a signer that follows either RFC 8032 or FIPS 186-5 will never produce a signature that passes cofactored verification yet fails cofactorless verification.
Nonetheless, there are good reasons to adopt cofactored verification:
Unfortunately, off-the-shelf EdDSA implementations overwhelmingly use cofactor-less verification. This presents a security concern specific to the blockchain context, as a third-party validator implementation may not be able to comply with a Diem specification that would mandate cofactored verification (if unable or unwilling to develop their own Ed25519 library).
If this otherwise honest third-party validator was then to run cofactor-less verification, it could disagree with the results of execution in a block that would contain a malicious non-conformant signature during speculative execution. If such a disagreement was to occur in more than a third of validators —either because they are byzantine or because they did not manage to implement cofactored verification—, it would be possible to stall consensus by spamming the transaction mempool with such non-conformant signatures<sup>5</sup>.
As a consequence, we require cofactorless verification, but expect a further DIP to revisit this decision as soon as the ecosystem of Ed255519 libraries makes the other choice practical.
Diem uses SHA3-256 as its go-to algorithm.
This is the sole algorithm used in internal hashes, meaning hashing where the input and output are for consumption by Diem code and data structures.
Diem also uses other hashing functions when conforming to an external specification, or calling into external libraries. In particular, we would like to highlight:
Noise_IK_25519_AESGCM_SHA256 key
exchange scheme, for the purpose of network communications. This entails the
use of a SHA2-256 hash function in the internal key derivation.These external uses notwithstanding, SHA3-256 is the only hashing function exposed to the Diem developer.
In Diem, we are signing large values (transactions, block proposals), which are therefore hashed as a standard prequel of signing. If the hashes for two distinct values somehow matched, this could have catastrophic consequences —a malicious actor could lead you to sign a weird transaction with one expected meaning, which in fact would produce a signature for something actually different.
How would this happen? We use cryptographically secure hash functions that are collision-resistant, so it's improbable that two distinct hashing inputs would produce the same hash. However, when we hash Diem values, we hash their serialized bytes. A serialization library is in general not collision resistant, so we have to add extra precautions for this purpose, something called domain separation.
We describe our serialization in the next section. For the moment, suffice to say that for every Diem data structure, we also define and specify a domain separator.
More specifically, the domain separator starts with a prefix defined as an ASCII-encoded byte-string:
"DIEM::"
This is concatenated with the
trace_name of the
data structure (also ASCII byte-string encoded) to form the domain separator
for this data structure type. This trace name is the serialization name as used
and documented in our serialization library.
Note that while the serialization name coincides with the name of the structure in Rust for many such types, it is overridable, and we expect such overrides to occur over time, at least for the purpose of versioning. Our serialization library includes tooling to extract the latest serialization specification, including the trace name, from code, and developers are advised to look for changes therein across Diem releases.
We will look at how the domain separator defined in this section is used [below](#Signing and Verifying).
Serialization of a payload is done using BCS. This serialization library guarantees deterministic serialization for any Diem data structure.
This library leverages the ubiquitous serde serialization and deserialization framework for Rust, and serde-reflection, a format description and polyglot code generator for serde.
<!-- TODO(fga): refer to serialization spec more fully. -->Signing a Diem data structure (including e.g. RawTransaction or
BlockData), or verifying a signature thereof, involves the following steps:
This byte representation is composed of two parts: the hash of the domain separator, concatenated with the serialized bytes of the data structure. The domain separator and hash are as defined above.
That is, if v is the Diem value we aim to sign, the bytes sent to the
signing or verification algorithm are:
SHA3-256( domain_separator(type_of(v)) ) ∥ bcs_bytes(v)
where ∥ is byte sequence concatenation. The purpose of using the hashed domain separator, rather than the separator itself is to avoid any suffix attacks on serialization bytes that would rely on the ambiguity on the byte offset that marks the end of the domain separator and the beginning of the structure. With our construction, the (hashed) domain separation information comprises the first 32 bytes of the binary representation exactly.
In the Diem code base, any structure that implements the derivable
CryptoHash trait and Serialize can be passed in message position to the
sign function implemented on private keys, and to the verify function
implemented on public keys and signatures. The construction of the binary
representation is performed implicitly.
When defining a new data structure in the Diem code base, the developer should
implement Serialize, and derive CryptoHasher and BCSCryptoHash by
annotation. The CryptoHash trait will be derived automatically.
Diem also supports a transaction signature format which implements multi-signatures, with signer accountability. This type of signature supports K out-of N semantics, for K ≤ N ≤ 32.
The implementation, exposed as a Move standard library module (Authenticator), is currently simply a multiple-signature container —rather than a distinct signing scheme (e.g. MuSig). Nonetheless, we plan to improve it in further DIPs, while expanding the scope of threshold signing.
We refer the reader to the Move standard library specification for the details of smart-contract semantics.
<!-- TODO(fga): refer to the Move stdlib doc more fully -->The input to the multi-public key smart contract consists of a MultiEd25519PublicKey,
which contains:
The multi-signature itself, named MultiEd25519Signature, consists of two components:
MultiEd25519PublicKey participated in signing.The verification of such a signature proceeds as follows:
MultiEd25519PublicKey and the signature
vector of MultiEd25519Signature have the same length, and reject the
signature otherwise,This scheme aims at enforcing the following principles:
<a name="f1">1</a>: The Provable Security of Ed25519: Theory and Practice, by Brendel et al.
<a name="f2">2</a>: High-speed, high-security signatures, Bernstein et al.
<a name="f3">3</a>: Accelerated Verification of ECDSA Signatures, Antipa et al.
<a name="f4">4</a>: Optimized Lattice Basis Reduction In Dimension 2, and Fast Schnorr and EdDSA Signature Verification, Pornin
<a name="f5">5</a>: More details in the consensus specification.