src/secp256k1/doc/ellswift.md
In this document we explain how the ellswift module implementation is related to the
construction in the
"SwiftEC: Shallue–van de Woestijne Indifferentiable Function To Elliptic Curves"
paper by Jorge Chávez-Saab, Francisco Rodríguez-Henríquez, and Mehdi Tibouchi.
The ellswift module effectively introduces a new 64-byte public key format, with the property
that (uniformly random) public keys can be encoded as 64-byte arrays which are computationally
indistinguishable from uniform byte arrays. The module provides functions to convert public keys
from and to this format, as well as convenience functions for key generation and ECDH that operate
directly on ellswift-encoded keys.
The encoding consists of the concatenation of two (32-byte big endian) encoded field elements $u$ and $t.$ Together they encode an x-coordinate on the curve $x$, or (see further) a full point $(x, y)$ on the curve.
Decoding consists of decoding the field elements $u$ and $t$ (values above the field size $p$ are taken modulo $p$), and then evaluating $F_u(t)$, which for every $u$ and $t$ results in a valid x-coordinate on the curve. The functions $F_u$ will be defined in Section 2.
Encoding a given $x$ coordinate is conceptually done as follows:
This is the ElligatorSwift algorithm, here given for just x-coordinates. An extension to full $(x, y)$ points will be given in Section 4. The algorithm finds a uniformly random $(u, t)$ among (almost all) those for which $F_u(t) = x.$ Section 3.2 in the paper proves that the number of such encodings for almost all x-coordinates on the curve (all but at most 39) is close to two times the field size (specifically, it lies in the range $2q \pm (22\sqrt{q} + O(1))$, where $q$ is the size of the field).
First some definitions:
secp256k1, $q = 2^{256} - 2^{32} - 977$, which satisfies that requirement.secp256k1, $a=0$ and $b=7.$Note: In the paper:
Note that for $V$, the left hand side of the equation $z^2$ is square, and thus the right hand must also be square. As multiplying non-squares results in a square in $\mathbb{F}$, out of the three right-hand side factors an even number must be non-squares. This implies that exactly 1 or exactly 3 out of $\{g(x_1), g(x_2), g(x_3)\}$ must be square, and thus that for any $(x_1,x_2,x_3,z) \in V$, at least one of $\{x_1, x_2, x_3\}$ must be a valid x-coordinate on $E.$ There is one exception to this, namely when $z=0$, but even then one of the three values is a valid x-coordinate.
Define the decoding function $F_u(t)$ as:
$P_u(t) = (X(u, t), Y(u, t))$, where:
$$ \begin{array}{lcl} X(u, t) & = & \left\{\begin{array}{ll} \dfrac{g(u) - t^2}{2t} & a = 0 \ \dfrac{g(u) + h(u)(Y_0(u) - X_0(u)t)^2}{X_0(u)(1 + h(u)t^2)} & a \neq 0 \end{array}\right. \ Y(u, t) & = & \left\{\begin{array}{ll} \dfrac{X(u, t) + t}{u \sqrt{-3}} = \dfrac{g(u) + t^2}{2tu\sqrt{-3}} & a = 0 \ Y_0(u) + t(X(u, t) - X_0(u)) & a \neq 0 \end{array}\right. \end{array} $$
$P_u(t)$ is defined:
The functions $X_0(u)$ and $Y_0(u)$ are defined in Appendix A of the paper, and depend on various properties of $E.$
The function $\psi_u$ is the same for all curves: $\psi_u(X, Y) = (x_1, x_2, x_3, z)$, where:
$$ \begin{array}{lcl} x_1 & = & \dfrac{X}{2Y} - \dfrac{u}{2} && \ x_2 & = & -\dfrac{X}{2Y} - \dfrac{u}{2} && \ x_3 & = & u + 4Y^2 && \ z & = & \dfrac{g(x_3)}{2Y}(u^2 + ux_1 + x_1^2 + a) = \dfrac{-g(u)g(x_3)}{8Y^3} \end{array} $$
secp256k1Put together and specialized for $a=0$ curves, decoding $(u, t)$ to an x-coordinate is:
Define $F_u(t)$ as:
To make sure that every input decodes to a valid x-coordinate, we remap the inputs in case $P_u$ is not defined (when $u=0$, $t=0$, or $g(u) = -t^2$):
Define $F_u(t)$ as:
The choices here are not strictly necessary. Just returning a fixed constant in any of the undefined cases would suffice, but the approach here is simple enough and gives fairly uniform output even in these cases.
Note: in the paper these conditions result in $\infty$ as output, due to the use of projective coordinates there. We wish to avoid the need for callers to deal with this special case.
This is implemented in secp256k1_ellswift_xswiftec_frac_var (which decodes to an x-coordinate represented as a fraction), and
in secp256k1_ellswift_xswiftec_var (which outputs the actual x-coordinate).
To implement $F_u^{-1}(x)$, the function to find the set of inverses $t$ for which $F_u(t) = x$, we have to reverse the process:
The function $P_u^{-1}$, which finds $t$ given $(X, Y) \in S_u$, is significantly simpler than $P_u:$
$$ P_u^{-1}(X, Y) = \left\{\begin{array}{ll} Yu\sqrt{-3} - X & a = 0 \ \dfrac{Y-Y_0(u)}{X-X_0(u)} & a \neq 0 \land X \neq X_0(u) \ \dfrac{-X_0(u)}{h(u)Y_0(u)} & a \neq 0 \land X = X_0(u) \land Y = Y_0(u) \end{array}\right. $$
The third step above, verifying that $F_u(t) = x$, is necessary because for the $(X, Y)$ values found through the $x_1$ and $x_2$ expressions, it is possible that decoding through $\psi_u(X, Y)$ yields a valid $x_3$ on the curve, which would take precedence over the $x_1$ or $x_2$ decoding. These $(X, Y)$ solutions must be rejected.
Since we know that exactly one or exactly three out of $\{x_1, x_2, x_3\}$ are valid x-coordinates for any $t$, the case where either $x_1$ or $x_2$ is valid and in addition also $x_3$ is valid must mean that all three are valid. This means that instead of checking whether $x_3$ is on the curve, it is also possible to check whether the other one out of $x_1$ and $x_2$ is on the curve. This is significantly simpler, as it turns out.
Observe that $\psi_u$ guarantees that $x_1 + x_2 = -u.$ So given either $x = x_1$ or $x = x_2$, the other one of the two can be computed as $-u - x.$ Thus, when encoding $x$ through the $x_1$ or $x_2$ expressions, one can simply check whether $g(-u-x)$ is a square, and if so, not include the corresponding $t$ values in the returned set. As this does not need $X$, $Y$, or $t$, this condition can be determined before those values are computed.
It is not possible that an encoding found through the $x_1$ expression decodes to a different valid x-coordinate using $x_2$ (which would take precedence), for the same reason: if both $x_1$ and $x_2$ decodings were valid, $x_3$ would be valid as well, and thus take precedence over both. Because of this, the $g(-u-x)$ being square test for $x_1$ and $x_2$ is the only test necessary to guarantee the found $t$ values round-trip back to the input $x$ correctly. This is the reason for choosing the $(x_3, x_2, x_1)$ precedence order in the decoder; any order which does not place $x_3$ first requires more complicated round-trip checks in the encoder.
Before working out the formulas for all this, we switch to different variables for $S_u.$ Let $v = (X/Y - u)/2$, and $w = 2Y.$ Or in the other direction, $X = w(u/2 + v)$ and $Y = w/2:$
$$ \begin{array}{lcl} x_1 & = & v \ x_2 & = & -u - v \ x_3 & = & u + w^2 \ z & = & \dfrac{g(x_3)}{w}(u^2 + uv + v^2 + a) = \dfrac{-g(u)g(x_3)}{w^3} \end{array} $$
We can now write the expressions for finding $(v, w)$ given $x$ explicitly, by solving each of the $\{x_1, x_2, x_3\}$ expressions for $v$ or $w$, and using the $S_u'$ equation to find the other variable:
The ElligatorSwift algorithm as stated in Section 1 requires the computation of $L = F_u^{-1}(x)$ (the set of all $t$ such that $(u, t)$ decode to $x$) in full. This is unnecessary.
Observe that the procedure of restarting with probability $(1 - \frac{\#L}{8})$ and otherwise returning a uniformly random element from $L$ is actually equivalent to always padding $L$ with $\bot$ values up to length 8, picking a uniformly random element from that, restarting whenever $\bot$ is picked:
Define ElligatorSwift(x) as:
Now notice that the order of elements in $T$ does not matter, as all we do is pick a uniformly random element in it, so we do not need to have all $\bot$ values at the end. As we have 8 distinct formulas for finding $(v, w)$ (taking the variants due to $\pm$ into account), we can associate every index in $T$ with exactly one of those formulas, making sure that:
The last condition above only occurs with negligible probability for cryptographically-sized curves, but is interesting to take into account as it allows exhaustive testing in small groups. See Section 3.4 for an analysis of all the negligible cases.
If we define $T = (G_{0,u}(x), G_{1,u}(x), \ldots, G_{7,u}(x))$, with each $G_{i,u}$ matching one of the formulas, the loop can be simplified to only compute one of the inverses instead of all of them:
Define ElligatorSwift(x) as:
This is implemented in secp256k1_ellswift_xelligatorswift_var.
To implement $G_{c,u}$, we map $c=0$ to the $x_1$ formula, $c=1$ to the $x_2$ formula, and $c=2$ and $c=3$ to the $x_3$ formula. Those are then repeated as $c=4$ through $c=7$ for the other sign of $w$ (noting that in each formula, $w$ is a square root of some expression). Ignoring the negligible cases, we get:
Define $G_{c,u}(x)$ as:
Whenever a square root of a non-square is taken, $\bot$ is returned; for both square roots this happens with roughly 50% on random inputs. Similarly, when a division by 0 would occur, $\bot$ is returned as well; this will only happen with negligible probability. A division by 0 in the first branch in fact cannot occur at all, because $u^2 + uv + v^2 + a = 0$ implies $g(-u-x) = g(x)$ which would mean the $g(-u-x)$ is square condition has triggered and $\bot$ would have been returned already.
Note: In the paper, the $case$ variable corresponds roughly to the $c$ above, but only takes on 4 possible values (1 to 4). The conditional negation of $w$ at the end is done randomly, which is equivalent, but makes testing harder. We choose to have the $G_{c,u}$ be deterministic, and capture all choices in $c.$
Now observe that the $c \in \{1, 5\}$ and $c \in \{3, 7\}$ conditions effectively perform the same $v \rightarrow -u-v$ transformation. Furthermore, that transformation has no effect on $s$ in the first branch as $u^2 + ux + x^2 + a = u^2 + u(-u-x) + (-u-x)^2 + a.$ Thus we can extract it out and move it down:
Define $G_{c,u}(x)$ as:
This shows there will always be exactly 0, 4, or 8 $t$ values for a given $(u, x)$ input. There can be 0, 1, or 2 $(v, w)$ pairs before invoking $P_u^{'-1}$, and each results in 4 distinct $t$ values.
As mentioned before there are a few cases to deal with which only happen in a negligibly small subset of inputs. For cryptographically sized fields, if only random inputs are going to be considered, it is unnecessary to deal with these. Still, for completeness we analyse them here. They generally fall into two categories: cases in which the encoder would produce $t$ values that do not decode back to $x$ (or at least cannot guarantee that they do), and cases in which the encoder might produce the same $t$ value for multiple $c$ inputs (thereby biasing that encoding):
Define a version of $G_{c,u}(x)$ which deals with all these cases:
Given any $u$, using this algorithm over all $x$ and $c$ values, every $t$ value will be reached exactly once, for an $x$ for which $F_u(t) = x$ holds, except for these cases that will not be reached:
These cases form a negligible subset of all $(u, t)$ for cryptographically sized curves.
secp256k1Specialized for odd-ordered $a=0$ curves:
Define $G_{c,u}(x)$ as:
This is implemented in secp256k1_ellswift_xswiftec_inv_var.
And the x-only ElligatorSwift encoding algorithm is still:
Define ElligatorSwift(x) as:
Note that this logic does not take the remapped $u=0$, $t=0$, and $g(u) = -t^2$ cases into account; it just avoids them. While it is not impossible to make the encoder target them, this would increase the maximum number of $t$ values for a given $(u, x)$ combination beyond 8, and thereby slow down the ElligatorSwift loop proportionally, for a negligible gain in uniformity.
So far we have only addressed encoding and decoding x-coordinates, but in some cases an encoding for full points with $(x, y)$ coordinates is desirable. It is possible to encode this information in $t$ as well.
Note that for any $(X, Y) \in S_u$, $(\pm X, \pm Y)$ are all on $S_u.$ Moreover, all of these are mapped to the same x-coordinate. Negating $X$ or negating $Y$ just results in $x_1$ and $x_2$ being swapped, and does not affect $x_3.$ This will not change the outcome x-coordinate as the order of $x_1$ and $x_2$ only matters if both were to be valid, and in that case $x_3$ would be used instead.
Still, these four $(X, Y)$ combinations all correspond to distinct $t$ values, so we can encode the sign of the y-coordinate in the sign of $X$ or the sign of $Y.$ They correspond to the four distinct $P_u^{'-1}$ calls in the definition of $G_{u,c}.$
Note: In the paper, the sign of the y coordinate is encoded in a separately-coded bit.
To encode the sign of $y$ in the sign of $Y:$
Define Decode(u, t) for full $(x, y)$ as:
And encoding would be done using a $G_{c,u}(x, y)$ function defined as:
Define $G_{c,u}(x, y)$ as:
Note that $c$ now only ranges $[0,4)$, as the sign of $w'$ is decided based on that of $y$, rather than on $c.$ This change makes some valid encodings unreachable: when $y = 0$ and $sign(Y) \neq sign(0)$.
In the above logic, $sign$ can be implemented in several ways, such as parity of the integer representation of the input field element (for prime-sized fields) or the quadratic residuosity (for fields where $-1$ is not square). The choice does not matter, as long as it only takes on two possible values, and for $x \neq 0$ it holds that $sign(x) \neq sign(-x)$.
secp256k1For $a=0$ curves, there is another option. Note that for those, the $P_u(t)$ function translates negations of $t$ to negations of (both) $X$ and $Y.$ Thus, we can use $sign(t)$ to encode the y-coordinate directly. Combined with the earlier remapping to guarantee all inputs land on the curve, we get as decoder:
Define Decode(u, t) as:
This is implemented in secp256k1_ellswift_swiftec_var. The used $sign(x)$ function is the parity of $x$ when represented as in integer in $[0,q).$
The corresponding encoder would invoke the x-only one, but negating the output $t$ if $sign(t) \neq sign(y).$
This is implemented in secp256k1_ellswift_elligatorswift_var.
Note that this is only intended for encoding points where both the x-coordinate and y-coordinate are unpredictable. When encoding x-only points where the y-coordinate is implicitly even (or implicitly square, or implicitly in $[0,q/2]$), the encoder in Section 3.5 must be used, or a bias is reintroduced that undoes all the benefit of using ElligatorSwift in the first place.