
DynaFu ICP Math

Differentiating and Linearising Rt matrices

In dynafu, the warp function looks like the following for each node $i$:

$ \begin{equation*} f_i(x_i, V_g) = T_{x_i} * V_g = R(x_i) * V_g + t(x_i) \end{equation*} $

where ${x_i}$ are the transformation parameters for node $i$ and the rotation is performed around the corresponding node (and not a global reference)

For linearising a transform around the parameters $\mathbf{x}$, we need to find the derivative

$ \begin{equation*} \displaystyle \frac{\partial f_i(\mathbf{x} \circ \epsilon, V_g)}{\partial \epsilon} |_{\epsilon = 0} \end{equation*} $

We calculate this as follows:

$ \begin{equation*} f_i(\mathbf{x} \circ \epsilon, V_g) = f_i(\epsilon, V) = T_{inc} * V \end{equation*} $ where $V = f_i(\mathbf{x}, V_g)$ and $T_{inc}$ is the infinitesimal transform with parameters $\epsilon$

Using Lie theory, each Rt matrix can be represented as $A = e^\xi$, where $\xi$ are the transform parameters. Therefore,

$ \begin{equation*} f_i(\mathbf{x} \circ \xi, V_g) = e^\xi V \end{equation*} $

Therefore,

$ \begin{equation*} \displaystyle \frac{\partial f_i(\mathbf{x} \circ \xi, V_g)}{\partial \xi} \Big|_{\xi = 0} = \frac{\partial (e^\xi V)}{\partial \xi} \Big|_{\xi=0} = \begin{pmatrix} -[V]_{\times} & I_{3\times3} \end{pmatrix}_{3 \times 6} \end{equation*} $

Let us denote $\begin{pmatrix} -[V]_{\times} & I_{3\times3} \end{pmatrix}$ as $G(V)$ from now on.

This result is mentioned in this manifold optimisation tutorial (equation 10.23).

With this result, we can now linearise our transformation around $\mathbf{x}$:

$ \begin{equation*} f_i(\mathbf{x} \circ \epsilon, V_g) \approx G(V) * \epsilon + V \end{equation*} $

I suppose the following excerpt from the DynaFu paper (the section about efficient optimisation) is equivalent, mentioning the same way of calculating derivatives:

We formulate compositional updates $\mathbf{\hat x}$ through the exponential map with a per-node twist $\xi_i \in \mathfrak{se}(3)$, requiring 6 variables per node transform, and perform linearisation around $\xi_i = 0$.

As a side note, the derivative $\large \frac{\partial e^\xi}{\partial \xi}\big|_{\xi=0}$ is called the tangent (essentially the derivative) to the SE(3) manifold (the space in which the Rt matrix $T_{inc}$ lives) at the identity ($\xi = 0$).
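As a quick sanity check of $G(V)$, here is a small NumPy sketch (not from the OpenCV module; the helpers `skew` and `exp_se3` are my own names) that compares the analytic derivative against central finite differences of $e^\xi V$ at $\xi = 0$:

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ u == np.cross(v, u)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def exp_se3(xi):
    """Exponential map se(3) -> SE(3) for a twist xi = (omega, t) via Rodrigues.
    Parameter order (rotation first, translation second) matches G(V) above."""
    omega, t = xi[:3], xi[3:]
    theta = np.linalg.norm(omega)
    W = skew(omega)
    if theta < 1e-12:
        R, J = np.eye(3), np.eye(3)
    else:
        R = np.eye(3) + np.sin(theta) / theta * W \
            + (1 - np.cos(theta)) / theta**2 * W @ W
        J = np.eye(3) + (1 - np.cos(theta)) / theta**2 * W \
            + (theta - np.sin(theta)) / theta**3 * W @ W
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, J @ t
    return T

V = np.array([0.3, -1.2, 2.0])
G = np.hstack([-skew(V), np.eye(3)])        # analytic G(V), 3x6

eps, J_num = 1e-6, np.zeros((3, 6))
for k in range(6):                          # central differences at xi = 0
    d = np.zeros(6); d[k] = eps
    Tp, Tm = exp_se3(d), exp_se3(-d)
    J_num[:, k] = ((Tp[:3, :3] @ V + Tp[:3, 3])
                   - (Tm[:3, :3] @ V + Tm[:3, 3])) / (2 * eps)

assert np.allclose(J_num, G, atol=1e-6)     # numerical and analytic agree
```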

Estimating Warp Field Parameters

The total energy to be minimised is

$ E = E_{data} + \lambda E_{reg} $

Data term rearrangement

$ \displaystyle E_{data} = \sum_{u \in \Omega} \rho_{Tukey}( (T_u N_g)^T ((T_u V_g) - V_c)) $

The quadcopter paper tells us that the following expression has the same minimiser, so we can use this instead:

$ \displaystyle E_{data} = \sum_{u \in \Omega} w_{Tukey}(r_u) \cdot (r_u)^2 $

where $w_{Tukey}(x) = \rho'(x)/x$, which is treated as a constant within each iteration, and $r_u = N_g^T (V_g - T_u^{-1}\cdot V_c)$.
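For illustration, a minimal sketch of this weight for Tukey's biweight estimator; the cutoff $c = 4.685$ is the conventional choice, not necessarily what the dynafu code uses:

```python
import numpy as np

def tukey_weight(r, c=4.685):
    """IRLS weight w(r) = rho'(r)/r for Tukey's biweight:
    (1 - (r/c)^2)^2 for |r| <= c, and 0 beyond the cutoff."""
    r = np.asarray(r, dtype=float)
    return np.where(np.abs(r) <= c, (1.0 - (r / c) ** 2) ** 2, 0.0)

# inliers keep weight close to 1, gross outliers are rejected entirely
print(tukey_weight([0.0, 1.0, 4.0, 10.0]))
```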

Regularisation term rearrangement

$ \begin{equation} \displaystyle E_{reg} = \sum_{i = 0}^n \sum_{j \in \varepsilon(i)} \alpha_{ij} \rho_{Huber} (T_{i}V_g^j - T_{j}V_g^j) \end{equation} $

To be useful, this needs to be put into weighted-least-squares form, so we apply the same rearrangement as for the data term and sum over edges instead:

$ \begin{equation} \displaystyle E_{reg} = \sum_{e \in E} w_{Huber}(r_e) (r_e)^2 \end{equation} $

Here $E$ is the set of directed edges in the regularisation graph between all nodes from the current level and the next coarser level, and $w_{Huber}(x) = \alpha_e \, \rho'(x)/x$, where $\alpha_e$ is the weight $\alpha_{ij}$ of edge $e$.
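The corresponding Huber weight, with the edge weight $\alpha$ folded in as described above (the threshold $k = 1.345$ is the conventional value and an assumption here):

```python
import numpy as np

def huber_weight(r, alpha=1.0, k=1.345):
    """alpha * rho'(r)/r for the Huber estimator:
    quadratic region gives weight 1, linear region gives k/|r|."""
    r = np.asarray(r, dtype=float)
    return alpha * np.where(np.abs(r) <= k, 1.0,
                            k / np.maximum(np.abs(r), 1e-12))
```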

Obtaining the normal equation

Therefore, to solve an iteration we set the derivative to zero:

$ \begin{equation*} \large \frac{\partial E_{data}}{\partial \xi} + \lambda \frac{\partial E_{reg}}{\partial \xi} = 0 \end{equation*} $

which gives us

$ \begin{equation*} J_d^T W_d(r_d + J_d\mathbf{\hat x}) + \lambda J_r^T W_r (r_r + J_r\mathbf{\hat x}) = 0 \end{equation*} $

$ (J_d^T W_d J_d + \lambda J_r^T W_r J_r)\mathbf{\hat x} = -(J_d^T W_d r_d + \lambda J_r^T W_r r_r) $

Here $W_d$ and $W_r$ are the weight matrices as described in the quadcopter paper; for $W_r$, the edge weight $\alpha$ is also incorporated into the matrix.
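As a sketch of one solver step (assuming dense Jacobians and diagonal weight matrices have already been assembled; this is illustrative, not the OpenCV implementation):

```python
import numpy as np

def gauss_newton_step(J_d, r_d, w_d, J_r, r_r, w_r, lam):
    """Solve (Jd^T Wd Jd + lam Jr^T Wr Jr) x = -(Jd^T Wd rd + lam Jr^T Wr rr)
    for the stacked parameter update x (6 values per node)."""
    W_d = np.diag(w_d)                      # robust per-pixel weights
    W_r = np.diag(w_r)                      # robust per-edge weights, alpha included
    A = J_d.T @ W_d @ J_d + lam * (J_r.T @ W_r @ J_r)
    b = -(J_d.T @ W_d @ r_d + lam * (J_r.T @ W_r @ r_r))
    return np.linalg.solve(A, b)
```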

Calculating Data Term Jacobian ($J_d$)

We estimate the inverses of the Rt matrices instead of the Rt matrices themselves, so first we have to write $T_u^{-1} V_g$ in terms of the inverse matrices. However, I realised that

$$ \begin{equation*} T_u^{-1} V_g \ne \frac{\sum_{k \in N(V_u)} w_k \, T_k^{-1}V_g}{\sum_{k \in N(V_u)} w_k} \end{equation*} $$
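A quick numerical illustration of this inequality (arbitrary test transforms; plain linear blending stands in for the dual quaternion blending dynafu actually uses, but the conclusion is the same):

```python
import numpy as np

def rt(theta_z, t):
    """An Rt matrix: rotation about z by theta_z followed by translation t."""
    c, s = np.cos(theta_z), np.sin(theta_z)
    T = np.eye(4)
    T[:3, :3] = [[c, -s, 0], [s, c, 0], [0, 0, 1]]
    T[:3, 3] = t
    return T

T1, T2 = rt(0.5, [1, 0, 0]), rt(-0.3, [0, 2, 0])
w1, w2 = 0.7, 0.3
Vg = np.array([1.0, 2.0, 3.0, 1.0])                   # homogeneous point

T_u = (w1 * T1 + w2 * T2) / (w1 + w2)                 # blended transform
lhs = np.linalg.inv(T_u) @ Vg                         # inverse of the blend
rhs = (w1 * np.linalg.inv(T1) @ Vg
       + w2 * np.linalg.inv(T2) @ Vg) / (w1 + w2)     # blend of the inverses
print(np.allclose(lhs, rhs))                          # False: the sides differ
```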

Unfortunately, I could not find a representation of $T_u^{-1} V_g$ in terms of $T_k^{-1}V_g$ and got stuck here. Below is an approach that avoids estimating the inverse Rt matrices; I think we can use that instead, as the math is now fixed.

Alternative calculation for $J_d$

The residual $r_u$ in the data term is

$$ r_u = (T_u N_g)^T (T_u V_g - V_c) $$

Let $a, b$ be column vectors such that $a = T_u N_g$ and $b = (T_u V_g - V_c)$. Now we can state the residual as

$$ r_u = a^Tb $$

Each entry in $J_d$ for node parameter $x_j$ associated with node $j$ is:

$$ (J_d)_{uj} = \frac{\partial r_u}{\partial x_j} = \frac{\partial (a^Tb)}{\partial x_j} $$

Note that numerator layout is assumed in all the derivatives.

Applying chain rule for multiple variables, we get

$$ \begin{equation}\begin{aligned} \frac{\partial (a^Tb)}{\partial x_j} & = \frac{\partial (a^Tb)}{\partial a} \cdot \frac{\partial a}{\partial x_j} + \frac{\partial (a^Tb)}{\partial b} \cdot \frac{\partial b}{\partial x_j} \\ & = \frac{\partial (a^Tb)}{\partial a} \cdot \frac{\partial a}{\partial x_j} + \frac{\partial (b^Ta)}{\partial b} \cdot \frac{\partial b}{\partial x_j} && \text{Since $a^Tb = b^Ta$} \\ & = b^T \cdot \frac{\partial a}{\partial x_j} + a^T \cdot \frac{\partial b}{\partial x_j} && \text{Since $\frac{\partial x^TA}{\partial x} = A^T$} \end{aligned}\end{equation}\tag{1}\label{1} $$

The identity $\frac{\partial x^TA}{\partial x} = A^T$ is mentioned on this Wikipedia page. Now we calculate $\frac{\partial a}{\partial x_j}$ and $\frac{\partial b}{\partial x_j}$ as follows:

$$ \begin{equation}\begin{aligned} \frac{\partial a}{\partial x_j} & = \frac{\partial (T_u N_g)}{\partial x_j} \\ & = \frac{\sum_{k \in N(V_u)} w_k \frac{\partial (T_k N_g)}{\partial x_j}}{\sum_{k \in N(V_u)} w_k} \\ & = \begin{cases} \frac{w_j \frac{\partial (T_j N_g)}{\partial x_j}}{\sum_{k \in N(V_u)} w_k} & \text{if $j \in N(V_u)$} \\ 0 & \text{otherwise} \end{cases} \\ & = \begin{cases} \frac{w_j \begin{pmatrix}-[T_j N_g]_{\times} & I_{3\times3}\end{pmatrix}}{\sum_{k \in N(V_u)} w_k} & \text{if $j \in N(V_u)$} \\ 0 & \text{otherwise} \end{cases} \end{aligned}\end{equation}\tag{2}\label{2} $$

$$ \begin{equation}\begin{aligned} \frac{\partial b}{\partial x_j} & = \frac{\partial (T_uV_g - V_c)}{\partial x_j} \\ & = \frac{\partial (T_uV_g)}{\partial x_j} \\ & = \begin{cases} \frac{w_j \begin{pmatrix}-[T_j V_g]_{\times} & I_{3\times3}\end{pmatrix}}{\sum_{k \in N(V_u)} w_k} & \text{if $j \in N(V_u)$} \\ 0 & \text{otherwise} \end{cases} && \text{Calculated similarly to $(\ref{2})$} \end{aligned}\end{equation}\tag{3}\label{3} $$

Substituting equations $(\ref{2})$, $(\ref{3})$ as well as the values of $a^T$ and $b^T$ in $(\ref{1})$, we obtain the required result:

$$ (J_d)_{uj} = \begin{cases} \frac{w_j}{\sum_{k \in N(V_u)} w_k} \left( (T_u V_g - V_c)^T \begin{pmatrix}-[T_j N_g]_{\times} & I_{3\times3}\end{pmatrix} + (T_u N_g)^T \begin{pmatrix}-[T_j V_g]_{\times} & I_{3\times3}\end{pmatrix} \right) & \text{if } j \in N(V_u) \\ 0 & \text{otherwise} \end{cases} $$

We can simplify this expression further:

$$ \begin{equation*}\begin{aligned} & (T_u V_g - V_c)^T \begin{pmatrix}-[T_j N_g]_{\times} & I_{3\times3}\end{pmatrix} + (T_u N_g)^T \begin{pmatrix}-[T_j V_g]_{\times} & I_{3\times3}\end{pmatrix} \\ = {} & \left[ (T_u V_g - V_c)^T \left( -[T_j N_g]_{\times} \right) \mid (T_u V_g - V_c)^T \right] + \left[ (T_u N_g)^T \left( -[T_j V_g]_{\times} \right) \mid (T_u N_g)^T \right] \\ = {} & \left[ (T_u V_g - V_c)^T \left( -[T_j N_g]_{\times} \right) + (T_u N_g)^T \left( -[T_j V_g]_{\times} \right) \mid (T_u V_g + T_u N_g - V_c)^T \right] \\ = {} & \left[ (T_u V_g - V_c)^T ([T_j N_g]_{\times})^T + (T_u N_g)^T ([T_j V_g]_{\times})^T \mid (T_u V_g + T_u N_g - V_c)^T \right] \\ = {} & \begin{bmatrix} [T_j N_g]_{\times} (T_u V_g - V_c) + [T_j V_g]_{\times} (T_u N_g) \\ T_u V_g + T_u N_g - V_c \end{bmatrix}^T \\ = {} & \begin{bmatrix} (T_j N_g) \times (T_u V_g - V_c) + (T_j V_g) \times (T_u N_g) \\ T_u V_g + T_u N_g - V_c \end{bmatrix}^T \end{aligned}\end{equation*} $$

So, we get the final expression as:

$$ \begin{equation*} (J_d)_{uj} = \begin{cases} \frac{w_j}{\sum_{k \in N(V_u)} w_k} \begin{bmatrix} (T_j N_g) \times (T_u V_g - V_c) + (T_j V_g) \times (T_u N_g) \\ T_u V_g + T_u N_g - V_c \end{bmatrix}^T & \text{if } j \in N(V_u) \\ 0 & \text{otherwise} \end{cases} \end{equation*} $$
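As a sketch, this final expression translates almost directly into code (variable names are mine; the warped points and normals are assumed precomputed):

```python
import numpy as np

def jd_entry(TuVg, TuNg, Vc, TjVg, TjNg, w_j, w_sum):
    """The 1x6 row block (J_d)_{uj} for pixel u and neighbouring node j;
    the caller is responsible for returning zero when j is not in N(V_u)."""
    rot = np.cross(TjNg, TuVg - Vc) + np.cross(TjVg, TuNg)  # 3 rotation params
    trans = TuVg + TuNg - Vc                                # 3 translation params
    return (w_j / w_sum) * np.concatenate([rot, trans])
```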

Now that we have an expression for the Jacobian, we can form the normal equation corresponding to the data term of the ICP.

With $|\Omega|$ pixels and $N$ nodes, the normal equation for the data term is

$$ \begin{equation*} \underbrace{J_d^T}_{6N \times |\Omega|} \underbrace{W_d}_{|\Omega| \times |\Omega|} \underbrace{J_d}_{|\Omega| \times 6N} \underbrace{\mathbf{\hat x}}_{6N \times 1} = -J_d^T W_d \underbrace{r_d}_{|\Omega| \times 1} \quad \Longleftrightarrow \quad \underbrace{\mathbf{A}_d}_{6N \times 6N} \mathbf{\hat x} = \underbrace{\mathbf{b}_d}_{6N \times 1} \end{equation*} $$

For each $6 \times 6$ block $(i, j)$ of $\mathbf{A}_d$:

$$ \begin{equation*} (\mathbf{A}_d)_{ij} = \sum_{u \in \Omega} \underbrace{w_d(u)}_{\text{robust weight for pixel } u} \left(\frac{\partial r_u}{\partial x_i}\right)^T \left(\frac{\partial r_u}{\partial x_j}\right) \end{equation*} $$

For each $6 \times 1$ block $j$ of $\mathbf{b}_d$:

$$ \begin{equation*} (\mathbf{b}_d)_{j} = -\sum_{u \in \Omega} \underbrace{w_d(u)}_{\text{robust weight for pixel } u} r_u \left(\frac{\partial r_u}{\partial x_j}\right)^T \end{equation*} $$

Substituting the residual and the Jacobian entries (the factor $w_j$ becomes $0$ whenever $j \notin N(V_u)$):

$$ \begin{equation*} (\mathbf{b}_d)_{j} = -\sum_{u \in \Omega} w_d(u) \left((T_u N_g)^T (T_u V_g - V_c)\right) \frac{w_j \text{ (or } 0)}{\sum_{k \in N(V_u)} w_k} \begin{bmatrix} (T_j N_g) \times (T_u V_g - V_c) + (T_j V_g) \times (T_u N_g) \\ T_u V_g + T_u N_g - V_c \end{bmatrix} \end{equation*} $$

$$ \begin{equation*} (\mathbf{A}_d)_{ij} = \sum_{u \in \Omega} w_d(u) \, \frac{(w_i \text{ or } 0)(w_j \text{ or } 0)}{\left(\sum_{k \in N(V_u)} w_k\right)^2} \underbrace{A_i A_j^T}_{6 \times 6}, \qquad A_i = \begin{bmatrix} (T_i N_g) \times (T_u V_g - V_c) + (T_i V_g) \times (T_u N_g) \\ T_u V_g + T_u N_g - V_c \end{bmatrix} \end{equation*} $$
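A sketch of this block-wise accumulation for one pixel (inputs assumed given; a real implementation would exploit the sparsity of the node neighbourhoods):

```python
import numpy as np

def accumulate_pixel(A_d, b_d, neighbours, A_cols, weights, w_sum, r_u, w_robust):
    """Add one pixel's contribution to the 6x6 blocks of A_d and 6x1 blocks
    of b_d. A_cols[k] is the 6-vector A_k for this pixel, weights[k] the blend
    weight w_k, r_u the residual, w_robust the Tukey weight for the pixel."""
    for i in neighbours:
        ci = (weights[i] / w_sum) * A_cols[i]
        b_d[6*i:6*i+6] += -w_robust * r_u * ci
        for j in neighbours:
            cj = (weights[j] / w_sum) * A_cols[j]
            A_d[6*i:6*i+6, 6*j:6*j+6] += w_robust * np.outer(ci, cj)
```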

Calculating Regularisation Term Jacobian ($J_r$)

Each row of $J_r$ corresponds to the derivative of the summand for one edge $e$ in the regularisation graph $\epsilon$, and column $k$ corresponds to the node $k$ with respect to which the derivative is calculated.

$$ \begin{equation*} \displaystyle (J_r)_{ek} = \frac{\partial (T_iV_g^j - T_jV_g^j)}{\partial x_k} = \begin{cases} \begin{pmatrix} -[T_iV_g^j]_{\times} & I_{3\times3} \end{pmatrix} & \text{if } i = k \\ -\begin{pmatrix} -[T_jV_g^j]_{\times} & I_{3\times3} \end{pmatrix} & \text{if } j = k \\ 0 & \text{otherwise} \end{cases} \qquad \text{for each edge } e = e_{ij} \in \epsilon \end{equation*} $$

Using this Jacobian, we can set up a normal equation corresponding to the regularisation term, similarly to the data term as described in the previous section.
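For completeness, a sketch of how the row block of $J_r$ for a single edge could be formed (dense for clarity; `skew` is the same cross-product matrix helper as before):

```python
import numpy as np

def skew(v):
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def jr_row(i, j, TiVgj, TjVgj, n_nodes):
    """3 x 6N row block of J_r for edge (i, j): G(T_i V_g^j) in the columns of
    node i, -G(T_j V_g^j) in the columns of node j, zero elsewhere."""
    row = np.zeros((3, 6 * n_nodes))
    row[:, 6*i:6*i+6] = np.hstack([-skew(TiVgj), np.eye(3)])
    row[:, 6*j:6*j+6] = -np.hstack([-skew(TjVgj), np.eye(3)])
    return row
```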