python/least_squares.ipynb
This notebook describes a utility function included in the MuJoCo Python library performing box-bounded nonlinear least squares optimization. We provide some theoretical background, describe our implementation and show example usage. <a href="https://colab.research.google.com/github/google-deepmind/mujoco/blob/main/python/least_squares.ipynb"></a>
<!-- Copyright 2022 DeepMind Technologies Limited Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0. Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. --># Install mujoco.
!pip install mujoco
from google.colab import files
import distutils.util
import os
import subprocess
if subprocess.run('nvidia-smi').returncode:
raise RuntimeError(
'Cannot communicate with GPU. '
'Make sure you are using a GPU Colab runtime. '
'Go to the Runtime menu and select Choose runtime type.')
# Add an ICD config so that glvnd can pick up the Nvidia EGL driver.
# This is usually installed as part of an Nvidia driver package, but the Colab
# kernel doesn't install its driver via APT, and as a result the ICD is missing.
# (https://github.com/NVIDIA/libglvnd/blob/master/src/EGL/icd_enumeration.md)
NVIDIA_ICD_CONFIG_PATH = '/usr/share/glvnd/egl_vendor.d/10_nvidia.json'
if not os.path.exists(NVIDIA_ICD_CONFIG_PATH):
with open(NVIDIA_ICD_CONFIG_PATH, 'w') as f:
f.write("""{
"file_format_version" : "1.0.0",
"ICD" : {
"library_path" : "libEGL_nvidia.so.0"
}
}
""")
#
# Configure MuJoCo to use the EGL rendering backend (requires GPU)
print('Setting environment variable to use GPU rendering:')
%env MUJOCO_GL=egl
try:
print('Checking that the installation succeeded:')
import mujoco
from mujoco import minimize
from mujoco import rollout
mujoco.MjModel.from_xml_string('<mujoco/>')
except Exception as e:
raise e from RuntimeError(
'Something went wrong during installation. Check the shell output above '
'for more information.\n'
'If using a hosted Colab runtime, make sure you enable GPU acceleration '
'by going to the Runtime menu and selecting "Choose runtime type".')
print('MuJoCo installation successful.')
print('Installing mediapy:')
!command -v ffmpeg >/dev/null || (apt update && apt install -y ffmpeg)
!pip install -q mediapy
#
from IPython.display import clear_output
clear_output()
# Other imports.
import mediapy as media
import time
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
from typing import Tuple, Optional, Union
We begin with a quick primer. If you know this stuff, feel free to skip this section.
Nonlinear optimization (unconstrained, for now) means finding the value of a vector decision variable $x \in \mathbb R^n$, which minimizes the objective $f(x): \mathbb R^n \rightarrow \mathbb R$, here assumed to be a smooth function. This is often written as $$x^* = \arg \min_x f(x).$$
Second-order optimization, a.k.a Newton's method, is an iterative procedure that finds a sequence of candidates $x_k$ which reduce the value of $f(x_k)$, until convergence to the minimum. At every iteration we measure the local 1st and 2nd derivatives of $f(\cdot)$, called the gradient vector $g = \nabla f(x_k)$ and the Hessian matrix $H = \nabla^2 f(x_k)$. With these we have a local quadratic approximation of $f(x_k)$: $$ f(x_k + \delta x) \approx f(x_k) + \delta x^T g + \frac{1}{2} \delta x^T H\delta x $$ By differentiating with respect to $\delta x$, equating to 0 and solving, we hope to jump to the minimium of the quadratic, giving us the next candidate: $$ x_{k+1} = x_k + \delta x = x_k - H^{-1}g $$
Raw Newton's method, in its form above, is not a good idea. There is no guarantee that the candidates $x_{k}$ will converge or indeed that the objective will be reduced at all. This because
Levenberg-Marquardt (LM) regularization solves both these problems by introducing a regularization parameter $\mu \ge 0$: $$ \delta x_{LM} = - (H + \mu I)^{-1}g $$ For $\mu=0$ we get the classic Newton step. For $\mu$ so large that $\mu I \gg H$, we get $\delta x \approx -\tfrac{1}{\mu}g$: a small gradient step, which is guaranteed to reduce the objective for a large enough $\mu$. Run the cell below for a visualization of LM regularization. For the simple 2D quadratic and the initial guess $$ f(x) = \frac{1}{2}x^T \begin{pmatrix} 2 & 0.9\ 0.9 & 1 \end{pmatrix} x, \qquad x_0 = \begin{pmatrix} -0.9\ 0.5 \end{pmatrix}, $$ We plot the trajectory of $x_1(\mu)$, varying the parameter logarithmically: $\mu = 10^{-4} \ldots 10^2$
#@title LM curve visualization
# Problem setup.
H = np.array([[2, 0.9], [0.9, 1]])
x0 = np.array((-0.9, 0.5))
# Logarithmically spaced mu regularizer values.
lm_values = 10**np.linspace(-4, 2, 40)
# Sequence of next candidates x1(mu).
g0 = H.dot(x0)
dx = np.asarray([-np.linalg.inv(H + mu*np.eye(2)).dot(g0) for mu in lm_values])
x1 = x0 + dx
# Grid of function values, contour plot.
res = 400
X,Y = np.meshgrid(np.linspace(-1, 1, res),
np.linspace(-1, 1, res))
Z = 0.5 * (H[0,0]*X**2 + H[1,1]*Y**2 + 2*H[1,0]*X*Y)
fig = plt.figure(figsize=(10,10))
contours = np.linspace(0, np.sqrt(Z.max()), 20)**2
plt.contour(X, Y, Z, contours, linewidths=0.5)
# Draw x0, actual minimum, x1(mu).
point_spec = {'markersize':10, 'markeredgewidth':2, 'fillstyle':'none',
'marker':'o', 'linestyle':'none'}
curve_spec = {'markersize':1.5, 'linewidth':1, 'color':'red'}
plt.plot(x0[0], x0[1], color='grey', **point_spec)
plt.plot(0, 0, color='pink', **point_spec)
plt.plot(x1[:,0], x1[:,1], '-o', **curve_spec)
# Finalize figure.
plt.title('Levenberg–Marquardt curve: $x_1(\mu),\; \mu = 10^{-4} \ldots 10^2$')
plt.xlabel('x')
plt.ylabel('y')
plt.legend(['$x_0$', 'minimum', 'LM trajectory'])
plt.rc('font', family='serif')
In the plot above we can see the "LM curve", starting at the minimum of the quadratic for small $\mu$ and then, as $\mu$ grows, curving towards $x_0$, approaching it along the gradient. Powell's dog-leg trajectory can be considered a piecewise-linear approximation of the LM curve.
Importantly, note how the points clump at the ends of the curve. The range of $\mu$ values where the curve is "valuable", enabling meaningful search, is rather narrow, especially considering that we are varying it by 6 orders of magnitude. This implies that careful control of $\mu$ is critical for efficiency of convergence.
Additional stabillity and robustness can be achieved by limiting the search space. An important type of constraint is the box constraint: $$ \begin{aligned} x^* &= \arg \min_x f(x)\ \textrm{s.t.} &\quad l \preccurlyeq x \preccurlyeq u \end{aligned} $$ Where $l$ and $u$ are respectively lower and upper bound vectors and the inequalities $l \preccurlyeq x \preccurlyeq u$ are read elementwise. Solving for the minimizing $\delta x$ is now a bit more involved than solving a linear system, and the optimization method is now referred to as Sequential Quadratic Programming (SQP). Fortunately MuJoCo has an efficient box-constrained QP solver: the mju_boxQP function.
Additional simplification is achieved by a structured objective: $f(x) = \frac{1}{2} r(x)^T\cdot r(x)$, which is called the Least Squares objective. The vector function $r(x): \mathbb R^n \rightarrow \mathbb R^m$, is called the residual, and the optimization problem is referred to as Nonlinear Least Squares. For readers with a statistics background, the Least Squares problem lends itself to reinterpertation as Maximum Likelihood Estimation of a Gaussian posterior $x^* = \arg \max_x p(x)$ with $p(x) \sim e^{-\frac{1}{2}r(x)^Tr(x)}$. To see why the least-squares structure is beneficial, consider the gradient and Hessian of the objective. Letting $J$ be the $n \times m$ Jacobian matrix of the residual $J = \nabla r(x)$ we have $$ \begin{align} g &= J^Tr\ H &= J^TJ + \nabla J \cdot r \end{align} $$ The Gauss-Newton approximation of the Hessian $H$ involves dropping the second term: $H \approx H_{GN}= J^TJ$. This approximation has two obvious benefits:
This completes the background section and allows us to accurately describe the function explored in the rest of the notebook.
The Least Squares problem described above can be generalized as follows. Instead of the quadratic norm $\frac{1}{2} r(x)^T\cdot r(x)$, we can use some other smooth, convex norm $f(x) = n(r(x))$. Letting $\nabla n = \frac{\partial n}{\partial r}$ and $\nabla^2 n = \frac{\partial^2 n}{\partial r^2}$ be respectively the gradient and Hessian of $n$ with respect to $r$, we have $$ \begin{align} g &= J^T\cdot\nabla n\ H_{GN} &= J^T\cdot \nabla^2 n \cdot J \end{align} $$ It is easy to verify that for the quadratic norm, these expression reduce to the ones above as $\nabla (\frac{1}{2}r^T\cdot r) = r$ and $\nabla^2 (\frac{1}{2}r^T\cdot r) = I_m$.
This completes the background section and allows us to accurately describe the function explored in the rest of the notebook.
minimize.least_squaresThe minimize.least_squares function in the mujoco Python library solves the Box-Constrained Nonlinear Least Squares problem. It uses the Gauss-Newton Hessian approximation, and takes Levenberg-Marquardt search steps, while adjusting the LM parameter $\mu$.
The minimize.least_squares() function is motivated by two problems:
This motivation led us to the following assumptions and design choices.
Let's look at the function's docstring and then discuss some implementation notes.
print(minimize.least_squares.__doc__)
None or fully specified for all dimensions of $x$.jacobian callback can be supplied by the user and is finite-differenced otherwise. Note that this callback is not made available in the case the user knows the analytic Jacobian (this is very rare in the sysID context), but in case the user wants to implement their own multi-threaded fin-diff callback.eps is scaled by the size of the bounds, if provided.norm different than the quadratic norm (the default), this is covered in more detail below.# @title Plotting utility
def plot_2D(residual, name, plot_range, true_minimum, trace=None, bounds=None):
n = 400
x_range, y_range = plot_range
x_grid = np.linspace(x_range[0], x_range[1], n)
y_grid = np.linspace(y_range[0], y_range[1], n)
X, Y = np.meshgrid(x_grid, y_grid)
R = residual(np.stack((X, Y)))
Z = 0.5 * np.sum(R**2, axis=0)
fig = plt.figure(figsize=(10, 10))
cntr_levels = np.linspace(0, np.log1p(Z.max()), 30)
plt.contour(X, Y, np.log1p(Z), cntr_levels, linewidths=0.5)
plt.title(name)
plt.xlabel('x')
plt.ylabel('y')
plt.rc('font', family='serif')
# Draw bounds.
hbounds = None
if bounds is not None:
lower, upper = bounds
width = upper[0] - lower[0]
height = upper[1] - lower[1]
rect = Rectangle(lower, width, height, edgecolor='blue', facecolor='none',
fill=False, lw=1)
hbounds = fig.axes[0].add_patch(rect)
# Draw global minimum, initial point
point_spec = {'markersize': 10, 'markeredgewidth': 2, 'fillstyle': 'none',
'marker': 'o', 'linestyle': 'none'}
curve_spec = {'marker': 'o', 'markersize': 2, 'linewidth': 1.5,
'color': 'red', 'alpha':0.5}
final_spec = {'marker': 'o', 'markersize': 4, 'color': 'green',
'linestyle': 'none'}
hmin = plt.plot(true_minimum[0], true_minimum[1], color='pink', **point_spec)
def add_curve(t):
x0 = plt.plot(t[0, 0], t[0, 1], color='grey', **point_spec)
traj = plt.plot(t[:, 0], t[:, 1], **curve_spec)
final = plt.plot(t[-1, 0], t[-1, 1], **final_spec)
return x0, traj, final
if trace is not None:
if isinstance(trace, list):
for t in trace:
x0, traj, final = add_curve(t)
else:
x0, traj, final = add_curve(trace)
handles = [x0[0], traj[0], hmin[0], final[0]]
labels = ['$x_0$', 'optimization trace', 'true minimum', 'solution']
if hbounds is not None:
handles.append(hbounds)
labels.append('box bounds')
plt.legend(handles, labels)
plt.show()
The classic Rosenbrock test function can be written in least-squares form:
$f(x,y)=r^Tr,$ with the residual $r(x,y)=\begin{pmatrix} 1-x \ 10\cdot(y-x^2) \end{pmatrix}$.
Note that here in the 2D example section, we use $x, y$ to denote the two coordinates of our decision variable. Let's see minimize.least_squares find the minimum:
# Minimize Rosenbrock function.
def rosenbrock(x):
return np.stack([1 - x[0, :], 10 * (x[1, :] - x[0, :] ** 2)])
x0 = np.array((0.0, 0.0))
x, rb_trace = minimize.least_squares(x0, rosenbrock);
#@title Visualize Rosenbrock solution
plot_range = (np.array((-0.5, 1.5)), np.array((-1.5, 2.)))
minimum = (1, 1)
points = np.asarray([t.candidate for t in rb_trace])
plot_2D(rosenbrock, 'Rosenbrock Function', plot_range, minimum, trace=points)
A somewhat more elaborate example is the Beale function:
$f(x,y)=r^Tr,$ with the residual $r(x,y)=\begin{pmatrix} 1.5-x+xy \ 2.25-x+xy^2 \ 2.625-x+xy^3 \end{pmatrix}$.
#@title Minimize, visualize Beale
def beale(x):
return np.stack((1.5-x[0, :]+x[0, :]*x[1, :],
2.25-x[0, :]+x[0, :]*x[1, :]*x[1, :],
2.625-x[0, :]+x[0, :]*x[1, :]*x[1, :]*x[1, :]))
x0 = np.array((-3.0, -3.0))
x, bl_trace = minimize.least_squares(x0, beale)
# Visualize solution.
plot_range = (np.array((-4., 4.)), np.array((-4., 4.)))
minimum = (3., 0.5)
points = np.asarray([t.candidate for t in bl_trace])
plot_2D(beale, 'Beale Function', plot_range, minimum, trace=points)
Let's see solutions for these problems with box bounds:
#@title Rosenbrock with bounds
# Choose bounds.
lower = np.array([-.3, -1.])
upper = np.array([0.9, 1.9])
bounds = [lower, upper]
# Make some initial points, minimize, save taces.
num_points = 4
px = np.linspace(lower[0]+.1, upper[0]-.1, num_points)
py = np.linspace(lower[1]+.1, upper[1]-.1, num_points)
traces = []
for i in range(num_points):
for j in range(num_points):
x0 = np.array((px[j], py[num_points-i-1]))
_, trace = minimize.least_squares(x0, rosenbrock, bounds, verbose=0)
traces.append(np.asarray([t.candidate for t in trace]))
# Plot.
plot_range = (np.array((-0.5, 1.5)), np.array((-1.5, 2.5)))
minimum = (1, 1)
plot_2D(rosenbrock, 'Rosenbrock Function', plot_range, minimum, traces, bounds)
#@title Beale with bounds
# Choose bounds.
lower = np.array([-2, -1.3])
upper = np.array([1.5, 3.])
bounds = [lower, upper]
# Make some initial points, minimize, save taces.
num_points = 5
p0 = np.linspace(lower[0]+.4, upper[0]-.4, num_points)
p1 = np.linspace(lower[1]+.4, upper[1]-.4, num_points)
traces = []
for i in range(num_points):
for j in range(num_points):
x0 = np.array((p0[j], p1[i]))
x, trace = minimize.least_squares(x0, beale, bounds, verbose=0)
traces.append(np.asarray([t.candidate for t in trace]))
# Plot.
plot_range = (np.array((-3., 3.5)), np.array((-3., 4.)))
minimum = (3., 0.5)
plot_2D(beale, 'Beale Function', plot_range, minimum, traces, bounds)
An $n$-dimensional generalization of the 2D rosenbrock function is $$ f(x) = \sum_{i=1}^{n-1} \left[ 100 \cdot (x_{i+1} - x_{i}^{2})^{2} + \left(1 - x_{i}\right)^{2}\right], $$ which can also be written in Least Squares form and solved by our optimizer:
n = 20
def rosenbrock_n(x):
res0 = [1 - x[i, :] for i in range(n - 1)]
res1 = [10 * (x[i, :] - x[i + 1, :] ** 2) for i in range(n - 1)]
return np.asarray(res0 + res1)
x0 = np.zeros(n)
x, _ = minimize.least_squares(x0, rosenbrock_n);
# Expected solution is a vector of 1's
assert np.linalg.norm(x-1) < 1e-8
Forward kinematics means computing the Cartesian poses of articulated bodies, given joint angles. This computation is easy and well defined. Inverse Kinematics (IK) tries to do the opposite: Given a desired Cartesian pose of an articulated body (usually called the end effector), find the joint angles which put the end effector at this pose. The IK problem can easily be over or under determined:
We adress both of these situations in our implementation below, but first mention features which are currently not implemented, but could be added in the future.
least_squares function (explicit support for non-Cartesian tangent spaces).Below, we copy the MuJoCo Menagerie's model of Franka's Panda, with the following modifications:
#@title Panda XML
panda = '''
<mujoco model="panda scene">
<compiler angle="radian" autolimits="true"/>
<option integrator="implicitfast"/>
<statistic center="0.3 0 0.4" extent="1" meansize=".15"/>
<visual>
<headlight diffuse="0.6 0.6 0.6" ambient="0.3 0.3 0.3" specular="0 0 0"/>
<rgba haze="0.15 0.25 0.35 1"/>
<global azimuth="120" elevation="-20"/>
</visual>
<default>
<geom condim="1"/>
<default class="panda">
<material specular="0.5" shininess="0.25"/>
<joint armature="0.1" damping="1" axis="0 0 1" range="-2.8973 2.8973"/>
<general dyntype="none" biastype="affine" ctrlrange="-2.8973 2.8973" forcerange="-87 87"/>
<default class="finger">
<joint axis="0 1 0" type="slide" range="0 0.04"/>
</default>
<default class="visual">
<geom type="mesh" contype="0" conaffinity="0" group="2"/>
<default class="target">
<geom type="box" size=".015"/>
</default>
</default>
<default class="collision">
<geom type="mesh" group="3"/>
<default class="fingertip_pad_collision_1">
<geom type="box" size="0.0085 0.004 0.0085" pos="0 0.0055 0.0445"/>
</default>
<default class="fingertip_pad_collision_2">
<geom type="box" size="0.003 0.002 0.003" pos="0.0055 0.002 0.05"/>
</default>
<default class="fingertip_pad_collision_3">
<geom type="box" size="0.003 0.002 0.003" pos="-0.0055 0.002 0.05"/>
</default>
<default class="fingertip_pad_collision_4">
<geom type="box" size="0.003 0.002 0.0035" pos="0.0055 0.002 0.0395"/>
</default>
<default class="fingertip_pad_collision_5">
<geom type="box" size="0.003 0.002 0.0035" pos="-0.0055 0.002 0.0395"/>
</default>
</default>
</default>
</default>
<asset>
<texture type="skybox" builtin="gradient" rgb1="0.3 0.5 0.7" rgb2="0 0 0" width="512" height="3072"/>
<texture type="2d" name="groundplane" builtin="checker" mark="edge" rgb1="0.2 0.3 0.4" rgb2="0.1 0.2 0.3"
markrgb="0.8 0.8 0.8" width="300" height="300"/>
<material name="groundplane" texture="groundplane" texuniform="true" texrepeat="5 5" reflectance="0.2"/>
</asset>
<asset>
<material class="panda" name="white" rgba="1 1 1 1"/>
<material class="panda" name="off_white" rgba="0.901961 0.921569 0.929412 1"/>
<material class="panda" name="black" rgba="0.25 0.25 0.25 1"/>
<material class="panda" name="green" rgba="0 1 0 1"/>
<material class="panda" name="light_blue" rgba="0.039216 0.541176 0.780392 1"/>
<!-- Collision meshes -->
<mesh name="link0_c" file="link0.stl"/>
<mesh name="link1_c" file="link1.stl"/>
<mesh name="link2_c" file="link2.stl"/>
<mesh name="link3_c" file="link3.stl"/>
<mesh name="link4_c" file="link4.stl"/>
<mesh name="link5_c0" file="link5_collision_0.obj"/>
<mesh name="link5_c1" file="link5_collision_1.obj"/>
<mesh name="link5_c2" file="link5_collision_2.obj"/>
<mesh name="link6_c" file="link6.stl"/>
<mesh name="link7_c" file="link7.stl"/>
<mesh name="hand_c" file="hand.stl"/>
<!-- Visual meshes -->
<mesh file="link0_0.obj"/>
<mesh file="link0_1.obj"/>
<mesh file="link0_2.obj"/>
<mesh file="link0_3.obj"/>
<mesh file="link0_4.obj"/>
<mesh file="link0_5.obj"/>
<mesh file="link0_7.obj"/>
<mesh file="link0_8.obj"/>
<mesh file="link0_9.obj"/>
<mesh file="link0_10.obj"/>
<mesh file="link0_11.obj"/>
<mesh file="link1.obj"/>
<mesh file="link2.obj"/>
<mesh file="link3_0.obj"/>
<mesh file="link3_1.obj"/>
<mesh file="link3_2.obj"/>
<mesh file="link3_3.obj"/>
<mesh file="link4_0.obj"/>
<mesh file="link4_1.obj"/>
<mesh file="link4_2.obj"/>
<mesh file="link4_3.obj"/>
<mesh file="link5_0.obj"/>
<mesh file="link5_1.obj"/>
<mesh file="link5_2.obj"/>
<mesh file="link6_0.obj"/>
<mesh file="link6_1.obj"/>
<mesh file="link6_2.obj"/>
<mesh file="link6_3.obj"/>
<mesh file="link6_4.obj"/>
<mesh file="link6_5.obj"/>
<mesh file="link6_6.obj"/>
<mesh file="link6_7.obj"/>
<mesh file="link6_8.obj"/>
<mesh file="link6_9.obj"/>
<mesh file="link6_10.obj"/>
<mesh file="link6_11.obj"/>
<mesh file="link6_12.obj"/>
<mesh file="link6_13.obj"/>
<mesh file="link6_14.obj"/>
<mesh file="link6_15.obj"/>
<mesh file="link6_16.obj"/>
<mesh file="link7_0.obj"/>
<mesh file="link7_1.obj"/>
<mesh file="link7_2.obj"/>
<mesh file="link7_3.obj"/>
<mesh file="link7_4.obj"/>
<mesh file="link7_5.obj"/>
<mesh file="link7_6.obj"/>
<mesh file="link7_7.obj"/>
<mesh file="hand_0.obj"/>
<mesh file="hand_1.obj"/>
<mesh file="hand_2.obj"/>
<mesh file="hand_3.obj"/>
<mesh file="hand_4.obj"/>
<mesh file="finger_0.obj"/>
<mesh file="finger_1.obj"/>
</asset>
<worldbody>
<light pos="0 0 1.5" dir="0 0 -1" directional="true"/>
<geom name="floor" size="0 0 0.05" type="plane" material="groundplane"/>
<light name="top" pos="0 0 2" mode="trackcom"/>
<body name="link0" childclass="panda">
<inertial mass="0.629769" pos="-0.041018 -0.00014 0.049974"
fullinertia="0.00315 0.00388 0.004285 8.2904e-7 0.00015 8.2299e-6"/>
<geom mesh="link0_0" material="off_white" class="visual"/>
<geom mesh="link0_1" material="black" class="visual"/>
<geom mesh="link0_2" material="off_white" class="visual"/>
<geom mesh="link0_3" material="black" class="visual"/>
<geom mesh="link0_4" material="off_white" class="visual"/>
<geom mesh="link0_5" material="black" class="visual"/>
<geom mesh="link0_7" material="white" class="visual"/>
<geom mesh="link0_8" material="white" class="visual"/>
<geom mesh="link0_9" material="black" class="visual"/>
<geom mesh="link0_10" material="off_white" class="visual"/>
<geom mesh="link0_11" material="white" class="visual"/>
<geom mesh="link0_c" class="collision"/>
<body name="link1" pos="0 0 0.333">
<inertial mass="4.970684" pos="0.003875 0.002081 -0.04762"
fullinertia="0.70337 0.70661 0.0091170 -0.00013900 0.0067720 0.019169"/>
<joint name="joint1"/>
<geom material="white" mesh="link1" class="visual"/>
<geom mesh="link1_c" class="collision"/>
<body name="link2" quat="1 -1 0 0">
<inertial mass="0.646926" pos="-0.003141 -0.02872 0.003495"
fullinertia="0.0079620 2.8110e-2 2.5995e-2 -3.925e-3 1.0254e-2 7.04e-4"/>
<joint name="joint2" range="-1.7628 1.7628"/>
<geom material="white" mesh="link2" class="visual"/>
<geom mesh="link2_c" class="collision"/>
<body name="link3" pos="0 -0.316 0" quat="1 1 0 0">
<joint name="joint3"/>
<inertial mass="3.228604" pos="2.7518e-2 3.9252e-2 -6.6502e-2"
fullinertia="3.7242e-2 3.6155e-2 1.083e-2 -4.761e-3 -1.1396e-2 -1.2805e-2"/>
<geom mesh="link3_0" material="white" class="visual"/>
<geom mesh="link3_1" material="white" class="visual"/>
<geom mesh="link3_2" material="white" class="visual"/>
<geom mesh="link3_3" material="black" class="visual"/>
<geom mesh="link3_c" class="collision"/>
<body name="link4" pos="0.0825 0 0" quat="1 1 0 0">
<inertial mass="3.587895" pos="-5.317e-2 1.04419e-1 2.7454e-2"
fullinertia="2.5853e-2 1.9552e-2 2.8323e-2 7.796e-3 -1.332e-3 8.641e-3"/>
<joint name="joint4" range="-3.0718 -0.0698"/>
<geom mesh="link4_0" material="white" class="visual"/>
<geom mesh="link4_1" material="white" class="visual"/>
<geom mesh="link4_2" material="black" class="visual"/>
<geom mesh="link4_3" material="white" class="visual"/>
<geom mesh="link4_c" class="collision"/>
<body name="link5" pos="-0.0825 0.384 0" quat="1 -1 0 0">
<inertial mass="1.225946" pos="-1.1953e-2 4.1065e-2 -3.8437e-2"
fullinertia="3.5549e-2 2.9474e-2 8.627e-3 -2.117e-3 -4.037e-3 2.29e-4"/>
<joint name="joint5"/>
<geom mesh="link5_0" material="black" class="visual"/>
<geom mesh="link5_1" material="white" class="visual"/>
<geom mesh="link5_2" material="white" class="visual"/>
<geom mesh="link5_c0" class="collision"/>
<geom mesh="link5_c1" class="collision"/>
<geom mesh="link5_c2" class="collision"/>
<body name="link6" quat="1 1 0 0">
<inertial mass="1.666555" pos="6.0149e-2 -1.4117e-2 -1.0517e-2"
fullinertia="1.964e-3 4.354e-3 5.433e-3 1.09e-4 -1.158e-3 3.41e-4"/>
<joint name="joint6" range="-0.0175 3.7525"/>
<geom mesh="link6_0" material="off_white" class="visual"/>
<geom mesh="link6_1" material="white" class="visual"/>
<geom mesh="link6_2" material="black" class="visual"/>
<geom mesh="link6_3" material="white" class="visual"/>
<geom mesh="link6_4" material="white" class="visual"/>
<geom mesh="link6_5" material="white" class="visual"/>
<geom mesh="link6_6" material="white" class="visual"/>
<geom mesh="link6_7" material="light_blue" class="visual"/>
<geom mesh="link6_8" material="light_blue" class="visual"/>
<geom mesh="link6_9" material="black" class="visual"/>
<geom mesh="link6_10" material="black" class="visual"/>
<geom mesh="link6_11" material="white" class="visual"/>
<geom mesh="link6_12" material="green" class="visual"/>
<geom mesh="link6_13" material="white" class="visual"/>
<geom mesh="link6_14" material="black" class="visual"/>
<geom mesh="link6_15" material="black" class="visual"/>
<geom mesh="link6_16" material="white" class="visual"/>
<geom mesh="link6_c" class="collision"/>
<body name="link7" pos="0.088 0 0" quat="1 1 0 0">
<inertial mass="7.35522e-01" pos="1.0517e-2 -4.252e-3 6.1597e-2"
fullinertia="1.2516e-2 1.0027e-2 4.815e-3 -4.28e-4 -1.196e-3 -7.41e-4"/>
<joint name="joint7"/>
<geom mesh="link7_0" material="white" class="visual"/>
<geom mesh="link7_1" material="black" class="visual"/>
<geom mesh="link7_2" material="black" class="visual"/>
<geom mesh="link7_3" material="black" class="visual"/>
<geom mesh="link7_4" material="black" class="visual"/>
<geom mesh="link7_5" material="black" class="visual"/>
<geom mesh="link7_6" material="black" class="visual"/>
<geom mesh="link7_7" material="white" class="visual"/>
<geom mesh="link7_c" class="collision"/>
<body name="hand" pos="0 0 0.107" quat="0.9238795 0 0 -0.3826834">
<inertial mass="0.73" pos="-0.01 0 0.03" diaginertia="0.001 0.0025 0.0017"/>
<geom mesh="hand_0" material="off_white" class="visual"/>
<geom mesh="hand_1" material="black" class="visual"/>
<geom mesh="hand_2" material="black" class="visual"/>
<geom mesh="hand_3" material="white" class="visual"/>
<geom mesh="hand_4" material="off_white" class="visual"/>
<geom mesh="hand_c" class="collision"/>
<site name="effector" pos="0 0 0.08"/>
</body>
</body>
</body>
</body>
</body>
</body>
</body>
</body>
</body>
<body name="target" pos="0 .6 .4" quat=".5 -.2 1 0" mocap="true">
<site name="target" group="2"/>
<geom class="target" fromto=".07 0 0 .15 0 0" rgba="1 0 0 1"/>
<geom class="target" fromto="0 .07 0 0 .15 0" rgba="0 1 0 1"/>
<geom class="target" fromto="0 0 .07 0 0 .15" rgba="0 0 1 1"/>
</body>
</worldbody>
<actuator>
<general class="panda" name="actuator1" joint="joint1" gainprm="4500" biasprm="0 -4500 -450"/>
<general class="panda" name="actuator2" joint="joint2" gainprm="4500" biasprm="0 -4500 -450"
ctrlrange="-1.7628 1.7628"/>
<general class="panda" name="actuator3" joint="joint3" gainprm="3500" biasprm="0 -3500 -350"/>
<general class="panda" name="actuator4" joint="joint4" gainprm="3500" biasprm="0 -3500 -350"
ctrlrange="-3.0718 -0.0698"/>
<general class="panda" name="actuator5" joint="joint5" gainprm="2000" biasprm="0 -2000 -200" forcerange="-12 12"/>
<general class="panda" name="actuator6" joint="joint6" gainprm="2000" biasprm="0 -2000 -200" forcerange="-12 12"
ctrlrange="-0.0175 3.7525"/>
<general class="panda" name="actuator7" joint="joint7" gainprm="2000" biasprm="0 -2000 -200" forcerange="-12 12"/>
</actuator>
<keyframe>
<key name="home" qpos="0 0 0 -1.57079 0 1.57079 -0.7853" ctrl="0 0 0 -1.57079 0 1.57079 -0.7853"/>
</keyframe>
</mujoco>
'''
#@title Get Panda assets
!git clone https://github.com/google-deepmind/mujoco_menagerie
from os import walk
from os.path import join
assets_path = 'mujoco_menagerie/franka_emika_panda/assets/'
asset_names = next(walk(assets_path), (None, None, []))[2]
assets = {}
for name in asset_names:
with open(join(assets_path, name), 'rb') as f:
assets[name] = f.read()
#@title Load model, render {vertical-output: true}
model = mujoco.MjModel.from_xml_string(panda, assets)
data = mujoco.MjData(model)
# Reset the state to the "home" keyframe.
key = mujoco.mj_name2id(model, mujoco.mjtObj.mjOBJ_KEY, 'home')
mujoco.mj_resetDataKeyframe(model, data, key)
mujoco.mj_forward(model, data)
# If a renderer exists, close it.
if 'renderer' in locals():
renderer.close()
# Make a Renderer and a camera.
renderer = mujoco.Renderer(model, height=360, width=480)
camera = mujoco.MjvCamera()
mujoco.mjv_defaultFreeCamera(model, camera)
camera.distance = 1.7
camera.elevation = -15
camera.azimuth = -130
camera.lookat = (0, 0, .3)
# Visualize site frames and labels
voption = mujoco.MjvOption()
voption.frame = mujoco.mjtFrame.mjFRAME_SITE
voption.label = mujoco.mjtLabel.mjLABEL_SITE
renderer.update_scene(data, camera, voption)
voption.label = mujoco.mjtLabel.mjLABEL_NONE
media.show_image(renderer.render())
Note how the target site frame has boxes along the axes, to make it visually distinct from the end-effector frame.
A common feature of the IK problem is that matching poses requires matching both positions and orientations. However, these residuals have different units. Position differences are in units of length (say, Meters), while orientation differences (log-map of a quaternion difference) are in unitless Radians. The user needs to provide a conversion factor denoting how much they "care" about matching orientation vs. position. Since this factor has units of length we call it radius, which can be thought of as the radius of a sphere around the effector frame on the surface of which angular errors are measured.
Our Panda arm has 7 degrees-of-freedom (dofs), making the IK problem under-determined, since the IK problem has 6 constraints (3 translation + 3 rotation). In order to enforce a single solution, we introduce a fixed regularizer attracting the configuration to some base pose.
We are now in a position to see the problem definition:
# Bounds at the joint limits.
bounds = [model.jnt_range[:, 0], model.jnt_range[:, 1]]
# Inital guess is the 'home' keyframe.
x0 = model.key('home').qpos
def ik(x, pos=None, quat=None, radius=0.04, reg=1e-3, reg_target=None):
"""Residual for inverse kinematics.
Args:
x: joint angles.
pos: target position for the end effector.
quat: target orientation for the end effector.
radius: scaling of the 3D cross.
Returns:
The residual of the Inverse Kinematics task.
"""
# Move the mocap body to the target
id = model.body('target').mocapid
data.mocap_pos[id] = model.body('target').pos if pos is None else pos
data.mocap_quat[id] = model.body('target').quat if quat is None else quat
# Set qpos, compute forward kinematics.
res = []
for i in range(x.shape[1]):
data.qpos = x[:, i]
mujoco.mj_kinematics(model, data)
# Position residual.
res_pos = data.site('effector').xpos - data.site('target').xpos
# Effector quat, use mju_mat2quat.
effector_quat = np.empty(4)
mujoco.mju_mat2Quat(effector_quat, data.site('effector').xmat)
# Target quat, exploit the fact that the site is aligned with the body.
target_quat = data.body('target').xquat
# Orientation residual: quaternion difference.
res_quat = np.empty(3)
mujoco.mju_subQuat(res_quat, target_quat, effector_quat)
res_quat *= radius
# Regularization residual.
reg_target = model.key('home').qpos if reg_target is None else reg_target
res_reg = reg * (x[:, i] - reg_target)
res_i = np.hstack((res_pos, res_quat, res_reg))
res.append(np.atleast_2d(res_i).T)
return np.hstack(res)
The IK problem is special in another way. Unlike the general case, here analytic Jacobians are computable by MuJoCo. Let's write down this Jacobian and then compare solution timing using finite differencing with timing using the analytic Jacobian.
def ik_jac(x, res, pos=None, quat=None, radius=.04, reg=1e-3):
"""Analytic Jacobian of inverse kinematics residual
Args:
x: joint angles.
pos: target position for the end effector.
quat: target orientation for the end effector.
radius: scaling of the 3D cross.
Returns:
The Jacobian of the Inverse Kinematics task.
"""
# least_squares() passes the value of the residual at x which is sometimes
# useful, but we don't need it here.
del res
# Call mj_kinematics and mj_comPos (required for Jacobians).
mujoco.mj_kinematics(model, data)
mujoco.mj_comPos(model, data)
# Get end-effector site Jacobian.
jac_pos = np.empty((3, model.nv))
jac_quat = np.empty((3, model.nv))
mujoco.mj_jacSite(model, data, jac_pos, jac_quat, data.site('effector').id)
# Get Deffector, the 3x3 mju_subquat Jacobian
effector_quat = np.empty(4)
mujoco.mju_mat2Quat(effector_quat, data.site('effector').xmat)
target_quat = data.body('target').xquat
Deffector = np.empty((3, 3))
mujoco.mjd_subQuat(target_quat, effector_quat, None, Deffector)
# Rotate into target frame, multiply by subQuat Jacobian, scale by radius.
target_mat = data.site('target').xmat.reshape(3, 3)
mat = radius * Deffector.T @ target_mat.T
jac_quat = mat @ jac_quat
# Regularization Jacobian.
jac_reg = reg * np.eye(model.nv)
return np.vstack((jac_pos, jac_quat, jac_reg))
Let's compare the performance of the function with finite-differenced and analytic Jacobians:
print('Finite-differenced Jacobian:')
x_fd, _ = minimize.least_squares(x0, ik, bounds, verbose=1);
print('Analytic Jacobian:')
x_analytic, _ = minimize.least_squares(x0, ik, bounds, jacobian=ik_jac,
verbose=1, check_derivatives=True);
# Assert that we got a nearly identical solution
assert np.linalg.norm(x_fd - x_analytic) < 1e-5
Nice speed-up! This will become more pronounced the harder the specific IK problem. We'll do a more comprehensive timing comparison a few cells down (this specific configuration happens to be solved rather slowly).
Note that we passed check_derivatives=True to ask the function to verify that our analytic Jacobian is correct, by making a comparison to the internal finite-difference function at the first timestep.
Let's see what the solution to our problem looks like.
#@title Basic solution {vertical-output: true}
x, _ = minimize.least_squares(x0, ik, bounds, jacobian=ik_jac,
verbose=0);
# Update and visualize
data.qpos = x
mujoco.mj_kinematics(model, data)
mujoco.mj_camlight(model, data)
camera.distance = 1
camera.lookat = data.site('effector').xpos
renderer.update_scene(data, camera, voption)
media.show_image(renderer.render())
In order to get a better intuition for the effects of the radius argument, let's give the IK solver a more difficult target pose, and vary the radius:
#@title Small `radius` {vertical-output: true}
pos = np.array((.2, .8, .5))
quat = np.array((1, 1, 0, 0))
radius = 0.04
ik_target = lambda x: ik(x, pos=pos, quat=quat, radius=radius)
jac_target = lambda x, r: ik_jac(x, r, pos=pos, quat=quat,
radius=radius)
x, _ = minimize.least_squares(x0, ik_target, bounds,
jacobian=jac_target,
verbose=0);
# Update and visualize
data.qpos = x
mujoco.mj_kinematics(model, data)
mujoco.mj_camlight(model, data)
camera.distance = 1
camera.lookat = data.site('effector').xpos
camera.azimuth = -150
camera.elevation = -20
renderer.update_scene(data, camera, voption)
media.show_image(renderer.render())
We can see that the positions match well, but the orientations are very wrong.
#@title Large `radius` {vertical-output: true}
pos = np.array((.2, .8, .5))
quat = np.array((1, 1, 0, 0))
radius = 0.5
ik_target = lambda x: ik(x, pos=pos, quat=quat, radius=radius)
jac_target = lambda x, r: ik_jac(x, r, pos=pos, quat=quat,
radius=radius)
x, _ = minimize.least_squares(x0, ik_target, bounds,
jacobian=jac_target,
verbose=0);
# Update and visualize
data.qpos = x
mujoco.mj_kinematics(model, data)
mujoco.mj_camlight(model, data)
camera.distance = 1
camera.lookat = data.site('effector').xpos
camera.azimuth = -150
camera.elevation = -20
renderer.update_scene(data, camera, voption)
media.show_image(renderer.render())
Now the orientations are a better match, but the position error is much larger.
It is often the case that one wants to smoothly track a given end-effector trajectory. Let's invent some smooth traget trajectory:
# @title Continuous target trajectory {vertical-output: true}
def pose(time):
pos = (0.4 * np.sin(time),
0.4 * np.cos(time),
0.4 + 0.2 * np.sin(3 * time))
quat = np.array((1.0, np.sin(2 * time), np.sin(time), 0))
quat /= np.linalg.norm(quat)
return pos, quat
# Number of frames
n_frame = 400
# Reset the camera, make the arm point straight up.
camera.distance = 1.5
camera.elevation = -15
camera.azimuth = -130
camera.lookat = (0, 0, 0.3)
mujoco.mj_resetData(model, data)
frames = []
for t in np.linspace(0, 2 * np.pi, n_frame):
# Move the target
pos, quat = pose(t)
id = model.body('target').mocapid
data.mocap_pos[id] = pos
data.mocap_quat[id] = quat
mujoco.mj_kinematics(model, data)
mujoco.mj_camlight(model, data)
renderer.update_scene(data, camera, voption)
frames.append(renderer.render())
media.show_video(frames)
Let's see what the arm motion looks like if we solve the IK problem independently for each frame, always starting at our x0 initial guess. While we're doing this, we'll also time our solution, with and without analytic derivatives, to get a better sense of how long the optimization takes.
#@title Full IK for every frame {vertical-output: true}
frames = []
time_fd = []
time_analytic = []
for t in np.linspace(0, 2 * np.pi, n_frame):
# Get target pose
pos, quat = pose(t)
# Define IK problem
ik_target = lambda x: ik(x, pos=pos, quat=quat)
jac_target = lambda x, r: ik_jac(x, r, pos=pos, quat=quat)
# Solve while timing, fin-diff Jacobian
t_start = time.time()
x, _ = minimize.least_squares(x0, ik_target, bounds,
verbose=0);
time_fd.append(1000*(time.time() - t_start))
# Solve while timing, analytic Jacobian
t_start = time.time()
x, _ = minimize.least_squares(x0, ik_target, bounds,
jacobian=jac_target,
verbose=0);
time_analytic.append(1000*(time.time() - t_start))
mujoco.mj_kinematics(model, data)
mujoco.mj_camlight(model, data)
renderer.update_scene(data, camera, voption)
frames.append(renderer.render())
media.show_video(frames, loop=False)
mean_analytic = np.asarray(time_analytic).mean()
mean_fd = np.asarray(time_fd).mean()
prfx = 'Mean solution time using '
print(prfx + f'fin-diff Jacobian: {mean_fd:3.1f}ms')
print(prfx + f'analytic Jacobian: {mean_analytic:3.1f}ms')
Yikes! Solutions are being found, but what's with all the "glitches"? The IK problem often has multiple local minima and since we are solving from scratch each time, we end up in a different one, somewhat arbitrarily.
A very easy way to mitigate this is to initialize ("warmstart") the solver with the previous solution:
#@title Warmstart with previous solution {vertical-output: true}
frames = []
x = x0
for t in np.linspace(0, 2 * np.pi, n_frame):
# Get target pose
pos, quat = pose(t)
# Define IK problem
ik_target = lambda x: ik(x, pos=pos, quat=quat)
jac_target = lambda x, r: ik_jac(x, r, pos=pos, quat=quat)
x, _ = minimize.least_squares(x, ik_target, bounds,
jacobian=jac_target,
verbose=0);
mujoco.mj_kinematics(model, data)
mujoco.mj_camlight(model, data)
renderer.update_scene(data, camera, voption)
frames.append(renderer.render())
media.show_video(frames, loop=False)
This looks much better, but there are still a few "glitches". An esy way to mitigate those is to modify the regularization target to also be the previous solution. We are effectively telling the solver not to stray too far.
#@title Warmstart with and regularize to previous solution {vertical-output: true}
frames = []
x = x0
for t in np.linspace(0, 2 * np.pi, n_frame):
# Get target pose
pos, quat = pose(t)
x_prev = x.copy()
# Define IK problem
ik_target = lambda x: ik(x, pos=pos, quat=quat,
reg_target=x_prev, reg=.1)
jac_target = lambda x, r: ik_jac(x, r, pos=pos, quat=quat)
x, _ = minimize.least_squares(x, ik_target, bounds,
jacobian=jac_target,
verbose=0);
mujoco.mj_kinematics(model, data)
mujoco.mj_camlight(model, data)
renderer.update_scene(data, camera, voption)
frames.append(renderer.render())
media.show_video(frames, loop=False)
We'll now use least_squares to solve a trajectory optimization (control) problem. While this is not its intended purpose, it demonstrates the power and general usability of the function.
After using regular Least Squares, we'll define a custom non-quadratic norm and solve again.
Below, we copy MuJoCo's standard humanoid model, with the following modifications:
#@title Humanoid XML
xml = """
<mujoco model="Humanoid">
<option timestep="0.005"/>
<visual>
<map force="0.1" zfar="30"/>
<rgba haze="0.15 0.25 0.35 1"/>
<global offwidth="2560" offheight="1440" elevation="-20" azimuth="120"/>
</visual>
<statistic center="0 0 0.7"/>
<asset>
<texture type="skybox" builtin="gradient" rgb1=".3 .5 .7" rgb2="0 0 0" width="32" height="512"/>
<texture name="body" type="cube" builtin="flat" mark="cross" width="128" height="128" rgb1="0.8 0.6 0.4" rgb2="0.8 0.6 0.4" markrgb="1 1 1" random="0.01"/>
<material name="body" texture="body" texuniform="true" rgba="0.8 0.6 .4 1"/>
<texture name="grid" type="2d" builtin="checker" width="512" height="512" rgb1=".1 .2 .3" rgb2=".2 .3 .4"/>
<material name="grid" texture="grid" texrepeat="1 1" texuniform="true" reflectance=".2"/>
</asset>
<default>
<position inheritrange="0.95"/>
<default class="body">
<!-- geoms -->
<geom type="capsule" condim="1" friction=".7" solimp="0.9 .99 .003" solref=".015 1" material="body" group="1"/>
<default class="thigh">
<geom size=".06"/>
</default>
<default class="shin">
<geom fromto="0 0 0 0 0 -.3" size=".049"/>
</default>
<default class="foot">
<geom size=".027"/>
<default class="foot1">
<geom fromto="-.07 -.01 0 .14 -.03 0"/>
</default>
<default class="foot2">
<geom fromto="-.07 .01 0 .14 .03 0"/>
</default>
</default>
<default class="arm_upper">
<geom size=".04"/>
</default>
<default class="arm_lower">
<geom size=".031"/>
</default>
<default class="hand">
<geom type="sphere" size=".04"/>
</default>
<!-- joints -->
<joint type="hinge" damping=".2" stiffness="1" armature=".01" limited="true" solimplimit="0 .99 .01"/>
<default class="joint_big">
<joint damping="5" stiffness="10"/>
<default class="hip_x">
<joint range="-30 10"/>
</default>
<default class="hip_z">
<joint range="-60 35"/>
</default>
<default class="hip_y">
<joint axis="0 1 0" range="-150 20"/>
</default>
<default class="joint_big_stiff">
<joint stiffness="20"/>
</default>
</default>
<default class="knee">
<joint pos="0 0 .02" axis="0 -1 0" range="-160 2"/>
</default>
<default class="ankle">
<joint range="-50 50"/>
<default class="ankle_y">
<joint pos="0 0 .08" axis="0 1 0" stiffness="6"/>
</default>
<default class="ankle_x">
<joint pos="0 0 .04" stiffness="3"/>
</default>
</default>
<default class="shoulder">
<joint range="-85 60"/>
</default>
<default class="elbow">
<joint range="-100 50" stiffness="0"/>
</default>
</default>
</default>
<worldbody>
<body name="target" pos=".2 -.2 1" mocap="true">
<site name="target" size=".05" rgba="1 0 1 .4"/>
</body>
<geom name="floor" size="0 0 .05" type="plane" material="grid" condim="3"/>
<light name="spotlight" mode="targetbodycom" target="torso" diffuse=".8 .8 .8" specular="0.3 0.3 0.3" pos="0 -6 4" cutoff="30"/>
<body name="torso" pos="0 0 1.282" childclass="body">
<light name="top" pos="0 0 2" mode="trackcom"/>
<camera name="back" pos="-3 0 1" xyaxes="0 -1 0 1 0 2" mode="trackcom"/>
<camera name="side" pos="0 -3 1" xyaxes="1 0 0 0 1 2" mode="trackcom"/>
<freejoint name="root"/>
<geom name="torso" fromto="0 -.07 0 0 .07 0" size=".07"/>
<geom name="waist_upper" fromto="-.01 -.06 -.12 -.01 .06 -.12" size=".06"/>
<body name="head" pos="0 0 .19">
<geom name="head" type="sphere" size=".09"/>
<camera name="egocentric" pos=".09 0 0" xyaxes="0 -1 0 .1 0 1" fovy="80"/>
</body>
<body name="waist_lower" pos="-.01 0 -.26">
<geom name="waist_lower" fromto="0 -.06 0 0 .06 0" size=".06"/>
<joint name="abdomen_z" pos="0 0 .065" axis="0 0 1" range="-45 45" class="joint_big_stiff"/>
<joint name="abdomen_y" pos="0 0 .065" axis="0 1 0" range="-75 30" class="joint_big"/>
<body name="pelvis" pos="0 0 -.165">
<joint name="abdomen_x" pos="0 0 .1" axis="1 0 0" range="-35 35" class="joint_big"/>
<geom name="butt" fromto="-.02 -.07 0 -.02 .07 0" size=".09"/>
<body name="thigh_right" pos="0 -.1 -.04">
<joint name="hip_x_right" axis="1 0 0" class="hip_x"/>
<joint name="hip_z_right" axis="0 0 1" class="hip_z"/>
<joint name="hip_y_right" class="hip_y"/>
<geom name="thigh_right" fromto="0 0 0 0 .01 -.34" class="thigh"/>
<body name="shin_right" pos="0 .01 -.4">
<joint name="knee_right" class="knee"/>
<geom name="shin_right" class="shin"/>
<body name="foot_right" pos="0 0 -.39">
<joint name="ankle_y_right" class="ankle_y"/>
<joint name="ankle_x_right" class="ankle_x" axis="1 0 .5"/>
<geom name="foot1_right" class="foot1"/>
<geom name="foot2_right" class="foot2"/>
</body>
</body>
</body>
<body name="thigh_left" pos="0 .1 -.04">
<joint name="hip_x_left" axis="-1 0 0" class="hip_x"/>
<joint name="hip_z_left" axis="0 0 -1" class="hip_z"/>
<joint name="hip_y_left" class="hip_y"/>
<geom name="thigh_left" fromto="0 0 0 0 -.01 -.34" class="thigh"/>
<body name="shin_left" pos="0 -.01 -.4">
<joint name="knee_left" class="knee"/>
<geom name="shin_left" fromto="0 0 0 0 0 -.3" class="shin"/>
<body name="foot_left" pos="0 0 -.39">
<joint name="ankle_y_left" class="ankle_y"/>
<joint name="ankle_x_left" class="ankle_x" axis="-1 0 -.5"/>
<geom name="foot1_left" class="foot1"/>
<geom name="foot2_left" class="foot2"/>
</body>
</body>
</body>
</body>
</body>
<body name="upper_arm_right" pos="0 -.17 .06">
<joint name="shoulder1_right" axis="2 1 1" class="shoulder"/>
<joint name="shoulder2_right" axis="0 -1 1" class="shoulder"/>
<geom name="upper_arm_right" fromto="0 0 0 .16 -.16 -.16" class="arm_upper"/>
<body name="lower_arm_right" pos=".18 -.18 -.18">
<joint name="elbow_right" axis="0 -1 1" class="elbow"/>
<geom name="lower_arm_right" fromto=".01 .01 .01 .17 .17 .17" class="arm_lower"/>
<body name="hand_right" pos=".18 .18 .18">
<geom name="hand_right" zaxis="1 1 1" class="hand" rgba="1 0 1 1"/>
</body>
</body>
</body>
<body name="upper_arm_left" pos="0 .17 .06">
<joint name="shoulder1_left" axis="-2 1 -1" class="shoulder"/>
<joint name="shoulder2_left" axis="0 -1 -1" class="shoulder"/>
<geom name="upper_arm_left" fromto="0 0 0 .16 .16 -.16" class="arm_upper"/>
<body name="lower_arm_left" pos=".18 .18 -.18">
<joint name="elbow_left" axis="0 -1 -1" class="elbow"/>
<geom name="lower_arm_left" fromto=".01 -.01 .01 .17 -.17 .17" class="arm_lower"/>
<body name="hand_left" pos=".18 -.18 .18">
<geom name="hand_left" zaxis="1 -1 1" class="hand"/>
</body>
</body>
</body>
</body>
</worldbody>
<contact>
<exclude body1="waist_lower" body2="thigh_right"/>
<exclude body1="waist_lower" body2="thigh_left"/>
</contact>
<tendon>
<fixed name="hamstring_right" limited="true" range="-0.3 2">
<joint joint="hip_y_right" coef=".5"/>
<joint joint="knee_right" coef="-.5"/>
</fixed>
<fixed name="hamstring_left" limited="true" range="-0.3 2">
<joint joint="hip_y_left" coef=".5"/>
<joint joint="knee_left" coef="-.5"/>
</fixed>
</tendon>
<actuator>
<position name="abdomen_z" kp="40" joint="abdomen_z"/>
<position name="abdomen_y" kp="40" joint="abdomen_y"/>
<position name="abdomen_x" kp="40" joint="abdomen_x"/>
<position name="hip_x_right" kp="40" joint="hip_x_right"/>
<position name="hip_z_right" kp="40" joint="hip_z_right"/>
<position name="hip_y_right" kp="120" joint="hip_y_right"/>
<position name="knee_right" kp="80" joint="knee_right"/>
<position name="ankle_y_right" kp="20" joint="ankle_y_right"/>
<position name="ankle_x_right" kp="20" joint="ankle_x_right"/>
<position name="hip_x_left" kp="40" joint="hip_x_left"/>
<position name="hip_z_left" kp="40" joint="hip_z_left"/>
<position name="hip_y_left" kp="120" joint="hip_y_left"/>
<position name="knee_left" kp="80" joint="knee_left"/>
<position name="ankle_y_left" kp="20" joint="ankle_y_left"/>
<position name="ankle_x_left" kp="20" joint="ankle_x_left"/>
<position name="shoulder1_right" kp="20" joint="shoulder1_right"/>
<position name="shoulder2_right" kp="20" joint="shoulder2_right"/>
<position name="elbow_right" kp="40" joint="elbow_right"/>
<position name="shoulder1_left" kp="20" joint="shoulder1_left"/>
<position name="shoulder2_left" kp="20" joint="shoulder2_left"/>
<position name="elbow_left" kp="40" joint="elbow_left"/>
</actuator>
<sensor>
<framepos objtype="geom" objname="hand_right" reftype="xbody" refname="target"/>
<actuatorfrc actuator="abdomen_z"/>
<actuatorfrc actuator="abdomen_y"/>
<actuatorfrc actuator="abdomen_x"/>
<actuatorfrc actuator="hip_x_right"/>
<actuatorfrc actuator="hip_z_right"/>
<actuatorfrc actuator="hip_y_right"/>
<actuatorfrc actuator="knee_right"/>
<actuatorfrc actuator="ankle_y_right"/>
<actuatorfrc actuator="ankle_x_right"/>
<actuatorfrc actuator="hip_x_left"/>
<actuatorfrc actuator="hip_z_left"/>
<actuatorfrc actuator="hip_y_left"/>
<actuatorfrc actuator="knee_left"/>
<actuatorfrc actuator="ankle_y_left"/>
<actuatorfrc actuator="ankle_x_left"/>
<actuatorfrc actuator="shoulder1_right"/>
<actuatorfrc actuator="shoulder2_right"/>
<actuatorfrc actuator="elbow_right"/>
<actuatorfrc actuator="shoulder1_left"/>
<actuatorfrc actuator="shoulder2_left"/>
<actuatorfrc actuator="elbow_left"/>
</sensor>
<keyframe>
<!--
The values below are split into rows for readibility:
torso position
torso orientation
spinal
right leg
left leg
arms
-->
<key name="squat"
qpos="0 0 0.596
0.988015 0 0.154359 0
0 0.4 0
-0.25 -0.5 -2.5 -2.65 -0.8 0.56
-0.25 -0.5 -2.5 -2.65 -0.8 0.56
0 0 0 0 0 0"/>
<key name="stand_on_left_leg"
qpos="0 0 1.21948
0.971588 -0.179973 0.135318 -0.0729076
-0.0516 -0.202 0.23
-0.24 -0.007 -0.34 -1.76 -0.466 -0.0415
-0.08 -0.01 -0.37 -0.685 -0.35 -0.09
0.109 -0.067 -0.7 -0.05 0.12 0.16"/>
<key name="prone"
qpos="0.4 0 0.0757706
0.7325 0 0.680767 0
0 0.0729 0
0.0077 0.0019 -0.026 -0.351 -0.27 0
0.0077 0.0019 -0.026 -0.351 -0.27 0
0.56 -0.62 -1.752
0.56 -0.62 -1.752"/>
<key name="supine"
qpos="-0.4 0 0.08122
0.722788 0 -0.69107 0
0 -0.25 0
0.0182 0.0142 0.3 0.042 -0.44 -0.02
0.0182 0.0142 0.3 0.042 -0.44 -0.02
0.186 -0.73 -1.73
0.186 -0.73 -1.73"/>
</keyframe>
</mujoco>
"""
Let's load the model and render the initial state for our control problem, chosen to be the "squat" keyframe.
# Load model, make data
model = mujoco.MjModel.from_xml_string(xml)
data = mujoco.MjData(model)
# Set the state to the "squat" keyframe, call mj_forward.
key = model.key('squat').id
mujoco.mj_resetDataKeyframe(model, data, key)
mujoco.mj_forward(model, data)
# If a renderer exists, close it.
if 'renderer' in locals():
renderer.close()
# Make a Renderer and a camera.
renderer = mujoco.Renderer(model, height=480, width=640)
camera = mujoco.MjvCamera()
mujoco.mjv_defaultFreeCamera(model, camera)
camera.distance = 3
camera.elevation = -10
# Point the camera at the humanoid, render.
camera.lookat = data.body('torso').subtree_com
renderer.update_scene(data, camera)
media.show_image(renderer.render())
We define the following optimal control problem defintion:
Our residual uses mujoco.rollout to evaluate parallel trajectories:
def reach(ctrl0T, target, T, torque_scale, traj=None):
"""Residual for target-reaching task.
Args:
ctrl0T: contatenation of the first and last control vectors.
target: target to which the right hand should reach.
T: final time for the rollout.
torque_scale: coefficient by which to scale the torques.
traj: optional list of positions to be recorded.
Returns:
The residual of the target-reaching task.
"""
# Extract the initial and final ctrl vectors, transpose to row vectors
ctrl0 = ctrl0T[:model.nu, :].T
ctrlT = ctrl0T[model.nu:, :].T
# Move the mocap body to the target
mocapid = model.body('target').mocapid
data.mocap_pos[mocapid] = target
# Append the mocap targets to the controls
nbatch = ctrl0.shape[0]
mocap = np.tile(data.mocap_pos[mocapid], (nbatch, 1))
ctrl0 = np.hstack((ctrl0, mocap))
ctrlT = np.hstack((ctrlT, mocap))
# Define control spec (ctrl + mocap_pos)
mjtState = mujoco.mjtState
control_spec = mjtState.mjSTATE_CTRL | mjtState.mjSTATE_MOCAP_POS
# Interpolate and stack the control sequences
nstep = int(np.round(T / model.opt.timestep))
control = np.stack(np.linspace(ctrl0, ctrlT, nstep), axis=1)
# Reset to the "squat" keyframe, get the initial state
key = model.key('squat').id
mujoco.mj_resetDataKeyframe(model, data, key)
spec = mjtState.mjSTATE_FULLPHYSICS
nstate = mujoco.mj_stateSize(model, spec)
state = np.empty(nstate)
mujoco.mj_getState(model, data, state, spec)
# Perform rollouts (sensors.shape == nbatch, nstep, nsensordata)
states, sensors = rollout.rollout(model, data, state, control,
control_spec=control_spec)
# If requested, extract qpos into traj
if traj is not None:
assert states.shape[0] == 1
# Skip the first element in state (mjData.time)
traj.extend(np.split(states[0, :, 1:model.nq+1], nstep))
# Scale torque sensors
sensors[:, :, 3:] *= torque_scale
# Reshape to stack the sensor values, transpose to column vectors
sensors = sensors.reshape((sensors.shape[0], -1)).T
# The normalizer keeps objective values similar when changing T or timestep.
normalizer = 100 * model.opt.timestep / T
return normalizer * sensors
Now let's give the rest of the problem definition:
mjData.ctrl, the bounds are two concatenated copies of mjData.actuator_ctrlrange.T = 0.7 # Rollout length (seconds)
torque_scale = 0.003 # Scaling for the torques
# Bounds are the stacked control bounds.
lower = np.atleast_2d(model.actuator_ctrlrange[:,0]).T
upper = np.atleast_2d(model.actuator_ctrlrange[:,1]).T
bounds = [np.vstack((lower, lower)), np.vstack((upper, upper))]
# Initial guess is midpoint of the bounds
x0 = 0.5 * (bounds[1] + bounds[0])
Let's define a utility function for rendering frames and visualize the initial guess:
def render_solution(x, target):
# Ask reach to save positions to traj.
traj = []
reach(x, target, T, torque_scale, traj=traj);
frames = []
counter = 0
print('Rendering frames:', flush=True, end='')
for qpos in traj:
# Set positions, call mj_forward to update kinematics.
data.qpos = qpos
mujoco.mj_forward(model, data)
# Render and save frames.
camera.lookat = data.body('torso').subtree_com
renderer.update_scene(data, camera)
pixels = renderer.render()
frames.append(pixels)
counter += 1
if counter % 10 == 0:
print(f' {counter}', flush=True, end='')
return frames
# Use default target.
target = data.mocap_pos[model.body('target').mocapid]
# Visualize the initial guess.
media.show_video(render_solution(x0, target))
Let's solve once for some target and look at the optimization printout:
target = (.4, -.3, 1.2)
reach_target = lambda x: reach(x, target, T, torque_scale, traj=None)
r0 = reach_target(x0)
print(f'The decision variable x has size {x0.size}')
print(f'The residual r(x) has size {r0.size}\n')
x, _ = minimize.least_squares(x0, reach_target, bounds);
Let's see what this solution looks like:
media.show_video(render_solution(x, target))
Let's rerun this for several target values and make a video of all of them:
targets = [(0.4, 0., 0.), (0.2, -1., 0.5), (-1., -.3, 1.), (0., -.2, 2.2)]
frames = []
for target in targets:
res_target = lambda x: reach(x, target, T, torque_scale)
print(f'Optimizing for target at {target}', flush=True)
x, trace = minimize.least_squares(x0, res_target, bounds,
verbose=minimize.Verbosity.FINAL)
frames += render_solution(x, target)
print('\n')
print('Making video', flush=True)
media.show_video(frames)
As explained in the background section, Least Squares can be generalized to norms other than the quadratic. We are now in a position to show how to define a non-quadratic norm, which is important in the estimation and system-identification contexts, where long-tailed, disturbance-rejecting distributions are proportional to the exponent of a non-quadratic function.
Let's say that we wish the task residual i.e., the vector from the hand to the target, to be evaluated with the "Smooth L2" function $c(r)$ which, for a given smoothing radius $d \gt 0$ is $$ c(r) = \sqrt{r^T\cdot r + d^2 } - d $$ This function is quadratic in a $d$-sized neighborhood of the origin, and then grows linearly thereafter, like the L2 norm. The first and second derivatives are $$ \begin{align} s&=\sqrt{r^T\cdot r + d^2 }\ g &= \tfrac{\partial c}{\partial r} = \frac{r}{s} \ H &=\tfrac{\partial^2 c}{\partial r^2} = \frac{I_{n_r} - g\cdot g^T}{s} \end{align} $$ There is no particularly good reason to use this norm for this optimization task, it is merely an example.
Let's read the documentation of the minimize.Norm class:
print(minimize.Norm.__doc__)
Our sensors are 3 r_pos residual values for the hand-to-object vector followed by 21 r_torque actuator torques, for a total of ns = 24 sensors. These are concatenated for the entire trajectory, leading to a residual of size 24*N, where N is the number of timesteps in a trajectory. After reshaping and slicing appropriately, the norm implementation looks as follows:
class SmoothL2(minimize.Norm):
def __init__(self):
self.n = model.nsensordata # Equals 24.
self.d = 0.1 # The smoothing radius (length).
def value(self, r):
rr = r.reshape((self.n, -1), order='F')
r_pos = rr[:3, :]
s = np.sqrt(np.sum(r_pos**2, axis=0) + self.d**2)
y_pos = (s - self.d).sum()
r_torque = rr[3:, :]
y_torque = 0.5 * (r_torque.T**2).sum()
return y_pos + y_torque
def grad_hess(self, r, proj):
rr = r.reshape((self.n, -1), order='F')
r_pos = rr[:3, :]
s = np.sqrt(np.sum(r_pos**2, axis=0) + self.d**2)
g_pos = r_pos / s
g_torque = rr[3:, :]
g = np.vstack((g_pos, g_torque))
grad = proj.T @ g.reshape((-1, 1), order='F')
h_proj = proj.copy() # norm Hessian * projection matrix
for i in range(g_pos.shape[1]):
h_i = (np.eye(3) - g_pos[:, i:i+1] @ g_pos[:, i:i+1].T) / s[i]
j = self.n*i
h_proj[j:j+3, :] = h_i @ proj[j:j+3, :]
hess = proj.T @ h_proj
return grad, hess
Before running the optimization, let's ask least_squares to check our norm implementation. We'll do this with a short trajectory simulation time T, to avoid creating huge matrices.
target = (.4, -.3, 1.2)
T_short = 0.02
reach_target = lambda x: reach(x, target, T=T_short,
torque_scale=torque_scale, traj=None)
x, _ = minimize.least_squares(x0, reach_target, bounds, norm=SmoothL2(),
max_iter=1, check_derivatives=True);
Now that we are confident in our implemetation, we can see what the solution looks like:
target = (.4, -.3, 1.2)
reach_target = lambda x: reach(x, target, T, torque_scale, traj=None)
x, _ = minimize.least_squares(x0, reach_target, bounds, norm=SmoothL2());
media.show_video(render_solution(x, target))