ACT is a lightweight and efficient policy for imitation learning, especially well-suited for fine-grained manipulation tasks. It's the first model we recommend when you're starting out with LeRobot due to its fast training time, low computational requirements, and strong performance.
<div class="video-container">
  <iframe width="100%" height="415" src="https://www.youtube.com/embed/ft73x0LfGpM" title="LeRobot ACT Tutorial" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
</div>

Watch this tutorial from the LeRobot team to learn how ACT works: [LeRobot ACT Tutorial](https://www.youtube.com/watch?v=ft73x0LfGpM)
Action Chunking with Transformers (ACT) was introduced in the paper [Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware](https://arxiv.org/abs/2304.13705) by Zhao et al. The policy was designed to enable precise, contact-rich manipulation using affordable hardware and minimal demonstration data.
ACT stands out as an excellent starting point for several reasons:
ACT uses a transformer-based architecture with three main components:
The policy takes as input:
- A latent style variable `z` (learned during training, set to zero during inference)

It outputs a chunk of `k` future actions.
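The chunked output described above can be illustrated with a minimal sketch. This is not LeRobot's implementation: `ChunkedExecutor`, `make_dummy_policy`, and the `policy_fn(observation, z)` signature are hypothetical names introduced for illustration, and the latent size of 32 is an arbitrary choice. The sketch only shows the idea that one policy query yields `k` actions, which are then executed one per control step, with `z` fixed to zero at inference.

```python
from collections import deque
from typing import Callable, Deque, List

Action = List[float]


def zero_latent(dim: int) -> List[float]:
    """At inference time z is not sampled; it is simply set to zero."""
    return [0.0] * dim


def make_dummy_policy(chunk_size: int) -> Callable[[List[float], List[float]], List[Action]]:
    """Build a toy stand-in for the ACT network (hypothetical, for illustration).

    The returned function maps (observation, z) to `chunk_size` actions,
    where action i is the observation shifted by i + 1.
    """

    def policy_fn(observation: List[float], z: List[float]) -> List[Action]:
        return [[x + (i + 1) for x in observation] for i in range(chunk_size)]

    return policy_fn


class ChunkedExecutor:
    """Executes a chunk-predicting policy one action per control step."""

    def __init__(self, policy_fn, chunk_size: int, latent_dim: int) -> None:
        self.policy_fn = policy_fn
        self.chunk_size = chunk_size
        self.latent_dim = latent_dim
        self.queue: Deque[Action] = deque()
        self.num_queries = 0  # how many times the policy was actually called

    def select_action(self, observation: List[float]) -> Action:
        # Query the policy only when the previous chunk is used up.
        if not self.queue:
            chunk = self.policy_fn(observation, zero_latent(self.latent_dim))
            if len(chunk) != self.chunk_size:
                raise ValueError("policy returned a chunk of unexpected length")
            self.queue.extend(chunk)
            self.num_queries += 1
        return self.queue.popleft()


# Run 10 control steps with chunks of k = 4 actions.
executor = ChunkedExecutor(make_dummy_policy(4), chunk_size=4, latent_dim=32)
observation = [0.0, 0.0]
actions = []
for _ in range(10):
    action = executor.select_action(observation)
    actions.append(action)
    observation = action  # pretend the robot reaches the commanded position

print(f"{len(actions)} actions executed with {executor.num_queries} policy queries")
```

With `k = 4`, ten control steps need only three policy queries, which is where chunking saves inference cost. The sketch covers only this basic queue-based execution and ignores any smoothing between overlapping chunks.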
ACT works seamlessly with the standard LeRobot training pipeline. Here's a complete example for training ACT on your dataset:
```bash
lerobot-train \
  --dataset.repo_id=${HF_USER}/your_dataset \
  --policy.type=act \
  --output_dir=outputs/train/act_your_dataset \
  --job_name=act_your_dataset \
  --policy.device=cuda \
  --wandb.enable=true \
  --policy.repo_id=${HF_USER}/act_policy
```
If your local computer doesn't have a powerful GPU, you can train your model on Google Colab by following the ACT training notebook.
Once training is complete, you can evaluate your ACT policy by running the `lerobot-record` command with your trained policy. This runs inference and records evaluation episodes:
```bash
lerobot-record \
  --robot.type=so100_follower \
  --robot.port=/dev/ttyACM0 \
  --robot.id=my_robot \
  --robot.cameras="{ front: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}}" \
  --display_data=true \
  --dataset.repo_id=${HF_USER}/eval_act_your_dataset \
  --dataset.num_episodes=10 \
  --dataset.single_task="Your task description" \
  --dataset.streaming_encoding=true \
  --dataset.encoder_threads=2 \
  --policy.path=${HF_USER}/act_policy
# Optional flag, add it to the command above if needed: --dataset.vcodec=auto
```