docs/datasets/droid.md
DROID (Distributed Robot Interaction Dataset) is one of the most popular large-scale, in-the-wild robot manipulation dataset with 76,000 demonstration trajectories and 350 hours of interaction data. It was collected across 564 scenes and 86 tasks using the Franka Panda robot platform, and includes synchronized RGB camera streams, camera calibration, and natural language task descriptions.
Daft provides a simple way to explore the raw DROID release as a lazy, episode-level DataFrame with metadata, trajectory files, and camera videos attached as daft.VideoFile columns.
The raw DROID dataset is hosted on Google Cloud Storage at gs://gresearch/robotics/droid_raw (~8.7 TB). By default, [daft.datasets.droid.raw()][daft.datasets.droid.raw] reads from this public bucket, so no credentials are required to get started.
If you prefer to work with a local copy, download episodes with gsutil and pass the local path to raw():
# Download the full raw dataset (~8.7 TB)
gsutil -m cp -r gs://gresearch/robotics/droid_raw /path/to/droid_raw
# Or download a smaller subset for development
gsutil -m cp -r gs://gresearch/robotics/droid_raw/<episode_path> /path/to/droid_raw/
See the official DROID dataset documentation for details on the dataset format and downloading the necessary files for your use case.
The simplest way to get started is to load a small sample of data from the public source:
import daft
# Load a sample of the raw DROID data
daft.datasets.droid.raw().show()
Each row corresponds to one DROID episode. Metadata from each episode's JSON file is unnested into top-level columns, and lazy file references are attached for the trajectory HDF5 file and three MP4 camera recordings.
This is the default behavior. Daft globs metadata_*.json files under the dataset root, reads each episode's metadata, and constructs paths to the associated trajectory and video files:
import daft
df = daft.datasets.droid.raw()
Point path at any directory that mirrors the raw DROID layout (see Episode layout below):
import daft
df = daft.datasets.droid.raw(path="/path/to/droid_raw")
Remote object stores other than GCS are also supported when passed via path with an appropriate [IOConfig][daft.io.IOConfig].
Because raw() returns a lazy DataFrame, you can filter, project, and sample before materializing any video or trajectory data:
import daft
(
daft.datasets.droid.raw()
.where(daft.col("success"))
.where(daft.col("building") == "Ross")
.select("uuid", "current_task", "trajectory_length", "wrist_video")
.limit(10)
)
Each DROID episode is stored in its own directory:
episode/
|---- metadata_<episode_id>.json # Episode metadata (building, task, camera serials, etc.)
|---- trajectory.h5 # Low-dimensional action and proprioception trajectories
|---- recordings/
|---- MP4/
|---- <camera_serial>.mp4
|---- <camera_serial>-stereo.mp4 # Optional stereo views
|---- SVO/
|---- <camera_serial>.svo # Raw ZED SVO recordings
[daft.datasets.droid.raw()][daft.datasets.droid.raw] currently attaches lazy references to:
trajectory: the episode's trajectory.h5 filewrist_video: wrist camera MP4ext1_video: external camera 1 MP4 (often the left view)ext2_video: external camera 2 MP4 (often the right view)Stereo MP4 and raw SVO recordings are not yet exposed as columns.
raw() returns one row per episode with metadata fields unnested from each metadata_*.json file, plus the following key columns:
| Column | Type | Description |
|---|---|---|
episode_dir | String | Path to the episode directory |
uuid | String | Unique episode identifier |
lab | String | Collecting lab |
user | String | Data collector name |
user_id | String | Data collector identifier |
date | Date | Collection date |
timestamp | String | Collection timestamp |
building | String | Building or environment name |
scene_id | Int64 | Scene identifier within the building |
success | Boolean | Whether the demonstration was successful |
current_task | String | Natural language task description |
trajectory_length | Int64 | Number of timesteps in the trajectory |
robot_serial | String | Robot hardware serial number |
wrist_cam_serial | String | Wrist camera serial number |
ext1_cam_serial | String | External camera 1 serial number |
ext2_cam_serial | String | External camera 2 serial number |
wrist_cam_extrinsics | List[Float64] | Wrist camera extrinsics |
ext1_cam_extrinsics | List[Float64] | External camera 1 extrinsics |
ext2_cam_extrinsics | List[Float64] | External camera 2 extrinsics |
trajectory | File | Lazy reference to trajectory.h5 |
wrist_video | VideoFile | Lazy reference to the wrist camera MP4 |
ext1_video | VideoFile | Lazy reference to external camera 1 MP4 |
ext2_video | VideoFile | Lazy reference to external camera 2 MP4 |
Additional path columns from the metadata JSON (such as hdf5_path, wrist_mp4_path, and ext1_mp4_path) are also available as top-level columns.
video_frames][daft.functions.video_frames] and working with daft.VideoFile.daft.File.