docs/content/howto/query-and-transform/sub_dataset.md
When experimenting with new features, it's often practical to work with a subset of the data without modifying the original. A sub-dataset references the same underlying RRD files, so no data is copied.
All dependencies for this example are included in `rerun-sdk[all]`.
The setup here is simplified: it launches a local server for demonstration. In practice you'll connect to your cloud instance.
snippet: howto/sub_dataset[setup]
Query the source dataset's manifest for the storage URL of each (segment, layer) pair, then re-register those URLs into a new dataset.
snippet: howto/sub_dataset[create_sub_dataset]
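The core of this step is a filter-and-group over manifest rows. The following is a minimal sketch of that logic in plain Python, with no Rerun calls. The row layout, the field names (`segment_id`, `layer`, `storage_url`), the example URLs, and the helper `select_urls_by_layer` are all hypothetical and only illustrate the idea; the actual manifest schema and registration calls are those shown in the snippet.

```python
from collections import defaultdict

# Hypothetical manifest rows, one per (segment, layer) pair.
# Field names and URLs are illustrative, not the real manifest schema.
manifest = [
    {"segment_id": "seg_a", "layer": "base", "storage_url": "s3://example-bucket/seg_a/base.rrd"},
    {"segment_id": "seg_a", "layer": "annotations", "storage_url": "s3://example-bucket/seg_a/annotations.rrd"},
    {"segment_id": "seg_b", "layer": "base", "storage_url": "s3://example-bucket/seg_b/base.rrd"},
    {"segment_id": "seg_c", "layer": "base", "storage_url": "s3://example-bucket/seg_c/base.rrd"},
]


def select_urls_by_layer(manifest, segment_ids):
    """Group the storage URLs of the wanted segments by layer name."""
    wanted = set(segment_ids)
    urls_by_layer = defaultdict(list)
    for row in manifest:
        if row["segment_id"] in wanted:
            urls_by_layer[row["layer"]].append(row["storage_url"])
    return dict(urls_by_layer)


# URLs that would be re-registered into the new dataset, layer by layer.
sub_dataset_urls = select_urls_by_layer(manifest, ["seg_a", "seg_c"])
```

Only URLs are collected. Nothing is read from or written to storage, which is why the sub-dataset shares files with the source.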
Select segments by any criterion: a hardcoded list, a slice, or a filtered query based on segment properties or metadata joins.
snippet: howto/sub_dataset[select_segments]
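To make the selection options concrete, here is a small sketch that expresses each one over plain Python data rather than through the Rerun query API. The segment records, the `success` and `robot` properties, and the external `operators` table are invented for illustration.

```python
# Hypothetical segment records; property names are assumptions for this sketch.
segments = [
    {"segment_id": "seg_a", "robot": "arm_1", "success": True},
    {"segment_id": "seg_b", "robot": "arm_1", "success": False},
    {"segment_id": "seg_c", "robot": "arm_2", "success": True},
    {"segment_id": "seg_d", "robot": "arm_2", "success": True},
]
all_ids = [s["segment_id"] for s in segments]

# 1. A hardcoded list of segment IDs.
hardcoded = ["seg_a", "seg_d"]

# 2. A slice of the available segments.
first_two = all_ids[:2]

# 3. A filter on a segment property.
successful = [s["segment_id"] for s in segments if s["success"]]

# 4. A join against external metadata (hypothetical operator table).
operators = {"seg_a": "alice", "seg_b": "bob", "seg_c": "alice"}
by_operator = [sid for sid in all_ids if operators.get(sid) == "alice"]
```

Whichever criterion is used, the result is the same kind of value: a list of segment IDs to carry into the new dataset.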
snippet: howto/sub_dataset[create]
snippet: howto/sub_dataset[verify]
Delete the sub-dataset when it is no longer needed. This only removes the dataset entry from the catalog. The underlying RRD storage is not affected.
snippet: howto/sub_dataset[cleanup]