scripts/README.md
This directory contains assorted helper scripts used by NautilusTrader’s
developer tooling and CI pipeline. Only one of them (curate-dataset.sh)
needs a brief explanation because it is meant to be executed manually when
curating test-fixture datasets.
curate-dataset.sh – package an external dataset for the test-data bucket

curate-dataset.sh automates the small but repetitive tasks required when we
bring a third-party file into the NautilusTrader test-data bucket:

- Places the downloaded file in a versioned directory (v1/<slug>/)
- Writes a LICENSE.txt file holding the SPDX identifier or licence URL
- Generates metadata.json (size, sha256, provenance)

The result is a self-contained directory ready to upload one-for-one to the S3 bucket (or to commit into the repository if the data size is small).

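
The steps above can be pictured roughly as follows. This is a hypothetical sketch, not the actual contents of curate-dataset.sh: the function names (`fetch`, `write_sidecars`, `curate`) and the JSON key names are assumptions made for illustration, and it assumes GNU `sha256sum` is available (on macOS, `shasum -a 256` would be the usual substitute).

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the curation steps; NOT the real curate-dataset.sh.

# Download <url> to <dest> with the curl flags mentioned in this README.
fetch() {
    local url="$1" dest="$2"
    curl -L --fail --retry 3 -o "$dest" "$url"
}

# Write LICENSE.txt and metadata.json next to an already-downloaded file.
# The JSON key names below are assumptions, not the script's real schema.
write_sidecars() {
    local dir="$1" filename="$2" url="$3" licence="$4"
    local size sha256
    size=$(wc -c < "$dir/$filename" | tr -d '[:space:]')
    sha256=$(sha256sum "$dir/$filename" | cut -d' ' -f1)
    printf '%s\n' "$licence" > "$dir/LICENSE.txt"
    cat > "$dir/metadata.json" <<EOF
{
  "file": "$filename",
  "size": $size,
  "sha256": "$sha256",
  "source_url": "$url",
  "licence": "$licence"
}
EOF
}

# Create v1/<slug>/, download the file, then write the sidecar files.
curate() {
    local slug="$1" filename="$2" url="$3" licence="$4"
    local dir="v1/$slug"
    mkdir -p "$dir"
    fetch "$url" "$dir/$filename"
    write_sidecars "$dir" "$filename" "$url" "$licence"
}
```

The sketch interpolates values into the JSON without escaping, so it would only be safe for URLs and licence strings that contain no double quotes or backslashes.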
scripts/curate-dataset.sh <slug> <filename> <download-url> <licence>

- slug – sub-directory name (e.g. fi2010_all)
- filename – the basename you want inside the directory (e.g. Fi2010.zip)
- download-url – original public URL of the file
- licence – short ID or full URL (e.g. CC-BY-SA-4.0)

Example – curate the full FI-2010 limit-order-book dataset (all 10 trading days) from a Dropbox mirror:

scripts/curate-dataset.sh fi2010_all Fi2010.zip \
"https://www.dropbox.com/s/6ywf3td7zdrp1n5/Fi2010.zip?dl=1" \
CC-BY-SA-4.0
After the script finishes you will have the following structure ready to commit or upload:
v1/fi2010_all/
├── Fi2010.zip # ≈230 MB, contains day_1 … day_10
├── LICENSE.txt # CC-BY-SA-4.0
└── metadata.json # size, sha256, provenance
You can now reference v1/fi2010_all/Fi2010.zip from tests or example code,
and downstream tooling can verify the checksum.
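
A checksum check of this kind could look like the sketch below. It assumes that metadata.json stores the hex digest under a top-level `"sha256"` key (the real schema may differ) and that GNU `sha256sum` is installed; `verify_dataset` is a hypothetical helper name, not part of the repository.

```shell
#!/usr/bin/env bash
# Sketch: compare a file's SHA-256 with the digest recorded in metadata.json.
# Assumption: the digest is stored as "sha256": "<64 hex chars>".

verify_dataset() {
    local dir="$1" filename="$2"
    local expected actual
    # Pull out the recorded digest with sed so that jq is not required.
    expected=$(sed -n 's/.*"sha256"[[:space:]]*:[[:space:]]*"\([0-9a-fA-F]\{64\}\)".*/\1/p' \
        "$dir/metadata.json" | head -n 1)
    # Recompute the digest of the file on disk.
    actual=$(sha256sum "$dir/$filename" | cut -d' ' -f1)
    if [ -n "$expected" ] && [ "$expected" = "$actual" ]; then
        echo "OK"
    else
        echo "MISMATCH"
        return 1
    fi
}
```

For example, `verify_dataset v1/fi2010_all Fi2010.zip` would print `OK` and return 0 only when the recorded and recomputed digests match. The comparison is case-sensitive, so it expects the digest in lowercase hex, which is what `sha256sum` emits.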

The script downloads with curl -L --fail --retry 3, so transient network hiccups are
handled automatically.

For details on the other helper scripts, run them with -h or read the
inline comments; they are mostly invoked from CI and rarely need manual
execution.