plugins/ruflo-workflows/commands/gaia-validate.md
Run pre-submission integrity checks before executing a benchmark or packaging results for the HAL leaderboard.
/gaia validate
/gaia validate --strict
/gaia validate --fix
| Flag | Default | Description |
|---|---|---|
| --strict | off | Fail on warnings (not just errors) |
| --fix | off | Attempt to auto-fix resolvable issues (e.g., install missing deps) |
| --skip-hf | off | Skip the HF dataset connectivity check (useful offline) |
| --skip-build | off | Skip the TypeScript build check |
- ANTHROPIC_API_KEY: required for model inference
- HF_TOKEN: required to download the GAIA dataset from Hugging Face
- GOOGLE_AI_API_KEY: optional; warn if absent (Gemini model support disabled)
- GOOGLE_CUSTOM_SEARCH_API_KEY + GOOGLE_CUSTOM_SEARCH_CX: optional; warn if absent (web_search falls back to DuckDuckGo)

cd v3/@claude-flow/cli && npx tsc --noEmit
All GAIA benchmark source files must compile without TypeScript errors.
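As an illustration of how the build check could turn compiler output into a pass/fail line, here is a minimal sketch. It assumes the captured text uses tsc's default non-pretty diagnostic format (`path(line,col): error TS1234: message`). The helper names and the `[FAIL]` wording are hypothetical, not taken from the actual command implementation.

```typescript
// Hypothetical helper: count TypeScript diagnostics in captured `tsc --noEmit` output.
// Assumes the default (non-pretty) format: "src/a.ts(12,5): error TS2322: message".
function countTscErrors(output: string): number {
  const errorLine = /\berror TS\d+:/;
  return output
    .split(/\r?\n/)
    .filter((line) => errorLine.test(line))
    .length;
}

// Hypothetical formatter: the PASS wording mirrors the sample output in this
// document; the FAIL wording is an assumption.
function buildStatus(output: string): string {
  const errors = countTscErrors(output);
  return errors === 0
    ? "[PASS] TypeScript build clean (0 errors)"
    : "[FAIL] TypeScript build has " + errors + " error(s)";
}
```

Counting lines rather than relying only on the exit code lets the report say how many errors were found, which helps when deciding whether to fix them before a run.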
Perform a dry-run fetch of 1 question from the HF GAIA dataset to confirm the token and network path work.
Verify the witness manifest is up to date and valid:
node plugins/ruflo-core/scripts/witness/verify.mjs
Confirm all required benchmark source files exist:
- v3/@claude-flow/cli/src/commands/gaia-bench.ts
- v3/@claude-flow/cli/src/benchmarks/gaia-agent.ts
- v3/@claude-flow/cli/src/benchmarks/gaia-judge.ts
- v3/@claude-flow/cli/src/benchmarks/gaia-loader.ts
- v3/@claude-flow/cli/src/benchmarks/gaia-tools/index.ts

node v3/@claude-flow/cli/bin/cli.js --version
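The check that the required benchmark source files exist might be sketched as follows. The paths are copied from this document. The function names and the injected `exists` predicate are hypothetical, chosen so the logic can be exercised without touching the file system.

```typescript
// Paths copied from the required-file list in this document.
const REQUIRED_FILES: readonly string[] = [
  "v3/@claude-flow/cli/src/commands/gaia-bench.ts",
  "v3/@claude-flow/cli/src/benchmarks/gaia-agent.ts",
  "v3/@claude-flow/cli/src/benchmarks/gaia-judge.ts",
  "v3/@claude-flow/cli/src/benchmarks/gaia-loader.ts",
  "v3/@claude-flow/cli/src/benchmarks/gaia-tools/index.ts",
];

// Hypothetical helper: return every required path the predicate reports as absent.
function findMissing(
  files: readonly string[],
  exists: (path: string) => boolean,
): string[] {
  return files.filter((file) => !exists(file));
}

// Hypothetical formatter: the PASS wording mirrors the sample output; FAIL is assumed.
function filesStatus(missing: readonly string[], total: number): string {
  return missing.length === 0
    ? "[PASS] All " + total + " benchmark source files present"
    : "[FAIL] Missing benchmark source files: " + missing.join(", ");
}
```

In a Node.js script, `existsSync` from `node:fs` can be passed as the `exists` predicate, with paths resolved relative to the repository root.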
Validating GAIA benchmark environment...
[PASS] ANTHROPIC_API_KEY set (sk-ant-...abc3)
[PASS] HF_TOKEN set (hf_...xyz9)
[WARN] GOOGLE_AI_API_KEY not set — Gemini routing disabled
[WARN] GOOGLE_CUSTOM_SEARCH_API_KEY not set — web_search using DuckDuckGo fallback
[PASS] TypeScript build clean (0 errors)
[PASS] HF dataset reachable (1 question fetched)
[PASS] Witness manifest valid (Ed25519 verified)
[PASS] All 5 benchmark source files present
[PASS] CLI binary resolves to v3.6.x
2 warnings (use --strict to fail on warnings)
Ready to run /gaia run
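The environment-variable statuses and masked key suffixes in the sample output above could be derived along these lines. This is a sketch under stated assumptions: the variable names and required/optional split come from this document, while the types, function names, and the masking rule (keep a recognized prefix plus the last four characters) are hypothetical. The fallback lookup via gcloud is left out.

```typescript
// Sketch only: types and helper names are hypothetical.
type CheckStatus = "PASS" | "WARN" | "FAIL";

interface EnvRule {
  name: string;
  required: boolean;
}

interface EnvResult {
  name: string;
  status: CheckStatus;
  detail: string;
}

// Names and required/optional flags as listed in this document.
const ENV_RULES: EnvRule[] = [
  { name: "ANTHROPIC_API_KEY", required: true },
  { name: "HF_TOKEN", required: true },
  { name: "GOOGLE_AI_API_KEY", required: false },
  { name: "GOOGLE_CUSTOM_SEARCH_API_KEY", required: false },
  { name: "GOOGLE_CUSTOM_SEARCH_CX", required: false },
];

// Assumed masking rule: keep a recognized vendor prefix and the last four
// characters, hide everything else. Short values are hidden entirely.
function maskSecret(value: string, prefixes: string[] = ["sk-ant-", "hf_"]): string {
  let prefix = "";
  for (const candidate of prefixes) {
    if (value.indexOf(candidate) === 0) {
      prefix = candidate;
      break;
    }
  }
  if (value.length < prefix.length + 8) {
    return "***";
  }
  return prefix + "..." + value.slice(-4);
}

// Missing required variables are errors; missing optional ones are warnings.
function checkEnv(env: Record<string, string | undefined>): EnvResult[] {
  return ENV_RULES.map((rule): EnvResult => {
    const value = env[rule.name];
    if (value !== undefined && value !== "") {
      return { name: rule.name, status: "PASS", detail: "set (" + maskSecret(value) + ")" };
    }
    return { name: rule.name, status: rule.required ? "FAIL" : "WARN", detail: "not set" };
  });
}
```

Taking the environment as a parameter rather than reading `process.env` directly keeps the function easy to test; a real script would call `checkEnv(process.env)`.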
1. For each environment variable, check process.env first, then attempt gcloud secrets versions access latest --secret=<name> silently.
2. Run npx tsc --noEmit in the CLI package directory; capture stderr.
3. Run node … gaia-bench run --smoke-only --limit=1 --dry-run.
4. Warnings alone do not fail the run unless --strict is set, in which case warnings also cause exit 1.
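The exit-code rule for --strict can be made concrete with a small sketch. It assumes each check yields one of three outcomes and that any error fails the run regardless of flags, which the word "also" in the rule implies. The type and function names are hypothetical.

```typescript
// Hypothetical outcome type: one value per completed check.
type Outcome = "PASS" | "WARN" | "FAIL";

function countOf(outcomes: readonly Outcome[], wanted: Outcome): number {
  return outcomes.filter((outcome) => outcome === wanted).length;
}

// Errors always produce exit code 1; warnings do so only when strict is true.
function exitCodeFor(outcomes: readonly Outcome[], strict: boolean): number {
  const failures = countOf(outcomes, "FAIL");
  const warnings = countOf(outcomes, "WARN");
  if (failures > 0) return 1;
  if (strict && warnings > 0) return 1;
  return 0;
}
```

With the sample run shown in this document (seven passes, two warnings), this yields exit code 0 by default and 1 under --strict.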