packages/chip/scripts/alphachip/nebius_h200_runbook.md
Use this when local 16 GB VRAM is too small or too slow. AlphaChip's published Ariane-scale recipe used 8x V100 for training plus many CPU collect workers; a single H200 should be enough for first E1 experiments, but walltime will still depend on how many CPU collect jobs feed Reverb.
ROOT_DIR.git clone <this-repo> e1-chip
cd e1-chip/packages/chip
git clone https://github.com/google-research/circuit_training.git external/circuit_training
git -C external/circuit_training checkout r0.0.4
scripts/alphachip/build_container.sh
For a GPU image:
ALPHACHIP_GPU_IMAGE=1 scripts/alphachip/build_container.sh
USE_GPU=True NUM_COLLECT_JOBS=8 scripts/alphachip/run_toy_training.sh
Once NETLIST_FILE and INIT_PLACEMENT point to converted E1 protobuf/PLC
artifacts:
export ROOT_DIR=/shared/alphachip/e1/run_001
export REVERB_SERVER=<reverb-host-ip>:8008
export NETLIST_FILE=/shared/alphachip/e1/netlist.pb.txt
export INIT_PLACEMENT=/shared/alphachip/e1/initial.plc
Run ppo_reverb_server on the Reverb host, many ppo_collect jobs on CPU
hosts, and train_ppo --use_gpu on the H200 host. Match the upstream
docs/ARIANE.md job split and tune sequence_length to the number of movable
E1 macros or soft macros.