benchmark/multi_gpu/training/README.md
Run the benchmark, e.g., assuming you have n NVIDIA GPUs:
python training_benchmark_cuda.py --dataset ogbn-products --model edge_cnn --num-epochs 3 --n_gpus <n>
To run your scripts inside a Docker image, refer to the Dockerfile and the corresponding guide.
If you prefer to run your scripts directly on a bare-metal server, we recommend following the installation guidance provided by Intel® Extension for PyTorch. The key steps are:
# Install oneCCL package on Ubuntu
sudo apt install -y intel-oneapi-dpcpp-cpp-2024.1=2024.1.0-963 intel-oneapi-mkl-devel=2024.1.0-691 intel-oneapi-ccl-devel=2021.12.0-309
# Install oneccl_bindings_for_pytorch
pip install oneccl_bind_pt==2.1.300+xpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
# Runtime Dynamic Linking
source /opt/intel/oneapi/setvars.sh
# Install PyTorch and Intel® Extension for PyTorch
pip install torch==2.1.0.post2 intel-extension-for-pytorch==2.1.30+xpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/
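Once the steps above have completed, a quick check that the packages can be located may catch environment problems before a multi-process launch. The following is a minimal sketch, not part of the benchmark itself. The helper `missing_packages` is hypothetical, and the import names `intel_extension_for_pytorch` and `oneccl_bindings_for_pytorch` are assumed to correspond to the `intel-extension-for-pytorch` and `oneccl_bind_pt` wheels installed above.

```python
import importlib.util

# Assumed import names for the wheels installed above
# (the pip package oneccl_bind_pt is assumed to import as oneccl_bindings_for_pytorch).
REQUIRED = ("torch", "intel_extension_for_pytorch", "oneccl_bindings_for_pytorch")


def missing_packages(names=REQUIRED):
    """Return the top-level packages in `names` that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

This only confirms that the packages are discoverable in the active Python environment. It does not import them, so it cannot detect driver or runtime linking problems such as a missing `source /opt/intel/oneapi/setvars.sh`.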
This guide is helpful for launching DDP training on Intel GPUs.
To run the benchmark, e.g., assuming you have n XPUs:
mpirun -np <n> python training_benchmark_xpu.py --dataset ogbn-products --model edge_cnn --num-epochs 3
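When a script is started with `mpirun`, each process has to work out its own rank and the total number of processes before it can join a DDP process group. The sketch below shows one way this could be done. It assumes the MPI launcher exports `PMI_RANK` and `PMI_SIZE` (as Intel MPI is expected to) and falls back to the `RANK` and `WORLD_SIZE` variables otherwise. The function name `get_dist_params` and its `env` parameter are illustrative, and the benchmark script may handle this differently.

```python
import os


def get_dist_params(env=None):
    """Derive (rank, world_size) from launcher-provided environment variables."""
    env = os.environ if env is None else env

    # Assumption: mpirun exports PMI_RANK / PMI_SIZE for every process it starts.
    mpi_world_size = int(env.get("PMI_SIZE", -1))
    if mpi_world_size > 0:
        return int(env.get("PMI_RANK", 0)), mpi_world_size

    # Fallback: RANK / WORLD_SIZE if set, otherwise a single process.
    return int(env.get("RANK", 0)), int(env.get("WORLD_SIZE", 1))


if __name__ == "__main__":
    rank, world_size = get_dist_params()
    print(f"rank={rank} world_size={world_size}")
```

The resulting values would typically be passed to `torch.distributed.init_process_group` with the `ccl` backend, which is expected to become available after importing `oneccl_bindings_for_pytorch`.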