distributed/tensor_parallelism/README.md
This example demonstrates SPMD Megatron-LM style Tensor Parallel using the PyTorch native Tensor Parallel APIs, which include:

1. A simple Tensor Parallel example.
2. Tensor Parallel combined with Sequence Parallel.
3. Fully Sharded Data Parallel (FSDP) combined with Tensor Parallel on a Llama2 model.

For more details about the PyTorch native Tensor Parallel APIs, please see the PyTorch docs: https://pytorch.org/docs/stable/distributed.tensor.parallel.html
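At a high level, the API takes a module and a device mesh and applies a parallelization plan to named submodules. Below is a minimal sketch of the Megatron-LM style column-wise/row-wise split; the `ToyMLP` module, its dimensions, and the mesh size are illustrative assumptions, not the example's actual model.

```python
# Minimal Tensor Parallel sketch; run under torchrun (4 processes, one per GPU)
# so the distributed environment variables are set. ToyMLP is a made-up model.
import torch
import torch.nn as nn
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor.parallel import (
    ColwiseParallel,
    RowwiseParallel,
    parallelize_module,
)


class ToyMLP(nn.Module):
    def __init__(self, dim=16, hidden=64):
        super().__init__()
        self.in_proj = nn.Linear(dim, hidden)
        self.out_proj = nn.Linear(hidden, dim)

    def forward(self, x):
        return self.out_proj(torch.relu(self.in_proj(x)))


# init_device_mesh sets up the default process group if it is not initialized.
tp_mesh = init_device_mesh("cuda", (4,))
model = ToyMLP().cuda()

# Classic Megatron-LM pairing: shard the first linear column-wise and the
# second row-wise, so the forward pass needs a single all-reduce at the end.
model = parallelize_module(
    model,
    tp_mesh,
    {"in_proj": ColwiseParallel(), "out_proj": RowwiseParallel()},
)

out = model(torch.rand(8, 16, device="cuda"))
```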
Install the requirements first:

```bash
pip install -r requirements.txt
```
You can run the examples using `torchrun` to launch distributed training:
```bash
# Simple Tensor Parallel example
torchrun --nnodes=1 --nproc_per_node=4 tensor_parallel_example.py
```
```bash
# Tensor Parallel with Sequence Parallel
torchrun --nnodes=1 --nproc_per_node=4 sequence_parallel_example.py
```
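The Sequence Parallel variant keeps the activations entering and leaving the tensor-parallel region sharded along the sequence dimension instead of replicated. A rough sketch of the idea is below; the toy model, the sequence dimension chosen, and the layouts are illustrative assumptions rather than the exact plan used by sequence_parallel_example.py.

```python
# Sequence Parallel sketch: the tensor-parallel block consumes and produces
# activations sharded on the sequence dimension (dim 1 here, an assumption).
import torch
import torch.nn as nn
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor import Shard  # torch.distributed._tensor on older PyTorch versions
from torch.distributed.tensor.parallel import (
    ColwiseParallel,
    RowwiseParallel,
    parallelize_module,
)

tp_mesh = init_device_mesh("cuda", (4,))
model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 16)).cuda()

model = parallelize_module(
    model,
    tp_mesh,
    {
        # Gather the sequence-sharded input before the column-wise matmul ...
        "0": ColwiseParallel(input_layouts=Shard(1)),
        # ... and reduce-scatter the output back to a sequence-sharded layout.
        "2": RowwiseParallel(output_layouts=Shard(1)),
    },
)

# Each rank feeds its local shard of a (batch, seq, dim) activation.
out = model(torch.rand(8, 2, 16, device="cuda"))
```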
```bash
# FSDP + Tensor Parallel with Llama2 model
torchrun --nnodes=1 --nproc_per_node=4 fsdp_tp_example.py
```
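The FSDP + Tensor Parallel example composes the two by splitting one device mesh into a data-parallel dimension and a tensor-parallel dimension: the tensor-parallel plan is applied within each TP group, and the resulting model is then sharded with FSDP across the data-parallel groups. The sketch below is a rough illustration under those assumptions, using a toy model and a 2x2 mesh rather than the Llama2 model from the example.

```python
# 2D parallelism sketch: 4 GPUs arranged as 2 (data parallel) x 2 (tensor
# parallel). The toy model and mesh shape are assumptions for illustration.
import torch
import torch.nn as nn
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.distributed.tensor.parallel import (
    ColwiseParallel,
    RowwiseParallel,
    parallelize_module,
)

mesh_2d = init_device_mesh("cuda", (2, 2), mesh_dim_names=("dp", "tp"))

model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 16)).cuda()

# Tensor Parallel first, within each "tp" sub-mesh ...
model = parallelize_module(
    model,
    mesh_2d["tp"],
    {"0": ColwiseParallel(), "2": RowwiseParallel()},
)

# ... then FSDP shards the parameters across the "dp" dimension of the mesh.
model = FSDP(model, device_mesh=mesh_2d["dp"], use_orig_params=True)

out = model(torch.rand(8, 16, device="cuda"))
```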
For more details, check the `run_examples.sh` script.