:orphan:

#####################################
Level 13: Run on a multi-node cluster
#####################################

In this level, you'll learn to run models on cloud or on-prem clusters.


.. raw:: html

    <div class="display-card-container">
        <div class="row">

.. displayitem::
   :header: Run single or multi-node on Lightning Studios
   :description: The easiest way to scale models in the cloud. No infrastructure setup required.
   :col_css: col-md-4
   :button_link: ../clouds/lightning_ai.html
   :height: 160
   :tag: basic

.. displayitem::
   :header: Run on an on-prem cluster
   :description: Learn to train models on a general compute cluster.
   :col_css: col-md-4
   :button_link: ../clouds/cluster_intermediate_1.html
   :height: 160
   :tag: intermediate

.. displayitem::
   :header: Run on a SLURM cluster
   :description: Run models on a SLURM-managed cluster.
   :col_css: col-md-4
   :button_link: ../clouds/cluster_advanced.html
   :height: 160
   :tag: intermediate

.. displayitem::
   :header: Run with Torch Distributed
   :description: Run models on a cluster with torch distributed.
   :col_css: col-md-4
   :button_link: ../clouds/cluster_intermediate_2.html
   :height: 160
   :tag: intermediate

.. raw:: html

        </div>
    </div>