doc/source/ray-overview/examples/llamafactory-llm-fine-tune/README.ipynb
This repository provides ready-to-run templates for fine-tuning Large Language Models (LLMs) on Anyscale using LLaMA-Factory. These templates demonstrate instruction tuning and preference alignment at scale (multi-GPU, multi-node), with configurations that you can reuse across different cloud providers.
Each template is an executable notebook that guides you through setup, configuration, and distributed execution. It also includes corresponding YAML/JSON configurations for repeatable runs and automation.
LLaMA-Factory is an easy-to-use, open-source framework. Its simple, declarative configs and consistent CLI let you define Continued Pre-Training (CPT), Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Kahneman-Tversky Optimization (KTO) runs once and reuse them across environments. It supports popular parameter-efficient fine-tuning (PEFT) adapters such as LoRA and QLoRA and integrates with DeepSpeed for efficient multi-GPU training. This enables reproducible, composable workflows that start small and scale on demand.
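For illustration, here's a minimal sketch of what a LLaMA-Factory SFT config with LoRA and DeepSpeed can look like; the model, dataset, and paths below are placeholders rather than the exact settings these templates use. You launch a config like this with `llamafactory-cli train <config>.yaml`.

```yaml
### Minimal SFT + LoRA sketch (illustrative values, not this template's defaults)
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct  # placeholder base model
stage: sft                    # training stage: pt, sft, dpo, or kto
do_train: true
finetuning_type: lora         # parameter-efficient fine-tuning with LoRA
lora_target: all              # attach LoRA adapters to all linear layers

dataset: alpaca_en_demo       # placeholder dataset name from the dataset registry
template: llama3              # chat template matching the base model
cutoff_len: 2048

output_dir: saves/llama3-8b/lora/sft   # placeholder output path
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
lr_scheduler_type: cosine
bf16: true

deepspeed: deepspeed-configs/ds_z3_config.json  # hypothetical path to a ZeRO-3 preset
```

Because the stage, adapter, and optimizer settings all live in one file, switching from SFT to DPO, KTO, or CPT is mostly a matter of changing a few fields.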
Supervised instruction tuning with LoRA and DeepSpeed ZeRO for efficient, reproducible multi-GPU training.
Preference alignment on pairwise data with DPO and QLoRA for memory-efficient, scalable training (see the config sketch after this list).
Single-signal feedback alignment with KTO and LoRA for lightweight, scalable preference tuning.
Continued pre-training on raw text with full fine-tuning and DeepSpeed ZeRO for efficient, reproducible multi-GPU training.
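To give a sense of how little changes between templates, here's a rough sketch of a DPO run with QLoRA: compared to the SFT sketch above, mainly the stage, the pairwise dataset, and the quantization settings differ. Field names assume a recent LLaMA-Factory release (for example, the DPO temperature exposed as `pref_beta`), and all values are placeholders rather than this template's actual configuration.

```yaml
### Minimal DPO + QLoRA sketch (illustrative values)
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct  # placeholder base model
stage: dpo                    # preference alignment instead of sft
do_train: true
finetuning_type: lora
lora_target: all
quantization_bit: 4           # 4-bit quantized base model (QLoRA) for lower memory use
pref_beta: 0.1                # DPO beta; controls deviation from the reference policy

dataset: dpo_en_demo          # placeholder pairwise (chosen/rejected) preference dataset
template: llama3
cutoff_len: 2048

output_dir: saves/llama3-8b/qlora/dpo  # placeholder output path
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 5.0e-6
num_train_epochs: 1.0
bf16: true
```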
- `notebooks/`: End-to-end executable templates for SFT, DPO, and KTO.
- `train-configs/`: Configuration files for models, adapters, and hyperparameters.
- `dataset-configs/`: Dataset metadata and registries that the templates reference.
- `deepspeed-configs/`: DeepSpeed ZeRO presets for scaling and memory efficiency (see the preset sketch below).

Develop as you would on your laptop. Attach your IDE remotely and install dependencies with pip; they automatically propagate to the cluster. Debug distributed training with the distributed debugger. For more details, see Anyscale workspaces.
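The ZeRO presets in `deepspeed-configs/` are standard DeepSpeed JSON files that the training config references through its `deepspeed` field. A typical ZeRO-3 preset looks roughly like the following sketch, based on common DeepSpeed defaults rather than the exact files shipped with these templates; the `"auto"` values are filled in by the trainer at launch time.

```json
{
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "gradient_clipping": "auto",
  "bf16": { "enabled": "auto" },
  "zero_optimization": {
    "stage": 3,
    "overlap_comm": true,
    "contiguous_gradients": true,
    "reduce_bucket_size": "auto",
    "stage3_prefetch_bucket_size": "auto",
    "stage3_param_persistence_threshold": "auto",
    "stage3_gather_16bit_weights_on_model_save": true
  }
}
```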
Transition from development to production by submitting your configurations as an Anyscale job. This enables reliable execution on managed clusters and seamless integration with CI/CD pipelines. See Anyscale jobs to learn more.
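As a minimal sketch, you can describe a job in a small YAML file and submit it with `anyscale job submit --config-file job.yaml`. The entrypoint path below is a hypothetical example rather than a file from this repository, and depending on your environment you may need additional fields such as the container image or compute config.

```yaml
# job.yaml: minimal sketch of an Anyscale job definition (illustrative)
name: llamafactory-sft
working_dir: .   # upload the repository so configs are available on the cluster
entrypoint: llamafactory-cli train train-configs/sft_lora.yaml   # hypothetical config path
max_retries: 1
```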