Notebooks

This directory contains a collection of Jupyter notebooks that demonstrate how to use the TRL library in different applications.

| Notebook | Description |
| --- | --- |
| `grpo_trl_lora_qlora.ipynb` | GRPO using QLoRA on free Colab |
| `grpo_agent.ipynb` | GRPO for agent training (Colab not available due to OOM with Colab GPUs) |
| `grpo_rnj_1_instruct.ipynb` | GRPO rnj-1-instruct with QLoRA using TRL on Colab to add reasoning capabilities |
| `sft_ministral3_vl.ipynb` | Supervised Fine-Tuning (SFT) Ministral 3 with QLoRA using TRL on free Colab |
| `grpo_ministral3_vl.ipynb` | GRPO Ministral 3 with QLoRA using TRL on free Colab |
| `sft_nemotron_3.ipynb` | SFT with LoRA on NVIDIA Nemotron 3 models |
| `sft_trl_lora_qlora.ipynb` | Supervised Fine-Tuning (SFT) using QLoRA on free Colab |
| `sft_qwen_vl.ipynb` | Supervised Fine-Tuning (SFT) Qwen3-VL with QLoRA using TRL on free Colab |
| `sft_tool_calling.ipynb` | Teaching tool calling to a model without native tool-calling support using SFT with QLoRA |
| `grpo_qwen3_vl.ipynb` | GRPO Qwen3-VL with QLoRA using TRL on free Colab |

OpenEnv Notebooks

These notebooks demonstrate GRPO training with OpenEnv environments. Most use `environment_factory`; the BrowserGym notebook uses the lower-level `rollout_func` API instead.

| Notebook | Description |
| --- | --- |
| `openenv_wordle_grpo.ipynb` | GRPO to play Wordle on an OpenEnv environment |
| `openenv_sudoku_grpo.ipynb` | GRPO to play Sudoku on an OpenEnv environment |
| `grpo_functiongemma_browsergym_openenv.ipynb` | GRPO on FunctionGemma in the BrowserGym environment |