This directory contains a collection of Jupyter notebooks that demonstrate how to use the TRL library in different applications.

| Notebook | Description | Open in Colab |
|---|---|---|
| `grpo_trl_lora_qlora.ipynb` | GRPO using QLoRA on free Colab | |
| `grpo_agent.ipynb` | GRPO for agent training | Not available due to OOM with Colab GPUs |
| `grpo_rnj_1_instruct.ipynb` | GRPO rnj-1-instruct with QLoRA using TRL on Colab to add reasoning capabilities | |
| `sft_ministral3_vl.ipynb` | Supervised Fine-Tuning (SFT) Ministral 3 with QLoRA using TRL on free Colab | |
| `grpo_ministral3_vl.ipynb` | GRPO Ministral 3 with QLoRA using TRL on free Colab | |
| `sft_nemotron_3.ipynb` | SFT with LoRA on NVIDIA Nemotron 3 models | |
| `sft_trl_lora_qlora.ipynb` | Supervised Fine-Tuning (SFT) using QLoRA on free Colab | |
| `sft_qwen_vl.ipynb` | Supervised Fine-Tuning (SFT) Qwen3-VL with QLoRA using TRL on free Colab | |
| `sft_tool_calling.ipynb` | Teaching tool calling to a model without native tool-calling support using SFT with QLoRA | |
| `grpo_qwen3_vl.ipynb` | GRPO Qwen3-VL with QLoRA using TRL on free Colab | |

The following notebooks demonstrate GRPO training with OpenEnv environments using `environment_factory`. The BrowserGym notebook uses the lower-level `rollout_func` API instead.

| Notebook | Description | Open in Colab |
|---|---|---|
| `openenv_wordle_grpo.ipynb` | GRPO to play Wordle on an OpenEnv environment | |
| `openenv_sudoku_grpo.ipynb` | GRPO to play Sudoku on an OpenEnv environment | |
| `grpo_functiongemma_browsergym_openenv.ipynb` | GRPO on FunctionGemma in the BrowserGym environment | |
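
The difference between the factory style and the lower-level rollout style mentioned above can be pictured with a small, library-free sketch. Everything here is hypothetical and for illustration only: `GuessEnv`, `bisect_policy`, `play`, `train_with_factory`, and `custom_rollout` are made-up names, and none of them reproduce the actual TRL or OpenEnv signatures. See the notebooks for the real usage.

```python
from typing import Callable


class GuessEnv:
    """Hypothetical toy environment: find a hidden integer in [0, 9]."""

    def __init__(self, target: int) -> None:
        self.target = target

    def reset(self) -> str:
        return "Guess an integer between 0 and 9."

    def step(self, guess: int) -> tuple[str, float, bool]:
        """Return (observation, reward, done) for one guess."""
        if guess == self.target:
            return "correct", 1.0, True
        return ("higher" if guess < self.target else "lower"), 0.0, False


def bisect_policy(low: int, high: int) -> int:
    """Stand-in for a model: always guess the midpoint of the remaining range."""
    return (low + high) // 2


def play(env: GuessEnv, max_turns: int = 5) -> float:
    """Run one episode and return its final reward."""
    env.reset()
    low, high = 0, 9
    for _ in range(max_turns):
        guess = bisect_policy(low, high)
        obs, reward, done = env.step(guess)
        if done:
            return reward
        if obs == "higher":
            low = guess + 1
        else:
            high = guess - 1
    return 0.0


# Factory style (assumed shape): the caller hands over a zero-argument
# callable, and the driver builds a fresh environment for every episode.
def train_with_factory(env_factory: Callable[[], GuessEnv], episodes: int) -> list[float]:
    return [play(env_factory()) for _ in range(episodes)]


# Rollout style (assumed shape): the caller writes the loop over prompts,
# decides how each environment is created, and returns the results.
def custom_rollout(prompts: list[str]) -> list[dict]:
    results = []
    for i, prompt in enumerate(prompts):
        env = GuessEnv(target=i % 10)
        results.append({"prompt": prompt, "reward": play(env)})
    return results


factory_rewards = train_with_factory(lambda: GuessEnv(target=7), episodes=3)
rollout_results = custom_rollout(["game 0", "game 1"])
```

In this sketch the factory style keeps user code small, while the rollout style trades that convenience for full control over how each episode is set up and recorded.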