# CARTRIDGE self-study example
This folder shows an example workflow for training a CARTRIDGE adapter with a SELF-STUDY-style
context-distillation objective (see the Cartridges paper).
PEFT intentionally keeps this training logic out of the core library; treat these scripts as a starting point you can adapt.
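The core idea of SELF-STUDY context distillation: a teacher forward pass sees the full document in its prompt, the student answers the same query through the compact trainable CARTRIDGE prefix, and the student is trained to match the teacher's token distributions. The snippet below is a minimal sketch of such a loss, not the code in `train_distill.py`; the function name, tensor layout, and the plain KL objective are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def self_study_distill_loss(teacher_logits, student_logits, answer_mask, temperature=1.0):
    """KL(teacher || student), averaged over answer tokens. Illustrative sketch.

    teacher_logits: [batch, seq, vocab] from a pass with the full document in context.
    student_logits: [batch, seq, vocab] from a pass where the document is replaced
        by the trainable CARTRIDGE prefix.
    answer_mask:    [batch, seq] bool, True on the answer tokens to match.
    """
    t = F.log_softmax(teacher_logits / temperature, dim=-1)
    s = F.log_softmax(student_logits / temperature, dim=-1)
    # Per-token KL divergence, summed over the vocabulary dimension.
    kl = F.kl_div(s, t, log_target=True, reduction="none").sum(-1)
    # Average over the masked answer positions only.
    return (kl * answer_mask).sum() / answer_mask.sum().clamp(min=1)
```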
Install the dependencies:

```bash
pip install -r requirements.txt
```
The folder contains the following scripts:

- `synthesize.py`: generates synthetic QA pairs about a corpus using vLLM with prefix caching.
- `train_distill.py`: trains a CARTRIDGE adapter via self-study distillation.
- `arxiv_synthesize.py`: like `synthesize.py`, with defaults for the Cartridges paper LaTeX.
- `arxiv_train.py`: like `train_distill.py`, with arXiv-specific defaults.

First, synthesize a distillation dataset:

```bash
python synthesize.py \
    --model Qwen/Qwen3-4B \
    --corpus_path /path/to/document.txt \
    --out_jsonl distill.jsonl \
    --num_samples 1024 \
    --use_vllm
```
With `--use_vllm`, the document is cached and reused across all samples via automatic prefix caching.
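For reference, this is roughly how generation can exploit automatic prefix caching: every prompt starts with the same document, so vLLM reuses the KV cache of that shared prefix across requests. The snippet below is a minimal sketch, not the actual `synthesize.py` implementation; the prompt format, question list, and sampling parameters are placeholders.

```python
from vllm import LLM, SamplingParams

# enable_prefix_caching lets vLLM reuse the KV cache of the shared
# document prefix across all prompts.
llm = LLM(model="Qwen/Qwen3-4B", enable_prefix_caching=True)

document = open("/path/to/document.txt").read()
questions = ["Summarize the introduction.", "What method is proposed?"]  # placeholders

# Every prompt shares the document as a prefix, so only the question and
# the generated tokens incur new KV-cache computation.
prompts = [f"{document}\n\nQuestion: {q}\nAnswer:" for q in questions]
outputs = llm.generate(prompts, SamplingParams(temperature=0.7, max_tokens=256))
for out in outputs:
    print(out.outputs[0].text)
```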
Then train the CARTRIDGE adapter on the synthesized data:

```bash
python train_distill.py \
    --model Qwen/Qwen3-4B \
    --document /path/to/document.txt \
    --distill_jsonl distill.jsonl \
    --output_dir cartridge_adapter \
    --num_virtual_tokens 256 \
    --num_frozen_tokens 1 \
    --max_steps 500
```
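The point of `--num_virtual_tokens 256` is that, at inference time, the cartridge stands in for the whole document: the model attends over 256 trained prefix positions instead of every document token. A rough back-of-the-envelope comparison of KV-cache sizes (the layer/head numbers below are illustrative placeholders, not read from the Qwen3-4B config):

```python
# Rough KV-cache size: 2 (K and V) * layers * kv_heads * head_dim * seq_len * bytes/elem.
# The model dimensions here are illustrative placeholders.
def kv_cache_bytes(seq_len, layers=36, kv_heads=8, head_dim=128, dtype_bytes=2):
    return 2 * layers * kv_heads * head_dim * seq_len * dtype_bytes

doc_tokens = 100_000      # e.g. a long document kept verbatim in context
cartridge_tokens = 256    # --num_virtual_tokens

print(f"document KV cache:  {kv_cache_bytes(doc_tokens) / 2**30:.1f} GiB")
print(f"cartridge KV cache: {kv_cache_bytes(cartridge_tokens) / 2**20:.1f} MiB")
```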
If you want to follow the arXiv paper example locally, download the LaTeX source of the Cartridges paper
into `examples/cartridge_self_study/data/cartridges.tex`:
```bash
mkdir -p examples/cartridge_self_study/data
curl -L -o examples/cartridge_self_study/data/cartridges.tex \
    https://raw.githubusercontent.com/HazyResearch/cartridges/refs/heads/main/examples/arxiv/cartridges.tex
```
After training, load the adapter on top of the base model for inference:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model, then attach the trained CARTRIDGE adapter.
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-4B")
model = PeftModel.from_pretrained(model, "cartridge_adapter")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-4B")

# Note: the document itself is NOT in the prompt; the cartridge stands in for it.
inputs = tokenizer("What is the document about?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
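A quick way to sanity-check that the cartridge actually carries document knowledge is to compare outputs with the adapter enabled and disabled, using PEFT's `disable_adapter()` context manager. This continues the snippet above (`model` and `tokenizer` already loaded); the question is a placeholder.

```python
import torch

question = "What method does the document propose?"  # placeholder
inputs = tokenizer(question, return_tensors="pt")

with torch.no_grad():
    with_cartridge = model.generate(**inputs, max_new_tokens=100)
    # Temporarily bypass the cartridge to get the base model's answer.
    with model.disable_adapter():
        without_cartridge = model.generate(**inputs, max_new_tokens=100)

print("with cartridge:   ", tokenizer.decode(with_cartridge[0], skip_special_tokens=True))
print("without cartridge:", tokenizer.decode(without_cartridge[0], skip_special_tokens=True))
```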
Convenience wrappers for training on the Cartridges paper LaTeX:
```bash
# From the repo root:

# Synthesize QA pairs (uses vLLM with prefix caching)
python examples/cartridge_self_study/arxiv_synthesize.py \
    --model Qwen/Qwen3-4B \
    --corpus_path examples/cartridge_self_study/data/cartridges.tex \
    --num_samples 1024 \
    --use_vllm

# Train the cartridge
python examples/cartridge_self_study/arxiv_train.py \
    --model Qwen/Qwen3-4B \
    --document examples/cartridge_self_study/data/cartridges.tex \
    --distill_jsonl distill.jsonl \
    --output_dir cartridge_adapter \
    --max_steps 500
```