method_comparison/text_generation_benchmark/README.md
This directory contains a comprehensive benchmarking framework for Parameter-Efficient Fine-Tuning (PEFT) methods. For the task of text generation, the suite measures inference performance, memory usage, and other key metrics across different PEFT configurations.

The benchmarking suite provides:

- inference performance and memory usage measurements for different PEFT configurations
- consistent comparisons across prompt categories of varying input lengths
- structured JSON results for downstream analysis

The suite follows a clean separation between:

- benchmark parameters: shared settings that keep all experiments comparable
- adapter parameters: the PEFT method configuration that varies per experiment

This ensures that all experiments are comparable while allowing flexibility in adapter parameters.

## Caching base model results

The benchmarking suite uses a separate script, `run_base.py`, to measure base model inference times and save the results for reuse. Run it once per model configuration to avoid redundant computation and to ensure consistent baseline metrics for all PEFT experiments.

Usage:

```bash
python run_base.py
```

This caches the base model inference results for the specified configuration. Subsequent runs of `run.py` automatically load these cached results.
## Running benchmarks

```bash
# From the peft_bench directory
python run.py experiments/lora/lora_r8 --verbose
```
## Configuration system

The benchmarking suite uses a hierarchical configuration system:

1. Default parameters (`default_benchmark_params.json`): base configuration shared by all experiments
2. Experiment overrides (`benchmark_params.json` in each experiment): optional overrides for specific experiments
3. Adapter configuration (`adapter_config.json` in each experiment): PEFT method parameters

This structure ensures consistent comparison while allowing flexibility where needed.
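To illustrate the layering semantics, here is a minimal sketch of how per-experiment values could override the shared defaults. The helper function is hypothetical; the actual loading logic lives in `run.py` and may differ:

```python
import json
from pathlib import Path

def load_benchmark_params(experiment_dir: str) -> dict:
    """Hypothetical helper: merge shared defaults with per-experiment overrides."""
    # The shared defaults apply to every experiment.
    params = json.loads(Path("default_benchmark_params.json").read_text())
    # Per-experiment overrides win over the defaults, key by key.
    override = Path(experiment_dir) / "benchmark_params.json"
    if override.exists():
        params.update(json.loads(override.read_text()))
    return params
```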
### Default parameters (`default_benchmark_params.json`)

Contains shared benchmark settings that apply to all experiments. These are the key configuration fields:
- `model_id`: the Hugging Face model ID to use as the base model (e.g., `"facebook/opt-350m"`)
- `dtype`: model precision (`"float16"`, `"float32"`, or `"bfloat16"`)
- `seed`: random seed for reproducibility
- `max_new_tokens`: maximum number of tokens to generate during inference
- `num_inference_runs`: number of inference runs per prompt for statistical reliability
- `use_4bit`: whether to use 4-bit quantization (bool)
- `use_8bit`: whether to use 8-bit quantization (bool)

Each experiment can override these settings by providing its own `benchmark_params.json` file.
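For illustration, a `default_benchmark_params.json` using the fields above might look like this (the values are examples, not the repository's actual defaults):

```json
{
  "model_id": "facebook/opt-350m",
  "dtype": "float16",
  "seed": 42,
  "max_new_tokens": 20,
  "num_inference_runs": 10,
  "use_4bit": false,
  "use_8bit": false
}
```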
## Experiment structure

Each experiment directory should contain:

- `adapter_config.json`: the PEFT adapter configuration. For details on the available parameters and their meanings, refer to the PEFT documentation.
- (Optional) `benchmark_params.json`: overrides of specific benchmark parameters for this experiment.
Example directory structure:

```text
experiments/
└── lora/
    ├── lora_r8/                   # LoRA rank 8 experiment
    │   ├── adapter_config.json    # PEFT adapter configuration
    │   └── benchmark_params.json  # Optional benchmark overrides
    └── lora_r16/                  # LoRA rank 16 experiment
        └── adapter_config.json
```
If an experiment needs different benchmark settings, create a `benchmark_params.json`:

```json
{
  "_comment": "Override settings for this specific experiment",
  "max_new_tokens": 50,
  "num_inference_runs": 15,
  "num_prompt_samples": 2
}
```
These parameters override the defaults from `default_benchmark_params.json`. The defaults themselves should generally be left unchanged so that the results of individual experiments remain comparable.
## Adding a new experiment

To create a new experiment, follow these steps:

**Create the experiment directory**

```bash
mkdir -p experiments/lora/lora_r8
```

**Generate the adapter configuration programmatically**

Use the PEFT library to create and save your adapter config:

```python
from peft import LoraConfig

config = LoraConfig(
    lora_alpha=16,
    lora_dropout=0.1,
    r=8,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
config.save_pretrained("experiments/lora/lora_r8")
```

This creates an `adapter_config.json` in your experiment directory. Adjust the parameters as needed for your experiment.
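To sanity-check the saved configuration, you can load it back with `PeftConfig` (part of the PEFT library):

```python
from peft import PeftConfig

# Round-trip check: load the config that was just saved and inspect it.
loaded = PeftConfig.from_pretrained("experiments/lora/lora_r8")
print(loaded.peft_type)  # PeftType.LORA
```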
**(Optional) Add benchmark overrides**

If you need to override default benchmark settings, create a `benchmark_params.json` in the same directory.

**Run the benchmark**

```bash
python run.py experiments/lora/lora_r8 --verbose
```
The benchmark automatically runs across all prompt categories, so comparisons stay consistent. Results are tracked separately for each category, allowing analysis of how different PEFT methods perform across varying input lengths.
## Results format

Results are saved in a structured JSON format with three main sections:

- `run_info`
- `generation_info`
- `meta_info`
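Assuming the results land in a JSON file (the path below is illustrative; `run.py` determines the actual output location), the three sections can be inspected like this:

```python
import json
from pathlib import Path

# Illustrative path; check run.py for the actual output location.
results = json.loads(Path("results/lora_r8.json").read_text())

# The three top-level sections named above:
for section in ("run_info", "generation_info", "meta_info"):
    print(section, "->", type(results[section]).__name__)
```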