method_comparison/README.md
The goal of this project is to provide replicable experiments that produce outcomes allowing us to compare different PEFT methods with one another. This gives you more information to make an informed decision about which methods best fit your use case and what trade-offs to expect.
Visit our Gradio Space to check the results.
We envision the PEFT method comparison project as an ongoing endeavor with heavy involvement from the community. As maintainers, it is impossible for us to know all the perfect hyperparameters for each method or to predict all the use cases that PEFT users may have. As a consequence, community contributions are very welcome.
Below, we outline all the ways you can contribute to this project.
Creating a new experiment requires setting up a new PEFT configuration for us to test. This will result in one more data point being added to the total comparison.
Working on this is especially relevant if:
Of course, you can contribute even without meeting these criteria. Please follow the instructions below.
Start by navigating to one of the existing experiment folders, e.g. `peft/method_comparison/MetaMathQA` if your experiment involves the MetaMathQA dataset. There, create a new directory inside the `experiments/<method-name>` folder using a descriptive name. For example, if you want to test LoRA with rank 123 using Llama-3.2 3B as the base model, you could name the folder `experiments/lora/llama-3.2-3B-rank123`.
The experiment folder for the dataset (e.g. `method_comparison/MetaMathQA`) contains a default configuration file called `default_training_params.json`, which holds the default parameters used by the `run.py` training script. Create a new JSON file containing only the parameters you want to override compared to these defaults, and save it as `training_params.json` in the newly created folder. If you are satisfied with all the default training parameters, you can skip this step.
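To illustrate, the overrides file can be produced with a few lines of Python. Note that the exact parameter names are defined by `default_training_params.json` and `run.py`; the keys used below (`optimizer_kwargs`, `max_steps`) are only assumptions for the sake of the example:

```python
import json

# Hypothetical overrides -- check default_training_params.json for the keys
# that run.py actually accepts; the names below are illustrative only.
overrides = {
    "optimizer_kwargs": {"lr": 1e-4},
    "max_steps": 5000,
}

with open("experiments/lora/llama-3.2-3B-rank123/training_params.json", "w") as f:
    json.dump(overrides, f, indent=2)
```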
Finally, you need to create a PEFT configuration file for the PEFT method you want to add. This should be a JSON file called `adapter_config.json`, placed in the same directory. Below is an example of how it could be created:
```python
from peft import LoraConfig

# Create the PEFT config for the experiment and save it in the experiment folder
config = LoraConfig(r=123)
config.save_pretrained("experiments/lora/llama-3.2-3B-rank123/")
```
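Calling `save_pretrained` writes the configuration to a file named `adapter_config.json` in the given directory, so there is no need to write the JSON by hand.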
Once you've created the configuration files for your experiment, please create a PR on PEFT. After it is reviewed and merged, we will run it on our hardware to ensure that the results are comparable. Of course, it is best if you run the experiment at least once on your hardware to verify that the proposed settings work well.
When adding a new experiment, please consider the following points:
We provide a training script that includes features typically useful for improving training outcomes, such as AMP support, a cosine learning rate schedule, etc. However, there is always room for improvement. For example, at the time of writing, the script does not support gradient accumulation. Therefore, PRs that extend the training script are welcome.
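To give an idea of what such an extension might involve, here is a minimal sketch of gradient accumulation in a plain PyTorch training loop. It assumes a transformers-style model that returns a `.loss` attribute, and the parameter name `accumulation_steps` is purely illustrative; it does not reflect the actual structure of `run.py`:

```python
def train_with_accumulation(model, optimizer, dataloader, accumulation_steps=4):
    # Minimal sketch: accumulate gradients over several small batches to
    # emulate a larger effective batch size.
    model.train()
    optimizer.zero_grad()
    for step, batch in enumerate(dataloader):
        outputs = model(**batch)
        # Scale the loss so the accumulated gradient matches one large batch.
        loss = outputs.loss / accumulation_steps
        loss.backward()
        if (step + 1) % accumulation_steps == 0:
            optimizer.step()
            optimizer.zero_grad()
```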
Follow the same process as when contributing to PEFT in general (see the contribution guidelines). If the same training script is used across multiple datasets, please ensure that all relevant scripts are updated accordingly.
Ideally, the training script should not rely on a high-level training framework such as the `Trainer` class from transformers or PyTorch Lightning. This ensures transparency, making it easier to understand the training process and to replicate results over time. If a training framework were used, we would have to pin its version or risk future incompatibilities.

Adding a new dataset increases the breadth and usefulness of the PEFT method comparison. The goal is not necessarily to outperform benchmarks or replicate paper results, but to fairly compare different PEFT methods in a way that is useful for PEFT users. If this involves replicating an experiment from a paper, that is great, but it is not a requirement.
The easiest way to add support for a new dataset is to copy an existing setup, such as `method_comparison/MetaMathQA`, rename it, and modify `data.py`, as well as any other necessary parts of the code. Ideally, as much existing code as possible should be reused. The general folder structure and experiment logging format should remain consistent.
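As a rough orientation, the dataset-loading part of such a `data.py` could look like the sketch below. The function name, column names, and defaults shown here are assumptions for illustration; mirror whatever the copied setup actually does:

```python
from datasets import load_dataset

def get_train_dataset(tokenizer, dataset_name="meta-math/MetaMathQA", max_length=512):
    # Hypothetical helper: load the raw data and tokenize prompt + answer.
    ds = load_dataset(dataset_name, split="train")

    def tokenize(example):
        # "query" and "response" are the MetaMathQA column names; adjust
        # them for your own dataset.
        text = example["query"] + "\n" + example["response"]
        return tokenizer(text, truncation=True, max_length=max_length)

    return ds.map(tokenize, remove_columns=ds.column_names)
```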
After adding the dataset, ensure it functions correctly and produces meaningful results by running at least one experimental setup, such as using LoRA with default settings.
For convenience, we include a Gradio app that shows the results of the experiments. It allows you to filter by task and base model and view the experiment results for that selection. Give it a try here.
This app requires additional packages to be installed. Please install the packages listed in `requirements-app.txt`, e.g. via:
```sh
python -m pip install -r requirements-app.txt
```
To launch the demo, run:
```sh
python app.py
```