Why PEFT?
- We only train a small part of the LLM (while LoRA still needs to train a large number of parameters).
- Essential when applying LLMs to a specific industry: medical, healthcare, etc.
Some PEFT Techniques
- LoRA: (most popular). Idea of LoRA (paper by Microsoft).
- LoRA will add to the pre-trained weights some low-rank matrices. And instead of having to train the weights, we only train these low-rank matrices.
- Advantages:
- Number of parameters to be train is only a % of the original model (e.g. 10bil -> 10mil)
- Given the gain of computation over a small reduction in performance, this is the most popular PEFT.
- Issues:
- At a high level, the low-rank matrices are usually chosen separately. This happens to make the optimization of LoRA flat and difficult to get optimal settings. This paper proposes a way to share the structure between these low-rank matrices (called “parameterization” or RepLoRA).
- Prompt Tuning: paper from Goolge Research
- We attach the prompts into input data and train only these prompts.
- Advantage:
- Require less compute resources compared to LoRA.
- Support the pretrained models learn more efficiently.
- Easly adapt to LLMs
- Issues:
- The prompts can be too simple that don’t reflect the input data for learning.
- Solution: Adaptive Prompt: Unlocking the Power of Visual Prompt Tuning
- LLaMA-Adapter: