Bookmark: LoRA: Low-Rank Adaptation of Large Language Models

lqdev👽06/01/2023

LoRA reduces the number of trainable parameters by learning pairs of rank-decompostion matrices while freezing the original weights. This vastly reduces the storage requirement for large language models adapted to specific tasks and enables efficient task-switching during deployment all without introducing inference latency. LoRA also outperforms several other adaptation methods including adapter, prefix-tuning, and fine-tuning.

Paper

Permalink: /feed/microsoft-lora-llm/

Tags: #ai #finetune #llm

Back to feed

Send me a message or webmention