Training
QLoRA
Quick Answer
A variant of LoRA that quantizes the frozen base model (typically to 4-bit) so that large models can be fine-tuned on a single GPU.
QLoRA combines LoRA with quantization for maximum memory efficiency. The base model's weights are quantized to low precision (typically 4-bit; the original QLoRA work introduced the NF4 data type for this) and kept frozen, while small LoRA adapter matrices are trained in higher precision such as bfloat16. During the forward pass the quantized weights are dequantized on the fly, and gradients flow through them only into the adapters. This trades a small amount of quality for dramatic memory savings: quantization introduces slight degradation, but the trained adapters compensate well. In practice QLoRA makes it feasible to fine-tune models in the 65B–70B parameter range on a single high-memory GPU, which has made it an increasingly popular route to accessible fine-tuning.
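The core mechanics can be illustrated in miniature. The sketch below (a simplification using plain uniform 4-bit quantization rather than NF4, and numpy instead of a deep-learning framework) shows the essential structure: the base weight is stored quantized and frozen, a low-rank delta `B @ A` is kept in full precision, and the forward pass combines the dequantized base with the adapter output. All names here are illustrative, not a real library API.

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize_4bit(w, n_levels=16):
    """Symmetric uniform 4-bit quantization (QLoRA itself uses NF4)."""
    scale = np.abs(w).max() / (n_levels // 2 - 1)
    q = np.clip(np.round(w / scale), -(n_levels // 2), n_levels // 2 - 1)
    return q.astype(np.int8), scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

d, r = 64, 4                                      # hidden size, LoRA rank
W = rng.normal(size=(d, d)).astype(np.float32)    # frozen base weight
qW, scale = quantize_4bit(W)                      # stored in 4-bit form

# Trainable LoRA adapters, kept in full precision.
# B starts at zero so the initial adapter delta is exactly zero.
A = rng.normal(scale=0.01, size=(r, d)).astype(np.float32)
B = np.zeros((d, r), dtype=np.float32)

def forward(x):
    # Dequantize on the fly; only B and A would receive gradient updates
    return x @ dequantize(qW, scale).T + x @ (B @ A).T

x = rng.normal(size=(2, d)).astype(np.float32)
y = forward(x)

# Round-to-nearest keeps per-weight error within half a quantization step
err = np.abs(dequantize(qW, scale) - W).max()
```

The memory saving comes from storing `qW` as 4-bit values (here approximated with int8) instead of 32-bit floats, while the trainable state is only the two small matrices `A` and `B`, whose size scales with the rank `r` rather than with `d * d`.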
Last verified: 2026-04-08