RESEARCH27
Parameter Efficiency Is Not Memory Efficiency: Rethinking Fine-Tuning for On-Device LLM Adaptation
arXiv CS.LG · April 28, 2026
This research challenges the assumption that Parameter-Efficient Fine-Tuning (PEFT) is also memory-efficient for on-device LLM adaptation, showing that existing PEFT methods can still trigger out-of-memory errors. It introduces LARS (Low-memory Activation-Rank Subspace), a framework that decouples memory consumption from sequence length by constraining the activation subspace, reducing the memory footprint by 33.54% on average.
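The gap between parameter count and memory use can be illustrated with a back-of-envelope estimate. The sketch below is not the LARS method and does not reproduce the paper's measurements. It assumes a LoRA-style adapter and a hypothetical model shape (`d_model=2048`, `n_layers=24`, `rank=8`), plus a crude guess of 16 saved activation tensors per layer (`saved_per_layer`). Under those assumptions, the trainable state stays fixed while the stored activations grow linearly with sequence length.

```python
# Rough memory estimate for adapter-style fine-tuning.
# All shapes and constants are illustrative assumptions, not values from the paper.

def lora_trainable_params(d_model: int, rank: int, n_layers: int,
                          adapted_per_layer: int = 4) -> int:
    """Trainable parameters when each adapted d_model x d_model weight
    gets two low-rank factors: A (rank x d_model) and B (d_model x rank)."""
    return n_layers * adapted_per_layer * 2 * d_model * rank


def trainable_state_bytes(n_params: int, bytes_per_el: int = 4) -> int:
    """Weights + gradients + two Adam moment buffers for the trainable parameters."""
    return n_params * bytes_per_el * 4


def activation_bytes(batch: int, seq_len: int, d_model: int, n_layers: int,
                     saved_per_layer: int = 16, bytes_per_el: int = 2) -> int:
    """Activations kept for the backward pass, assuming each layer saves
    `saved_per_layer` tensors of shape (batch, seq_len, d_model).
    Attention-score matrices are ignored (assumes a fused attention kernel)."""
    return batch * seq_len * d_model * n_layers * saved_per_layer * bytes_per_el


if __name__ == "__main__":
    d_model, n_layers, rank = 2048, 24, 8  # hypothetical model shape
    params = lora_trainable_params(d_model, rank, n_layers)
    state = trainable_state_bytes(params)
    print(f"trainable params: {params:,}  state: {state / 2**20:.0f} MiB")
    for seq_len in (512, 2048, 4096):
        act = activation_bytes(1, seq_len, d_model, n_layers)
        print(f"seq_len={seq_len:5d}  activations: {act / 2**20:.0f} MiB")
```

With these assumed numbers the adapter state is tens of MiB, while the activations reach several GiB at longer sequence lengths. This is the kind of sequence-length dependence the summary says LARS is designed to remove.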