EdgeRazor: A Lightweight Framework for Large Language Models via Mixed-Precision Quantization-Aware Distillation
arXiv CS.LG · May 7, 2026
This research introduces EdgeRazor, a lightweight framework for deploying Large Language Models on resource-constrained devices. It uses mixed-precision quantization-aware distillation to convert full-precision models into lower-bit formats, addressing limitations of earlier quantization methods.