RESEARCH27

EdgeRazor: A Lightweight Framework for Large Language Models via Mixed-Precision Quantization-Aware Distillation

arXiv CS.LG·May 7, 2026

This research introduces EdgeRazor, a lightweight framework designed to deploy Large Language Models on resource-constrained devices. It leverages mixed-precision quantization-aware distillation to convert full-precision models into lower-bit formats, overcoming limitations of previous quantization methods.

LLMs deep learning quantization model optimization edge computing

Read original ↗