Theory-optimal Quantization Based on Flatness
arXiv cs.LG · May 20, 2026
This research models the relationship between quantization error and outliers in Large Language Models (LLMs) and introduces a new metric, Flatness, to quantify the outlier distribution. From this model it derives a theoretically optimal solution and proposes Bidirectional Diagonal Quantization (BDQ), a method for post-training quantization.
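The link between outliers and quantization error can be illustrated with a minimal sketch. It does not implement the paper's Flatness metric or BDQ, neither of which is defined in this summary. It assumes plain symmetric per-tensor round-to-nearest quantization at 4 bits, and the names `quantize_dequantize`, `mse`, `flat`, and `spiky` are hypothetical, chosen only for this illustration.

```python
import numpy as np

def quantize_dequantize(x, bits=4):
    """Symmetric per-tensor round-to-nearest quantization, then dequantization.

    Assumption: a generic baseline quantizer, not the method from the paper.
    """
    qmax = 2 ** (bits - 1) - 1           # e.g. 7 for 4-bit signed values
    scale = np.max(np.abs(x)) / qmax     # the largest magnitude sets the step size
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale

def mse(x, bits=4):
    """Mean squared error introduced by quantizing x."""
    return float(np.mean((x - quantize_dequantize(x, bits)) ** 2))

rng = np.random.default_rng(0)
flat = rng.uniform(-1.0, 1.0, size=1024)  # values spread evenly, no outliers
spiky = flat.copy()
spiky[0] = 50.0                           # one hypothetical outlier

mse_flat = mse(flat)
mse_spiky = mse(spiky)
print(f"no outlier:  {mse_flat:.5f}")
print(f"one outlier: {mse_spiky:.5f}")
```

In this setup a single large value stretches the quantization step so far that most of the remaining values round to zero, which is why the error with the outlier is much larger than without it.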