
Calibrated Preference Learning: The Case of Label Ranking

arXiv cs.LG · June 1, 2026

This paper formalizes calibration for probabilistic label ranking, introducing a hierarchy of notions for full, sub-ranking, and top-k calibration. Empirically, popular label ranking models are often poorly calibrated, with implications for RLHF reward models.

Calibration, AI models, ranking, machine learning, RLHF
