← heapsort

Correct Answers from Sound Reasoning: Verifiable Process Supervision for Language Models

arXiv CS.CL · May 14, 2026

This paper proposes Verifiable Process Supervision (VPS), a post-training framework that jointly optimizes a language model's prediction accuracy and reasoning quality. VPS uses supervised fine-tuning to induce a structured reasoning format and evaluates intermediate claims against ground-truth signals, using adaptive reward weighting.
