
How to Fine-Tune a Reasoning Model? A Teacher-Student Cooperation Framework to Synthesize Student-Consistent SFT Data

arXiv CS.CL · April 17, 2026

This paper proposes TESSY, a Teacher-Student Cooperation Data Synthesis framework, to address the performance drops that occur when reasoning models are fine-tuned on teacher-generated data. TESSY generates synthetic sequences that inherit the teacher's advanced reasoning while remaining stylistically consistent with the student model's distribution.
