RESEARCH
GFT: From Imitation to Reward Fine-Tuning with Unbiased Group Advantages and Dynamic Coefficient Rectification
arXiv CS.AI · April 17, 2026
This work introduces Group Fine-Tuning (GFT), a unified post-training framework for large language models. It addresses intrinsic limitations of supervised fine-tuning (SFT), such as single-path dependency and entropy collapse, through Group Advantage Learning and Dynamic Coefficient Rectification.