Free Energy-Driven Reinforcement Learning with Adaptive Advantage Shaping for Unsupervised Reasoning in LLMs
arXiv cs.CL · May 7, 2026
FREIA is a reinforcement learning algorithm for improving unsupervised reasoning in LLMs, designed to address the lack of adaptability in existing methods. It uses a Free Energy-Driven Reward (FER) to balance consensus and exploration, and Adaptive Advantage Shaping (AAS) to adjust learning signals. FREIA outperforms unsupervised baselines across a range of reasoning tasks, particularly mathematical reasoning.