ReCrit: Transition-Aware Reinforcement Learning for Scientific Critic Reasoning
arXiv cs.LG · May 20, 2026
ReCrit is a new reinforcement learning framework designed to improve how large language models respond when a user criticizes their scientific reasoning. It addresses the tendency of LLMs to abandon correct solutions after user criticism by focusing on inter-turn correctness transitions, that is, whether an answer is correct before and after the critique, and by categorizing the resulting behaviors, such as correction, sycophancy, and robustness.
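To make the idea of inter-turn correctness transitions concrete, here is a minimal Python sketch. It is an illustration, not the paper's implementation: the summary names the categories but does not define them, so the mapping below is an assumption, and the function names and the `persistent_error` label are hypothetical. No reward computation is shown.

```python
from collections import Counter
from typing import Iterable, Tuple


def classify_transition(correct_before: bool, correct_after: bool) -> str:
    """Label one (pre-critique, post-critique) correctness pair.

    Assumed mapping, not taken from the paper:
      correct -> correct : robustness (the model keeps a right answer)
      correct -> wrong   : sycophancy (the model abandons a right answer)
      wrong   -> correct : correction (the model fixes a wrong answer)
      wrong   -> wrong   : persistent_error (hypothetical label)
    """
    if correct_before and correct_after:
        return "robustness"
    if correct_before and not correct_after:
        return "sycophancy"
    if not correct_before and correct_after:
        return "correction"
    return "persistent_error"


def transition_counts(pairs: Iterable[Tuple[bool, bool]]) -> Counter:
    """Count how often each transition type occurs across dialogues."""
    return Counter(classify_transition(before, after) for before, after in pairs)


if __name__ == "__main__":
    # Each tuple is (correct before critique, correct after critique).
    dialogues = [(True, True), (True, False), (False, True), (True, False)]
    print(transition_counts(dialogues))
```

Counting transitions rather than final answers alone separates a model that holds a correct answer under pressure from one that happens to end on a correct answer after flipping.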