Physics-R1: An Audited Olympiad Corpus and Recipe for Visual Physics Reasoning
arXiv CS.CL · May 15, 2026
This paper audits multimodal-physics evaluation pipelines, uncovering construction practices that distort how vision-language reasoning is measured. It addresses train-eval contamination, translation drift, and MCQ saturation, releasing new artifacts to tackle these gaps.