RESEARCH · arXiv CS.LG · 14d ago
ARBITER: Reasoning Trajectory Basins and Majority Vote Failures in Test-Time Sampling
When language models use test-time sampling and majority vote, reasoning trajectories concentrate into non-independent