MATH-PT: A Math Reasoning Benchmark for European and Brazilian Portuguese
arXiv cs.CL · April 30, 2026
This paper introduces MATH-PT, a new dataset of 1,729 mathematical problems in European and Brazilian Portuguese, built to address linguistic bias in evaluations of LLM mathematical reasoning. On this benchmark, frontier reasoning models perform strongly on multiple-choice questions but less well on open-ended ones.