
Built a political benchmark for LLMs. KIMI K2 can't answer questions about Taiwan (obviously). GPT-5.3 refuses 100% of questions when given an opt-out. [P]

Reddit r/MachineLearning · April 16, 2026

A researcher built a benchmark that places LLMs on a 2D political compass using 98 questions, and argues that refusing to answer is itself a political stance. Initial results cover GPT-5.3, Claude Opus 4.6, and KIMI K2, and the repository is fully open-source.
