
KMMMU: Evaluation of Massive Multi-discipline Multimodal Understanding in Korean Language and Context

arXiv cs.CL · April 16, 2026

KMMMU is a new native Korean benchmark for evaluating multimodal understanding in Korean cultural and institutional settings. It comprises 3,466 questions drawn from native Korean exams. The study reports that current AI models reach only 42.05% accuracy on the full set, with notable failures on culturally specific and discipline-specific problems.

language models multimodal AI evaluation Benchmarking
