← heapsort-ai

cultural reasoning

1 items

RESEARCHarXiv CS.CL·5/4/2026

Cultural Benchmarking of LLMs in Standard and Dialectal Arabic Dialogues

New research addresses the gap in evaluating cultural reasoning in LLMs, introducing ArabCulture-Dialogue, a culturally grounded conversational dataset covering 13 Arabic-speaking countries. Experiments indicate that models perform worse on cultural reasoning, translation, and generation tasks in dialectal setups compared to Modern Standard Arabic.

27