Improving Cross-Lingual Factual Recall via Consistency-Driven Reinforcement Learning
arXiv CS.CL · June 8, 2026
This research introduces PolyFact, a multilingual factual QA dataset, to address cross-lingual factual inconsistency in LLMs. It finds that reinforcement learning via GRPO consistently improves cross-lingual factual recall and generalization compared to supervised fine-tuning.