
Improving Cross-Lingual Factual Recall via Consistency-Driven Reinforcement Learning

arXiv CS.CL · June 8, 2026

This paper introduces PolyFact, a multilingual factual QA dataset, to address cross-lingual factual inconsistency in LLMs. It finds that reinforcement learning with Group Relative Policy Optimization (GRPO) consistently improves cross-lingual factual recall and generalization relative to supervised fine-tuning.
