RESEARCH
NumLeak: Public Numeric Benchmarks as Latent Labels in Foundation Models
arXiv CS.LG · June 1, 2026
This paper introduces NumLeak, a framework designed to measure memorized recall in foundation models using public numeric benchmarks. It reveals that top-tier LLMs recall financial and economic data with high fidelity, suggesting that evaluations may be measuring memorization rather than genuine out-of-sample skill.