RESEARCH · arXiv CS.LG · 8d ago
NumLeak: Public Numeric Benchmarks as Latent Labels in Foundation Models
This paper introduces NumLeak, a framework for measuring memorized recall in foundation models using public numeric benchmarks. It reports that top-tier LLMs recall financial and economic data with high fidelity, suggesting that evaluations built on such data may be measuring memorization rather than genuine out-of-sample skill.