CL-bench Life: Can Language Models Learn from Real-Life Context?

arXiv cs.CL · May 1, 2026

CL-bench Life is a new human-curated benchmark designed to assess whether frontier language models can learn effectively from complex, messy real-life contexts. It comprises 405 context-task pairs that test models' ability to reason over personal and social experiences.

context-learning language models benchmarks
