CL-bench Life: Can Language Models Learn from Real-Life Context?

arXiv CS.CL · May 1, 2026

CL-bench Life is a new human-curated benchmark designed to assess whether frontier language models can effectively learn from complex, messy real-life contexts. It comprises 405 context-task pairs to test models' ability to reason over personal and social experiences.
