
LABBench2: An Improved Benchmark for AI Systems Performing Biology Research

arXiv CS.AI · April 14, 2026

LABBench2 is an improved benchmark for evaluating AI systems that perform biology research, and the successor to the original LAB-Bench. It comprises nearly 1,900 tasks and aims to measure real-world capability on useful scientific work rather than basic knowledge and reasoning alone.
