scientific AI agents: articles, news, and AI research

ARTICLE · DEV.to AI · 15/4/2026

LABBench2 Benchmark Shows AI Biology Agents Struggle with Real-World Tasks

Researchers introduced LABBench2, a new 1,900-task benchmark for AI in biology, showing that current models perform 26-46% worse on realistic tasks. This exposes a critical gap between AI's theoretical knowledge and its ability to carry out practical scientific work.

LABBench2 · AI limitations · scientific AI agents · AI in biology