
ITBench-AA: Frontier Models Score Below 50% on the First Benchmark for Agentic Enterprise IT Tasks — by Artificial Analysis and IBM

Hugging Face Blog·May 27, 2026

Current frontier AI models score below 50% on ITBench-AA, the first benchmark for agentic enterprise IT tasks. The study, by Artificial Analysis and IBM, indicates that models need to improve significantly before they can handle enterprise IT work effectively.

Benchmarking · IT automation · Enterprise AI · Frontier models · AI agents
