RESEARCH · arXiv CS.AI · 14d ago
Your Agents Are Aging Too: Agent Lifespan Engineering for Deployed Systems
Long-lived AI agents are increasingly deployed as persistent operational systems, yet they are often evaluated without regard to their long-term reliability. This paper introduces AgingBench, a longitudinal reliability benchmark for agent lifespan engineering that measures degradation and identifies repair targets.