← heapsort-ai

failure detection

1 items

RESEARCHarXiv CS.CL·22d ago

PQR: A Framework to Generate Diverse and Realistic User Queries that Elicit QA Agent Failures

This paper introduces PQR, a framework designed to generate diverse and realistic user queries that elicit failures in LLM-based QA agents, going beyond existing methods that primarily focus on adversarial users. PQR operates through iterative query and prompt refinement modules to create realistic test scenarios that expose agent vulnerabilities.

28