RESEARCH · arXiv CS.CL · 4/15/2026
Benchmarking Deflection and Hallucination in Large Vision-Language Models
This paper introduces VLM-DeflectionBench, a benchmark for Large Vision-Language Models (LVLMs) that targets deflection and hallucination when the available evidence is conflicting or insufficient. It also proposes a dynamic data curation pipeline designed to maintain benchmark difficulty over time, and a fine-grained evaluation protocol for disentangling model behavior.