
Benchmarking Deflection and Hallucination in Large Vision-Language Models

arXiv CS.CL · April 15, 2026

This paper introduces VLM-DeflectionBench, a benchmark for Large Vision-Language Models (LVLMs) that focuses on deflection and hallucination when the available evidence is conflicting or insufficient. It also proposes a dynamic data curation pipeline that maintains benchmark difficulty over time, and a fine-grained evaluation protocol for disentangling model behavior.
