interactive AI

RESEARCH · arXiv CS.AI · 5/9/2026

BALAR: A Bayesian Agentic Loop for Active Reasoning

This paper introduces BALAR (Bayesian Agentic Loop for Active Reasoning), a task-agnostic outer-loop algorithm enabling structured multi-turn interaction between an LLM agent and a user. BALAR maintains a structured belief over latent states, selects clarifying questions by maximizing expected mutual information, and significantly outperforms baselines across diverse reasoning benchmarks.
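The question-selection idea in the summary, choosing the clarifying question with the highest expected mutual information under the current belief, can be illustrated with a minimal sketch. This is not BALAR's implementation: it assumes a small discrete set of latent states, a known answer model per question, and hypothetical names throughout (`expected_information_gain`, `select_question`, and the toy `questions` table).

```python
import math


def entropy(dist):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)


def expected_information_gain(prior, likelihood):
    """Mutual information between the latent state and the answer.

    prior[s]         = current belief P(state = s)
    likelihood[s][a] = assumed P(answer = a | state = s) for one question
    """
    n_answers = len(likelihood[0])
    expected_posterior_entropy = 0.0
    for a in range(n_answers):
        # Marginal probability of observing answer a under the belief.
        p_answer = sum(prior[s] * likelihood[s][a] for s in range(len(prior)))
        if p_answer == 0:
            continue
        # Bayes update of the belief given answer a.
        posterior = [prior[s] * likelihood[s][a] / p_answer
                     for s in range(len(prior))]
        expected_posterior_entropy += p_answer * entropy(posterior)
    return entropy(prior) - expected_posterior_entropy


def select_question(prior, questions):
    """Return the name of the question with the highest expected gain."""
    return max(questions,
               key=lambda q: expected_information_gain(prior, questions[q]))


# Toy setup (hypothetical): four equally likely latent states, two-answer questions.
prior = [0.25, 0.25, 0.25, 0.25]
questions = {
    "halves": [[1, 0], [1, 0], [0, 1], [0, 1]],              # splits states 2 vs 2
    "isolate_first": [[1, 0], [0, 1], [0, 1], [0, 1]],       # splits states 1 vs 3
    "uninformative": [[0.5, 0.5]] * 4,                       # answer ignores state
}

best = select_question(prior, questions)
```

In this toy setup the "halves" question yields one full bit of expected gain, the uninformative question yields none, and so the selector picks "halves". A real agent would also need a way to elicit the answer model from the LLM, which this sketch takes as given.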

RESEARCH · arXiv CS.AI · 27d ago

Do Vision-Language Models show human-like logical problem-solving capability in point-and-click puzzle games?

This paper introduces VLATIM, a new benchmark designed to evaluate the human-like logical problem-solving capabilities of Vision-Language Models (VLMs) in point-and-click physics puzzle games. It reveals a significant disparity between reasoning and execution in large proprietary models when solving puzzles in The Incredible Machine 2.
