
AI Reasoning

20 items

RESEARCH · ↑ trending · Reddit r/MachineLearning · 4/13/2026

Thinking Deeper, Not Longer: Depth-Recurrent Transformers for Compositional Generalization [R]

This post discusses a research paper on Depth-Recurrent Transformers and its findings on compositional and out-of-distribution generalization. It explores how supervising intermediate steps can hinder genuine reasoning in AI models by making them overly reliant on statistical heuristics, and extends the idea to foundation models and human intuition.

42
RESEARCH · arXiv CS.AI · 29d ago

When Does Critique Improve AI-Assisted Theoretical Physics? SCALAR: Structured Critic–Actor Loop for Agentic Reasoning

This paper introduces SCALAR (Structured Critic–Actor Loop for Agentic Reasoning), an Actor–Critic–Judge pipeline applied to theoretical physics problems. It investigates how the interaction between researchers and AI agents affects results on physics reasoning tasks, and reports that multi-turn dialogue significantly improves on single-shot attempts.

28
RESEARCH · arXiv CS.CL · 4/30/2026

CogRAG+: Cognitive-Level Guided Diagnosis and Remediation of Memory and Reasoning Deficiencies in Professional Exam QA

CogRAG+ is a training-free framework designed to diagnose and remediate memory and reasoning deficiencies in large language models for professional exam QA. It decouples and aligns retrieval and reasoning with human cognitive hierarchies, employing Reinforced Retrieval and cognition-stratified Constrained Reasoning to enhance accuracy and consistency.

27
RESEARCH · arXiv CS.AI · 29d ago

GraphDC: A Divide-and-Conquer Multi-Agent System for Scalable Graph Algorithm Reasoning

This paper introduces GraphDC, a Divide-and-Conquer multi-agent system designed to enhance graph algorithm reasoning in Large Language Models (LLMs). It improves performance by decomposing large graphs into smaller subgraphs for specialized agents, with a master agent integrating the results, leading to better scalability and robustness.

27
RESEARCH · arXiv CS.CL · 26d ago

Correct Answers from Sound Reasoning: Verifiable Process Supervision for Language Models

This paper proposes Verifiable Process Supervision (VPS), a post-training framework that jointly optimizes a language model's prediction accuracy and reasoning quality. VPS uses supervised fine-tuning to induce a structured reasoning format and evaluates intermediate claims against ground-truth signals, with adaptive reward weighting.

27
RESEARCH · arXiv CS.AI · 27d ago

Do Vision-Language-Models show human-like logical problem-solving capability in point and click puzzle games?

This paper introduces VLATIM, a new benchmark for evaluating whether Vision-Language Models (VLMs) show human-like logical problem-solving in point-and-click physics puzzle games. It reveals a significant disparity between reasoning and execution in large proprietary models when they solve puzzles in The Incredible Machine 2.

27