AI Reasoning

20 items

RESEARCH · arXiv CS.AI · 20h ago

Improving Multimodal Reasoning via Worst Dimension Optimization

Multimodal reasoning requires maintaining integrity across diverse constraints such as visual grounding and logical consistency. Current Process Reward Models often hide failures in individual dimensions by weighting all dimensions equally, which compromises the overall reasoning process.

Optimization multimodal AI machine learning AI Reasoning
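
The masking effect described in the summary above can be illustrated with a small sketch. This is not the paper's method: the dimension names, the scores, and the helpers `mean_score` and `worst_dimension` are all hypothetical, chosen only to show how an equally weighted average can hide a weak dimension that a minimum exposes.

```python
from statistics import mean

def mean_score(dim_scores):
    """Equal-weight aggregate over all dimensions (hypothetical helper)."""
    return mean(dim_scores.values())

def worst_dimension(dim_scores):
    """Return the (name, score) of the lowest-scoring dimension (hypothetical helper)."""
    name = min(dim_scores, key=dim_scores.get)
    return name, dim_scores[name]

# Two made-up reasoning steps, scored per dimension in [0, 1].
step_a = {"visual_grounding": 0.2, "logical_consistency": 0.9, "fluency": 1.0}
step_b = {"visual_grounding": 0.7, "logical_consistency": 0.7, "fluency": 0.7}

if __name__ == "__main__":
    for label, step in (("A", step_a), ("B", step_b)):
        name, score = worst_dimension(step)
        print(f"step {label}: mean={mean_score(step):.2f}, worst={name} ({score})")
```

Both steps have essentially the same average, yet step A has a grounding score of 0.2. Scoring by the worst dimension separates the two, which is the kind of failure an equal-weight aggregate would not surface.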

RESEARCH · ↑ trending · Reddit r/MachineLearning · 4/13/2026

Thinking Deeper, Not Longer: Depth-Recurrent Transformers for Compositional Generalization [R]

A discussion of a research paper on Depth-Recurrent Transformers and its findings on compositional and out-of-distribution generalization. It explores how intermediate-step supervision can hinder genuine reasoning by making models overly reliant on statistical heuristics, a concept the discussion extends to foundation models and human intuition.

OOD Generalization Compositional Generalization AI Reasoning Intermediate Supervision

RESEARCH · DEV.to AI · 4/22/2026

Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models

This survey delves into large reasoning models, specifically examining the application of reinforced reasoning techniques to large language models. It offers a comprehensive overview of current methods and progress in enhancing LLM reasoning capabilities.

Survey reinforced learning AI Reasoning large language models

RESEARCH · arXiv CS.CL · 13d ago

Why LLMs Hallucinate on Structured Knowledge: A Mechanistic Analysis of Reasoning over Linearized Representations

This study investigates why LLMs hallucinate when reasoning over linearized structured knowledge. It reveals that hallucinations stem from systematic internal dynamics, such as attention concentrating on shortcut cues and feed-forward layers failing to ground provided knowledge.

neural networks hallucination Knowledge Representation AI Reasoning

RESEARCH · arXiv CS.AI · 26d ago

CHAL: Council of Hierarchical Agentic Language

CHAL (Council of Hierarchical Agentic Language) is a new multi-agent dialectic framework for optimizing beliefs in defeasible domains. It addresses current limitations of multi-agent debate for LLM reasoning by treating defeasible argumentation as an engine for belief optimization.

dialectic frameworks LLMs belief optimization AI Reasoning

RESEARCH · arXiv CS.AI · 4/20/2026

LLM Reasoning Is Latent, Not the Chain of Thought

This position paper argues that large language model (LLM) reasoning should be studied as latent-state trajectory formation rather than faithful surface chain-of-thought (CoT). It formalizes three competing hypotheses regarding the primary object of reasoning, impacting claims about faithfulness, interpretability, and benchmarks.

Chain-of-Thought interpretability AI Reasoning large language models

RESEARCH · arXiv CS.AI · 29d ago

When Does Critique Improve AI-Assisted Theoretical Physics? SCALAR: Structured Critic–Actor Loop for Agentic Reasoning

This paper introduces SCALAR (Structured Critic–Actor Loop for Agentic Reasoning), an Actor–Critic–Judge pipeline applied to theoretical physics problems. It investigates how the interaction between researchers and AI agents affects results in physics reasoning tasks, demonstrating that multi-turn dialogue significantly improves over single-shot attempts.

theoretical physics AI Reasoning Agentic AI large language models

ARTICLE · DEV.to AI · 4/13/2026

AI Agent Black Boxes Have Two Layers — Technical Limits and Business Incentives

The text explores how Chain-of-Thought (CoT) has evolved from an external prompt engineering technique to an internal reasoning capability in advanced AI models. Research indicates that applying external CoT to these models is now ineffective, as the reasoning process has been internalized.

prompt engineering Chain-of-Thought AI Reasoning AI

RESEARCH · arXiv CS.AI · 4/20/2026

LACE: Lattice Attention for Cross-thread Exploration

LACE is a novel framework that enables Large Language Models (LLMs) to coordinate and share insights across multiple parallel reasoning paths through cross-thread attention. It uses a synthetic data pipeline to teach collaborative error correction, leading to an improvement of over 7 points in reasoning accuracy.

synthetic data LLMs attention mechanisms AI Reasoning

ARTICLE · DEV.to AI · 20d ago

Judea Pearl's Ladder of Causation and the Limits of LLM Reasoning

This article explores the fundamental limitations of Large Language Models (LLMs) in causal reasoning, referencing Judea Pearl's Ladder of Causation. It highlights that LLMs often operate at the lowest rung of association, failing to identify true causes and instead patching correlations, which explains common errors in AI tools.

AI limitations Judea Pearl causality AI Reasoning

RESEARCH · arXiv CS.AI · 4/22/2026

AI scientists produce results without reasoning scientifically

LLM-based systems conduct autonomous scientific research but often fail to adhere to epistemic norms, ignoring evidence in 68% of traces. A study across eight domains and over 25,000 runs found that base models primarily determine agent performance and behavior.

LLMs AI Reasoning AI agents scientific research

RESEARCH · arXiv CS.AI · 6d ago

Thinking Past the Answer: Evaluating Harmful Overthinking in Large Reasoning Models

This paper evaluates "harmful overthinking" in Large Reasoning Models, where continued reasoning after a correct answer can destabilize a correct trajectory. It introduces a protocol to distinguish verbose from harmful overthinking, finding issues in multimodal benchmarks.

multimodal AI Overthinking Model Evaluation AI Reasoning
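
The distinction drawn in the summary above, between reasoning that merely continues after a correct answer and reasoning that loses it, can be sketched as a simple labelling rule. The paper's actual protocol is not given here; the function `classify_overthinking`, the idea of answer "checkpoints", and the label names are assumptions made for illustration only.

```python
def classify_overthinking(checkpoint_answers, gold):
    """Label a reasoning trace from the answers it held at successive checkpoints.

    checkpoint_answers: answers extracted at successive points in the trace,
    with the last entry being the final answer (hypothetical input format).
    gold: the reference answer.
    """
    if gold not in checkpoint_answers:
        return "never_correct"
    first_correct = checkpoint_answers.index(gold)
    if first_correct == len(checkpoint_answers) - 1:
        # The correct answer is reached only at the very end: nothing came after it.
        return "no_overthinking"
    if checkpoint_answers[-1] == gold:
        # Kept reasoning after a correct answer, but still ended on it.
        return "verbose"
    # Reached the correct answer, then reasoned its way to a wrong one.
    return "harmful"

if __name__ == "__main__":
    print(classify_overthinking(["12", "15", "15"], gold="15"))
    print(classify_overthinking(["12", "15", "18"], gold="15"))
```

Under this rule, extra reasoning counts as harmful only when the final answer differs from a correct answer the trace had already reached.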

RESEARCH · arXiv CS.CL · 4/30/2026

CogRAG+: Cognitive-Level Guided Diagnosis and Remediation of Memory and Reasoning Deficiencies in Professional Exam QA

CogRAG+ is a training-free framework designed to diagnose and remediate memory and reasoning deficiencies in large language models for professional exam QA. It decouples and aligns retrieval and reasoning with human cognitive hierarchies, employing Reinforced Retrieval and cognition-stratified Constrained Reasoning to enhance accuracy and consistency.

Retrieval Augmented Generation natural language processing AI Reasoning large language models

RESEARCH · arXiv CS.AI · 29d ago

GraphDC: A Divide-and-Conquer Multi-Agent System for Scalable Graph Algorithm Reasoning

This paper introduces GraphDC, a Divide-and-Conquer multi-agent system designed to enhance graph algorithm reasoning in Large Language Models (LLMs). It improves performance by decomposing large graphs into smaller subgraphs for specialized agents, with a master agent integrating the results, leading to better scalability and robustness.

LLMs scalable AI AI Reasoning multi-agent systems
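
The decompose, solve, and integrate pattern in the GraphDC summary above can be sketched on a classic task, counting connected components. This is only an illustration of divide-and-conquer on graphs: plain functions stand in for the paper's LLM agents, and the round-robin partitioning, `solve_subgraph`, and `count_components` are assumptions rather than anything taken from the paper.

```python
def find(parent, x):
    """Union-find lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def union(parent, a, b):
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[rb] = ra

def solve_subgraph(nodes, edges):
    """Worker step: map each node to its component root within one subgraph."""
    parent = {n: n for n in nodes}
    for u, v in edges:
        union(parent, u, v)
    return {n: find(parent, n) for n in nodes}

def count_components(nodes, edges, num_parts=2):
    """Master step: partition the graph, solve each part, merge across parts."""
    nodes = sorted(nodes)
    # Divide: assign nodes to parts round-robin (an arbitrary illustrative choice).
    part_of = {n: i % num_parts for i, n in enumerate(nodes)}
    parts = [[] for _ in range(num_parts)]
    for n in nodes:
        parts[part_of[n]].append(n)
    internal = [[] for _ in range(num_parts)]
    cross = []
    for u, v in edges:
        if part_of[u] == part_of[v]:
            internal[part_of[u]].append((u, v))
        else:
            cross.append((u, v))
    # Conquer: each worker sees only its own nodes and internal edges.
    local_root = {}
    for part_nodes, part_edges in zip(parts, internal):
        local_root.update(solve_subgraph(part_nodes, part_edges))
    # Integrate: join local components that are linked by cross-part edges.
    parent = {r: r for r in set(local_root.values())}
    for u, v in cross:
        union(parent, local_root[u], local_root[v])
    return len({find(parent, r) for r in parent})

if __name__ == "__main__":
    print(count_components([1, 2, 3, 4, 5, 6], [(1, 2), (2, 3), (4, 5)]))
```

Each worker handles only a fraction of the graph, and the master needs just the local roots plus the cross-part edges, which is what makes this style of decomposition attractive for larger inputs.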

RESEARCH · arXiv CS.CL · 26d ago

Correct Answers from Sound Reasoning: Verifiable Process Supervision for Language Models

This paper proposes Verifiable Process Supervision (VPS), a post-training framework to jointly optimize language model prediction accuracy and reasoning quality. VPS uses supervised fine-tuning to induce a structured reasoning format, evaluating intermediate claims against ground-truth signals with adaptive reward weighting.

language models reinforcement learning AI training verifiable AI

RESEARCH · arXiv CS.AI · 27d ago

Do Vision-Language-Models show human-like logical problem-solving capability in point and click puzzle games?

This paper introduces VLATIM, a new benchmark designed to evaluate the human-like logical problem-solving capabilities of Vision-Language Models (VLMs) in point-and-click physics puzzle games. It reveals a significant disparity between reasoning and execution in large proprietary models when solving The Incredible Machine 2.

puzzle games Vision-Language Models interactive AI Benchmarking

RESEARCH · arXiv CS.AI · 27d ago

The Many Faces of On-Policy Distillation: Pitfalls, Mechanisms, and Fixes

On-policy distillation (OPD) and on-policy self-distillation (OPSD) are promising post-training methods for large language models, but their effectiveness is inconsistent. This research empirically investigates their successes and failures, identifying sensitivities to teacher choice and issues with privileged information.

LLMs distillation learning machine learning

RESEARCH · arXiv CS.CL · 28d ago

AIPO: Learning to Reason from Active Interaction

AIPO is a novel reinforcement learning framework that enhances LLM reasoning through active multi-agent interaction during exploration. It addresses the limitations of existing RL algorithms, which are constrained by the policy model's inherent capabilities and rely on sample-inefficient guidance.

LLMs reinforcement learning learning AI Reasoning

RESEARCH · arXiv CS.AI · 5/6/2026

CreativityBench: Evaluating Agent Creative Reasoning via Affordance-Based Tool Repurposing

This paper introduces CreativityBench, a new benchmark to evaluate LLMs' creative reasoning abilities through affordance-based tool repurposing. It details the construction of a large-scale affordance knowledge base and the generation of 14K tasks requiring non-obvious yet physically plausible solutions.

AI Creativity Benchmarking AI Reasoning tool use

RESEARCH · arXiv CS.AI · 21d ago

TTE-Flash: Accelerating Reasoning-based Multimodal Representations via Think-Then-Embed Tokens

This work proposes TTE-Flash, a method to accelerate reasoning-based multimodal representations by replacing explicit Chain-of-Thought (CoT) with latent think tokens. It aims to achieve high-performance, reasoning-aware representations at a constant inference cost.

neural networks multimodal AI machine learning Computational Efficiency