Reasoning

57 items

RESEARCH · arXiv CS.CL · 26d ago

TimelineReasoner: Advancing Timeline Summarization with Large Reasoning Models

TimelineReasoner is a novel framework that leverages Large Reasoning Models (LRMs) to advance timeline summarization, moving beyond passive Large Language Model (LLM) generation. It employs a two-stage, reasoning-driven process—Global Cognition and Detail Exploration—to actively extract and refine structured timelines from unstructured online news content.

27
RESEARCH · arXiv CS.CL · 20d ago

Diagnosing Multi-step Reasoning Failures in Black-box LLMs via Stepwise Confidence Attribution

This paper introduces Stepwise Confidence Attribution (SCA), a framework for closed-source LLMs that diagnoses multi-step reasoning failures by assigning step-level confidence. SCA applies the Information Bottleneck principle, flagging deviations from consensus structures as potential errors, and proposes two complementary methods: NIBS and GIBS.

27
RESEARCH · arXiv CS.AI · 15d ago

PathCal: State-Aware Reflection-Marker Calibration for Efficient Reasoning

This paper introduces PathCal, which investigates the distinct functional roles and timing of reflection markers in the Chain-of-Thought trajectories of Large Reasoning Language Models. It finds that markers such as 'wait' and 'but' differ significantly in their effect on accuracy and generation length, challenging earlier approaches that treated all reflection markers alike.

27
RESEARCH · arXiv CS.CL · 8d ago

Can LLM Teams Play What? Where? When?

This research explores how team-based interaction improves Large Language Model (LLM) performance on complex reasoning tasks, using the quiz game "What? Where? When?" as the testbed. It reports that team strategies yield significant accuracy gains, with the best teams approaching human performance.

27
RESEARCH · arXiv CS.CL · 6d ago

Adaptive Latent Agentic Reasoning

This research introduces Adaptive Latent Agentic Reasoning (ALAR), a dual-mode framework designed to enhance the efficiency of LLM agents. ALAR uses compact latent reasoning for routine tasks and escalates to explicit chain-of-thought when deeper deliberation is required, leading to comparable or better task accuracy with substantial efficiency gains.

27
RESEARCH · arXiv CS.AI · 4/9/2026

SELFDOUBT: Uncertainty Quantification for Reasoning LLMs via the Hedge-to-Verify Ratio

This paper proposes SELFDOUBT, a single-pass framework for quantifying uncertainty in reasoning LLMs, especially those served through proprietary APIs. It uses the Hedge-to-Verify Ratio (HVR) to identify markers of uncertainty and self-assessment directly in the reasoning trace, outperforming costly sampling-based methods.

27
RESEARCH · arXiv CS.AI · 4/30/2026

Auto-Relational Reasoning

Researchers propose a theoretical framework for automated relational reasoning that integrates Machine Learning with rigid reasoning to overcome the limitations of current large models. The resulting system performs strongly on IQ problems, solving 98.03% of them without prior knowledge.

27