AI Research

RESEARCH · arXiv CS.LG · 29d ago

Path-Based Gradient Boosting for Graph-Level Prediction

We propose PathBoost, a gradient tree boosting method for graph-level classification and regression, which learns discriminative path-based features directly from the input graph structure. This method introduces adaptations for binary classification, incorporates multiple node and edge attributes, and automatically selects anchor nodes, outperforming or matching graph neural networks and graph kernel approaches on several benchmark datasets.

RESEARCH · arXiv CS.CL · 26d ago

Merging Methods for Multilingual Knowledge Editing for Large Language Models: An Empirical Odyssey

This paper investigates the effectiveness of vector merging methods for multilingual knowledge editing (MKE) in Large Language Models, focusing on reducing interference between language-specific edits. Evaluating six merging variants across two LLMs, two editing methods, and 12 languages on the MzsRE benchmark, it finds vector summation with shared covariance to be the most reliable overall strategy.

RESEARCH · arXiv CS.AI · 28d ago

EVOCHAMBER: Test-Time Co-evolution of Multi-Agent System at Individual, Team, and Population Scales

EVOCHAMBER presents a training-free framework that instantiates test-time evolution at three levels over a coevolving agent pool, distinguishing it from single-agent approaches. It features CODREAM, a post-task protocol for collaborative reflection and asymmetric knowledge routing after team failures or disagreements.

RESEARCH · arXiv CS.CL · 27d ago

Bridging the Missing-Modality Gap: Improving Text-Only Calibration of Vision Language Models

Vision-language models (VLMs) suffer significant accuracy drops and severe miscalibration when given text-only inputs, even when the semantic information is preserved. The paper proposes the Latent Imagination Module (LIM), which predicts imagined latent embeddings from text, improving accuracy and reducing calibration error in missing-image scenarios.

RESEARCH · arXiv CS.CL · 27d ago

BoostTaxo: Zero-Shot Taxonomy Induction via Boosting-Style Agentic Reasoning and Constraint-Aware Calibration

BoostTaxo introduces a novel boosting-style LLM framework designed for zero-shot taxonomy induction, aiming to overcome limitations in generalization and efficiency of existing methods. It refines taxonomy construction through a coarse-to-fine parent identification process, leveraging retrieval-augmented definition refinement and hybrid candidate selection.

RESEARCH · arXiv CS.LG · 12d ago

Continuity and Ordinality Matter: Constraining Time Series Tokens for Effective Time Series Analysis with Large Language Models

This paper introduces COM (Continuity and Ordinality Matter), a strategy that integrates geometric constraints into both the initialization and training stages of token-based time series large language models (TS-LLMs). The research demonstrates that preserving continuity and ordinality in time series token embeddings significantly improves model performance and generalizability.

RESEARCH · arXiv CS.AI · 16d ago

PathCal: State-Aware Reflection-Marker Calibration for Efficient Reasoning

This paper introduces PathCal and investigates the distinct functional roles and timing of reflection markers in the Chain-of-Thought trajectories of large reasoning models. It finds that markers such as 'wait' and 'but' differ significantly in their effect on accuracy and generation length, challenging earlier coarse-grained approaches.

RESEARCH · arXiv CS.CL · 16d ago

Graph Alignment Topology as an Inductive Bias for Grounding Detection

Large Language Models (LLMs) are optimized to produce plausible continuations rather than to verify explicitly whether generated propositions are entailed by source documents, which limits their use in critical domains. This research proposes using alignment topology as an inductive bias: it constructs aligned bipartite graphs between reference information and LLM outputs, then trains a Graph Neural Network (GNN) on these graphs.

RESEARCH · arXiv CS.CL · 7d ago

Linear Probes Detect Task Format, Not Reasoning Mode in Language Model Hidden States

This paper shows that linear probes, often used to identify distinct reasoning representations in LLM hidden states, actually detect task format rather than reasoning mode. The high probe accuracy observed on benchmarks with Qwen3-14B vanished once format variables were controlled for, suggesting that the reasoning is largely shared and not functionally linked to hidden-state geometry.

RESEARCH · arXiv CS.AI · 16d ago

NeuroNL2LTL: A Neurosymbolic Framework for Natural Language Translation of Linear Temporal Logic

NeuroNL2LTL is a neurosymbolic architecture that unifies learned translation with formal verification to translate natural language into Linear Temporal Logic. It employs verifier-in-the-loop training, where verification outcomes serve as reward signals for reinforcement learning, optimizing for formal correctness.
