
large language models

262 items

RESEARCH · arXiv CS.LG · 26d ago

Multi-Rollout On-Policy Distillation via Peer Successes and Failures

The paper introduces Multi-Rollout On-Policy Distillation (MOPD), a framework that uses a student's local rollout group to construct more informative teacher signals for post-training large language models. MOPD conditions the teacher on both successful and failed peer rollouts, leveraging successes for valid reasoning patterns and failures for avoiding plausible mistakes.

RESEARCH · arXiv CS.CL · 26d ago

TimelineReasoner: Advancing Timeline Summarization with Large Reasoning Models

TimelineReasoner is a framework that uses Large Reasoning Models (LRMs) for timeline summarization, moving beyond passive Large Language Model (LLM) generation. It employs a two-stage, reasoning-driven process (Global Cognition and Detail Exploration) to actively extract and refine structured timelines from unstructured online news content.

RESEARCH · arXiv CS.LG · 27d ago

LEAP: Unlocking dLLM Parallelism via Lookahead Early-Convergence Token Detection

Diffusion Language Models (dLLMs) are limited in parallelism by overly conservative confidence thresholds. This paper introduces LEAP, a training-free, plug-and-play method that detects early-converging tokens to increase dLLM parallelism and accelerate decoding.

RESEARCH · arXiv CS.AI · 27d ago

Rethinking LLMOps for Fraud and AML: Building a Compliance-Grade LLM Serving Stack

The paper proposes a specialized LLMOps stack for fraud detection and anti-money laundering (AML) compliance, whose serving requirements differ from those of generic chat workloads. The stack combines several techniques to handle evidence-rich, schema-constrained prompts efficiently and to deliver compliance-grade performance with self-hosted open-weight LLMs.

RESEARCH · arXiv CS.LG · 11d ago

Continuity and Ordinality Matter: Constraining Time Series Tokens for Effective Time Series Analysis with Large Language Models

This paper introduces COM (Continuity and Ordinality Matter), a strategy that integrates geometric constraints into both the initialization and training stages of token-based time series large language models (TS-LLMs). The research demonstrates that preserving continuity and ordinality in time series token embeddings significantly improves model performance and generalizability.

RESEARCH · arXiv CS.CL · 14d ago

TriVAL: A Tri-Validation Framework for Faithful Automatic Optimization Modeling

TriVAL is a novel tri-validation framework designed to enhance the accuracy of automatic optimization modeling by addressing the lack of explicit validation in current methods. It implements a construct-validate-revise loop across semantic specification, mathematical formulation, and code generation stages to mitigate errors and improve overall modeling fidelity.

RESEARCH · arXiv CS.AI · 6d ago

ChatHealthAI: Aligning Electronic Health Record Representations with Large Language Models for Grounded Clinical Reasoning

ChatHealthAI proposes a multimodal framework to align structured electronic health record (EHR) representations with large language models (LLMs). This integration enables clinically grounded natural-language reasoning and accurate patient prediction, bridging the gap between predictive EHR models and interpretable LLM reasoning.

RESEARCH · arXiv CS.AI · 15d ago

PathCal: State-Aware Reflection-Marker Calibration for Efficient Reasoning

The paper introduces PathCal and investigates the functional roles and timing of reflection markers in the Chain-of-Thought trajectories of Large Reasoning Language Models. It finds that markers such as 'wait' and 'but' differ significantly in their effect on accuracy and generation length, challenging earlier coarse-grained approaches.

RESEARCH · arXiv CS.CL · 8d ago

Configurable Reward Model for Balanced Safety Alignment

This paper introduces the Configurable Safety Reward Model (CSRM) to address the challenge of aligning LLMs with heterogeneous and rapidly evolving safety requirements. CSRM is jointly optimized for calibrated safety compliance and reward modeling, which substantially improves generalization to previously unseen safety configurations and achieves state-of-the-art performance on benchmarks.

RESEARCH · arXiv CS.CL · 8d ago

When English Rewrites Local Knowledge: Global Narrative Dominance in Large Language Models

This research paper investigates global narrative dominance in Large Language Models (LLMs), where local cultural knowledge is often overshadowed by global narratives. It introduces the CulturalNB dataset for Bengali cultural contexts and demonstrates that questions asked in English tend to increase global substitution and institutional framing, reducing local perspective coverage.

RESEARCH · arXiv CS.CL · 15d ago

Evaluating Large Language Models in a Complex Hidden Role Game

This research quantifies the deceptive potential of Large Language Models (LLMs) in the social deduction game Secret Hitler, introducing novel metrics and an open-source framework. The study benchmarks LLMs against rule-based algorithms and human games, revealing a gap between conversational ability and strategic depth, and showing that reasoning-enhancement techniques can worsen performance for fascist roles.

RESEARCH · arXiv CS.CL · 12d ago

EvoSpec: Evolving Speculative Decoding via Real-Time Vocabulary and Parameter Adaptation

EvoSpec introduces a framework for real-time evolution of draft models in speculative decoding for Large Language Models, addressing the bottleneck of large vocabulary sizes. It uses dynamic vocabulary and parameter adaptation, employing a context-aware mechanism and a lightweight online alignment strategy to improve acceptance rates and minimize distributional gaps.
