large language models

262 items

RESEARCH · arXiv CS.LG · 4/16/2026

Pareto-Optimal Offline Reinforcement Learning via Smooth Tchebysheff Scalarization

This paper introduces STOMP, an offline reinforcement learning algorithm for multi-objective optimization based on smooth Tchebysheff scalarization. It addresses the inability of linear scalarization to recover non-convex Pareto fronts, a limitation that matters when aligning large language models and in other real-world applications with conflicting rewards.

31
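The summary above does not give STOMP's exact objective. As a rough illustration only, the sketch below uses a log-sum-exp smoothing of the weighted Tchebysheff (max) scalarization, a form that appears in the multi-objective optimization literature. The function names, the minimization convention, the `ideal` reference point, and the smoothing parameter `mu` are assumptions made here, not details taken from the paper.

```python
import math


def tchebysheff(losses, weights, ideal):
    """Hard Tchebysheff scalarization: the worst weighted gap to the ideal point."""
    return max(w * (l - z) for l, w, z in zip(losses, weights, ideal))


def smooth_tchebysheff(losses, weights, ideal, mu=0.1):
    """Smooth surrogate: mu * log(sum(exp(weighted_gap / mu))).

    `mu` is a hypothetical smoothing parameter; smaller values track the
    hard maximum more closely.
    """
    terms = [w * (l - z) / mu for l, w, z in zip(losses, weights, ideal)]
    shift = max(terms)  # subtract the max before exponentiating, for stability
    return mu * (shift + math.log(sum(math.exp(t - shift) for t in terms)))


if __name__ == "__main__":
    losses, weights, ideal = [1.0, 2.0], [0.5, 0.5], [0.0, 0.0]
    print(tchebysheff(losses, weights, ideal))
    print(smooth_tchebysheff(losses, weights, ideal, mu=0.1))
```

For `k` objectives the smooth value is never below the hard maximum and exceeds it by at most `mu * log(k)`, so it stays differentiable while approximating the max.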
RESEARCH · arXiv CS.AI · 5d ago

Thinking Through Signs: PEEL as a Semiotic Scaffolding for Epistemically Accountable AI-Enabled Research

This commentary introduces PEEL, a working scaffolding combining deterministic distant reading with LLM interpretation, grounded in Peircean semiotics and abductive reasoning. Applied to AI-generated condensations, PEEL reveals systematic distortions invisible without non-AI measurement, implying deterministic instruments must accompany AI tools to ensure fidelity and epistemic authority.

31
ARTICLE · DEV.to AI · 3d ago

This article compares the costs of various AI models, highlighting cheaper alternatives to GPT-4o. It reports significant savings from models like GPT-4o-mini, DeepSeek V4 Flash, and Qwen3-32B, which it says can be up to 40 times more cost-effective.

30
ARTICLE · DEV.to AI · 3d ago

This article details an indie hacker's discovery of substantial cost savings by leveraging alternative AI models via the Global API, comparing their pricing against GPT-4o. It highlights how developers can reduce expenses for large language model inference using a wide range of available options.

30
RESEARCH · arXiv CS.CL · 5d ago

Cross-Prompt Generalization in Detecting AI-Generated Fake News Using Interpretable Linguistic Features

This study investigates cross-prompt generalization in detecting AI-generated fake news using interpretable linguistic features like lexical diversity and readability. A random forest classifier achieved consistently high performance (AUC 0.988-1.000) across various train-test combinations, demonstrating robustness against different prompting strategies.

29
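The fake-news item above names lexical diversity and readability as feature families but does not list the exact features. As a hedged illustration, the sketch below computes two crude stand-ins: type-token ratio for lexical diversity and average words per sentence as a rough readability proxy. The function names and the tokenization rule are assumptions made here, not the study's pipeline, and no classifier is included.

```python
import re


def tokenize(text):
    """Lowercase word tokens; a deliberately simple, hypothetical tokenizer."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(text):
    """Lexical diversity: distinct words divided by total words."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def avg_sentence_length(text):
    """Readability proxy: mean number of words per sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return len(tokenize(text)) / len(sentences) if sentences else 0.0


if __name__ == "__main__":
    sample = "The cat sat. The dog ran!"
    print(type_token_ratio(sample), avg_sentence_length(sample))
```

Features like these are interpretable because each maps to a single, human-readable property of the text, which is what makes them usable as inputs to a tree-based classifier such as the random forest mentioned in the summary.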
ARTICLE · DEV.to AI · 4/11/2026

Why Your pip Install Output Doesn't Belong in Claude's Context

This article argues that the verbose output of the `pip install` command is unnecessary and even harmful in the context of AI models such as Claude, which only need to know whether the Python package installation succeeded or failed. Verbose details such as progress bars and build logs are treated as noise that does not help the AI debug.

29
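One way to act on the pip item above is to run the install in a subprocess and pass only a condensed status to the model. The sketch below is an illustration under assumptions, not the article's tool: the function names are hypothetical, and keeping the last few output lines on failure is a design choice made here on the assumption that the error message tends to appear near the end of the output.

```python
import subprocess
import sys


def summarize_pip_result(returncode, output, max_error_lines=5):
    """Collapse pip output to a one-line status, plus a short tail on failure."""
    if returncode == 0:
        return "pip install: OK"
    # Assumption: the useful error text is near the end of the output.
    lines = [line for line in output.splitlines() if line.strip()]
    tail = lines[-max_error_lines:]
    return "\n".join([f"pip install: FAILED (exit code {returncode})", *tail])


def run_pip_install(package):
    """Run pip for `package` and return only the condensed summary."""
    proc = subprocess.run(
        [sys.executable, "-m", "pip", "install", package],
        capture_output=True,
        text=True,
    )
    return summarize_pip_result(proc.returncode, proc.stdout + proc.stderr)
```

Calling `run_pip_install("requests")` would return a single status line on success, so progress bars and build logs never reach the model's context.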
RESEARCH · arXiv CS.LG · 4/16/2026

Design Conditions for Intra-Group Learning of Sequence-Level Rewards: Token Gradient Cancellation

This paper presents a necessary condition for designing intra-group learning algorithms in reinforcement learning: objectives must maintain gradient exchangeability across token updates to prevent reward-irrelevant drift. It proposes minimal transformations to restore this cancellation structure, which stabilizes training and improves sample efficiency.

29
RESEARCH · arXiv CS.LG · 5/7/2026

Structured Progressive Knowledge Activation for LLM-Driven Neural Architecture Search

This paper introduces Structured Progressive Knowledge Activation (SPARK) to address the challenge of integrating architectural knowledge in LLM-driven Neural Architecture Search (NAS). SPARK mitigates "functional entanglement" by enabling factor-conditioned editing, leading to more targeted and reliable architecture modifications.

29
RESEARCH · arXiv CS.LG · 4/22/2026

Handling and Interpreting Missing Modalities in Patient Clinical Trajectories via Autoregressive Sequence Modeling

This work addresses the challenge of missing modalities in multimodal clinical data for diagnosis by reframing it as an autoregressive sequence modeling task. It leverages causal decoders from LLMs and missingness-aware contrastive pre-training to outperform baselines on benchmarks like MIMIC-IV and eICU.

29
RESEARCH · arXiv CS.LG · 4/28/2026

Stochastic KV Routing: Enabling Adaptive Depth-Wise Cache Sharing

This work addresses the significant memory footprint of Key-Value (KV) caching in transformer language models by optimizing along the depth dimension. It introduces a method for cross-layer cache sharing, reports that a layer's cache can be dropped without information loss, and suggests a training approach that uses random cross-layer attention.

29
RESEARCH · arXiv CS.CL · 4/13/2026

Drift and selection in LLM text ecosystems

This paper introduces a mathematical framework to analyze the recursive process where AI-generated text re-enters and shapes the public record from which LLMs learn. It distinguishes between "drift," which removes rare forms through unfiltered reuse, and "selection," which filters content based on criteria like quality, showing normative selection preserves deeper linguistic structures.

29