
LLMs


RESEARCH · arXiv CS.LG · 15d ago

LLM-AutoSciLab: Closed-Loop Scientific Discovery via Active Experimentation with LLMs

LLM-AutoSciLab proposes a closed-loop framework for scientific discovery, moving beyond static inference by actively coupling hypothesis generation with experiment selection and mechanism refinement. It iteratively suggests plausible hypotheses, selects informative experiments to distinguish or refine them, and updates its state using the resulting evidence.

RESEARCH · arXiv CS.CL · 15d ago

SLAP: Stratified Loss-based Pruning for On-Policy Data-Efficient Instruction Tuning

This research introduces SLAP, a novel batch-aware data selection framework designed to improve the data efficiency of instruction tuning for LLMs. SLAP optimizes learning by evaluating entire batch compositions, ensuring comprehensive data distribution coverage and maximizing intra-batch diversity to achieve lossless performance with reduced training costs.

RESEARCH · arXiv CS.CL · 7d ago

Adaptive Latent Agentic Reasoning

This research introduces Adaptive Latent Agentic Reasoning (ALAR), a dual-mode framework designed to enhance the efficiency of LLM agents. ALAR uses compact latent reasoning for routine tasks and escalates to explicit chain-of-thought when deeper deliberation is required, leading to comparable or better task accuracy with substantial efficiency gains.

RESEARCH · arXiv CS.AI · 13d ago

Discovery Agents for Real-Time Analytics: Toward Proactive Insight Systems

This research proposes a multi-agent architecture for autonomous insight discovery in real-time data streams, addressing the limitations of reactive analytics systems. It employs a continuous loop of hypothesis generation, analytics compilation, validation, and visualization, leveraging technologies like Kafka, Flink, and large language models.

RESEARCH · arXiv CS.CL · 14d ago

Cultural Value Alignment Via Latent Activation Steering in Large Language Models

This paper proposes a framework for evaluating and intervening in cultural value alignment within Large Language Models (LLMs), addressing their often homogenized cultural perspectives. It uses scenario-based behavioral probing and implicit token probabilities to map latent cultural values, and introduces activation steering to shift these alignments without retraining.

ARTICLE · DEV.to AI · 4/9/2026

Choosing Between GPT-5.4 and Claude Sonnet 4.6 in Real Workflows

The article compares the performance of GPT-5.4 and Claude Sonnet 4.6 in real workflows, noting that although the two models perform similarly on 80% of tasks, GPT-5.4 stands out in the 20% of situations that require multi-step reasoning, tool use, and structured outputs. The analysis emphasizes that in production environments, criteria such as consistency, speed, cost, and workflow fit matter more than correctness alone.
