research

78 items

RESEARCH · arXiv CS.LG · 5/1/2026

When Continual Learning Moves to Memory: A Study of Experience Reuse in LLM Agents

This study investigates the role of external memory in LLM agents for continual learning, showing that the stability-plasticity dilemma resurfaces at the memory level due to limited context windows. A (k,v) framework is introduced to disentangle how experience is represented and organized, finding that abstract procedural memories transfer more reliably than detailed trajectories and finer-grained memory organization is beneficial.

research memory AI agents Continual Learning

RESEARCH · arXiv CS.CL · 5/8/2026

The Cost of Context: Mitigating Textual Bias in Multimodal Retrieval-Augmented Generation

This paper identifies and formalizes

AI models research RAG MLLMs

RESEARCH · arXiv CS.LG · 5/8/2026

SAT: Sequential Agent Tuning for Coordinator Free Plug and Play Multi-LLM Training with Monotonic Improvement Guarantees

Sequential Agent Tuning (SAT) introduces a coordinator-free training paradigm for teams of smaller, more efficient LLMs, enabling scalable, decentralized updates. This framework provides theoretical guarantees for monotonic improvement by isolating occupancy drift with per-agent KL trust regions.

LLMs research AI training Distributed AI

RESEARCH · arXiv CS.CL · 22d ago

Exploring Lightweight Large Language Models for Court View Generation

The research explores the capabilities of lightweight Large Language Models (LLMs) in Criminal Court View Generation (CVG) and their impact on charge prediction within Legal AI. It systematically investigates architectural differences, model size, and comparison with Deep Neural Networks, introducing the CVGEvalKit framework for evaluation.

Legal AI research Court View Generation Natural Language Processing

RESEARCH · arXiv CS.AI · 18d ago

AOP-Wiki EMOD 3.0: Data Model Expansions and Content Evaluation Framework for Using Agentic AI to Improve Integration between AOPs and New Approach Methodologies (NAMs)

This paper introduces AOP-Wiki EMOD 3.0, focusing on data model expansions and a content evaluation framework. It leverages agentic AI to improve the integration between Adverse Outcome Pathways (AOPs) and New Approach Methodologies (NAMs), addressing current limitations in the AOP-Wiki's infrastructure to support continued growth.

Data Models research Toxicology New Approach Methodologies

RESEARCH · arXiv CS.AI · 5/11/2026

From Storage to Experience: A Survey on the Evolution of LLM Agent Memory Mechanisms

Large Language Model (LLM)-based agents have reshaped artificial intelligence, yet research on memory mechanisms remains fragmented. This survey proposes a novel evolutionary framework for LLM agent memory mechanisms, formalizing the development process into three stages: Storage, Reflection, and Experience.

Evolutionary framework LLM Agents research Memory mechanisms

RESEARCH · arXiv CS.AI · 23d ago

NOVA: Fundamental Limits of Knowledge Discovery Through AI

The NOVA framework models AI knowledge discovery as an adaptive sampling process, identifying conditions for genuine knowledge accumulation and common failure modes like contamination and forgetting. It highlights a "contamination trap" where invalid artifacts can accumulate faster than genuine discoveries as easy-to-find knowledge is exhausted, even with small false-positive rates.

research machine learning AI Knowledge Discovery

RESEARCH · arXiv CS.LG · 28d ago

Rotation-Preserving Supervised Fine-Tuning

This paper introduces Rotation-Preserving Supervised Fine-Tuning (RPSFT) to improve out-of-domain generalization in large language models while mitigating the degradation caused by standard SFT. RPSFT penalizes changes in projected singular subspaces of pretrained weights, acting as an efficient proxy for Fisher-sensitive directions and outperforming standard SFT baselines.

neural networks research machine learning Fine-tuning

RESEARCH · arXiv CS.AI · 21d ago

Position: Let's Develop Data Probes to Fundamentally Understand How Data Affects LLM Performance

This position paper advocates for developing systematic methodologies to generate synthetic sequences, termed 'data probes,' to fundamentally understand how data characteristics affect LLM performance across various stages. The aim is to move beyond current compute-intensive empirical approaches by providing a principled way to comprehend model behavior.

research machine learning data LLM

RESEARCH · arXiv CS.AI · 7d ago

Don't Gamble, GAMBLe: An Analytical Framework for AI-Driven Research Systems

This paper introduces GAMBLe, an analytical framework for AI-Driven Research Systems (ADRS). It decomposes ADRS behavior into four parameters and an effective landscape, showing how distinct generator-assessor pairs induce structurally different optimization landscapes.

LLMs research frameworks AI

RESEARCH · arXiv CS.LG · 15d ago

LLM-AutoSciLab: Closed-Loop Scientific Discovery via Active Experimentation with LLMs

LLM-AutoSciLab proposes a closed-loop framework for scientific discovery, moving beyond static inference by actively coupling hypothesis generation with experiment selection and mechanism refinement. It iteratively suggests plausible hypotheses, selects informative experiments to distinguish or refine them, and updates its state using the resulting evidence.

LLMs research active experimentation Scientific Discovery

RESEARCH · arXiv CS.LG · 16d ago

Latent Cache Flow: Model-to-Model Communication Without Text

Latent Cache Flow (LCF) is introduced as a new method for efficient model-to-model communication, addressing the latency and information loss of text-based LLM agent communication. LCF jointly translates and compresses keys and values, significantly reducing adapter size and transmitting a summary of new information for differing contexts.

research machine learning AI Communication

RESEARCH · arXiv CS.AI · 14d ago

Experiments in Agentic AI for Science

This paper introduces two novel frameworks for developing autonomous, agentic AI in scientific workflows, leveraging a hybrid Local Body, Remote Brain architecture with LLM cloud backends. The systems, DeepTS/DeepCollector and DeepScribe, automate time-series dataset curation and scientific presentation analysis, demonstrating how agentic AI can overcome context and reasoning limitations.

Scientific AI research LLM applications autonomous agents

DOC · DEV.to AI · 28d ago

Automate Your Literature Review: A Practical AI Pipeline for Researchers

This guide presents a practical AI pipeline that researchers can use to automate systematic literature reviews, emphasizing the creation of a "gold set" for robust AI training and testing. It also suggests using tools like PythonTutor for debugging data extraction functions.

research learning literature review AI tools

ARTICLE · DEV.to AI · 14d ago

AI for science is becoming a builder workflow, not a lab demo

The next valuable shift in AI focuses on helping people conduct better investigations, evolving from answering questions to supporting research workflows. Google's Gemini for Science exemplifies this, highlighting AI tools built around practical research processes. The approach is valuable not only for scientists but for anyone who needs to turn messy information into defensible results, since it encourages sharper questions and the testing of assumptions.

Workflows research Gemini for Science science

RESEARCH · DEV.to AI · 15d ago

Alibaba + Nanjing Univ Claim 9.36X Faster Million-Token Prefill vs FlashAttention-2

Researchers at Alibaba and Nanjing University claim a 9.36X speedup over FlashAttention-2 for million-token prefill in long-context LLM inference. The work targets the dominant latency bottleneck in processing large prompts, where attention computation typically scales quadratically with sequence length.

FlashAttention research AI performance

DOC · DEV.to AI · 4/25/2026

Automating Literature Reviews: An AI-Powered Guide for Niche Researchers

This guide covers automating literature reviews for researchers, addressing the bottleneck of manual PDF screening. It proposes an iterative refinement loop and introduces the open-source library GROBID for extracting structured data from academic documents.

GROBID research literature review AI application

ARTICLE · OpenAI Blog · 29d ago

What Parameter Golf taught us about AI-assisted research

Parameter Golf brought together over 1,000 participants and 2,000 submissions to explore AI-assisted machine learning research. The event focused on coding agents, quantization, and novel model design under strict constraints.

research machine learning quantization AI

RESEARCH · arXiv CS.CL · 4/6/2026

Speaking of Language: Reflections on Metalanguage Research in NLP

This work defines metalanguage and explores its connection to NLP and LLMs, discussing research efforts and the dimensions of metalinguistic tasks. It also proposes a list of understudied directions for future research.

LLMs research Metalanguage NLP

RESEARCH · arXiv CS.CL · 5/6/2026

Geometric Deviation as an Unsupervised Pre-Generation Reliability Signal: Probing LLM Representations for Answerability

This research explores using geometric deviation of LLM hidden states as a pre-generation signal to determine if a query is outside the model's knowledge, requiring no labeled failure data. Across various models and prompt forms, it finds that this signal effectively predicts unanswerable math prompts but not factual ones.

LLMs research Model Evaluation Reliability