large language models

265 items

RESEARCH · arXiv CS.AI · 4/30/2026

Operating-Layer Controls for Onchain Language-Model Agents Under Real Capital

This research investigates the reliability of autonomous language-model agents trading real ETH in an onchain market, drawing on a 21-day deployment that generated millions of invocations and $20M in volume. The deployment achieved 99.9% settlement success and yields a large-scale trace for analyzing the robustness of these systems beyond the base model.

Blockchain Finance Reliability large language models

RESEARCH · arXiv CS.CL · 4/14/2026

HumorGen: Cognitive Synergy for Humor Generation in Large Language Models via Persona-Based Distillation

This research introduces the Cognitive Synergy Framework to address the challenge of humor generation in LLMs, which conflicts with their standard next-word prediction objective. It utilizes a Mixture-of-Thought approach with six cognitive personas to synthesize diverse comedic perspectives, creating a theoretically grounded dataset used to fine-tune a 7B-parameter model that outperforms larger baselines.

Persona-Based AI Cognitive Synergy Framework Mixture-of-Thought large language models

RESEARCH · arXiv CS.CL · 4/30/2026

Information Extraction from Electricity Invoices with General-Purpose Large Language Models

This study evaluates general-purpose LLMs such as Gemini 1.5 Pro and Mistral-small for information extraction from Spanish electricity invoices, demonstrating that prompt quality matters more than hyperparameter tuning. It shows that few-shot strategies yield significantly better results than zero-shot approaches, with a performance gap exceeding 19 percentage points.

prompt-engineering Information Extraction benchmarking large language models

RESEARCH · arXiv CS.CL · 4/30/2026

CogRAG+: Cognitive-Level Guided Diagnosis and Remediation of Memory and Reasoning Deficiencies in Professional Exam QA

CogRAG+ is a training-free framework designed to diagnose and remediate memory and reasoning deficiencies in large language models for professional exam QA. It decouples and aligns retrieval and reasoning with human cognitive hierarchies, employing Reinforced Retrieval and cognition-stratified Constrained Reasoning to enhance accuracy and consistency.

Retrieval Augmented Generation Natural Language Processing AI Reasoning large language models

RESEARCH · arXiv CS.CL · 4/14/2026

Human vs. Machine Deception: Distinguishing AI-Generated and Human-Written Fake News Using Ensemble Learning

This study explores linguistic, structural, and emotional differences between AI-generated and human-written fake news. It evaluates machine learning and ensemble-based methods to distinguish these content types, using a detailed feature representation.

ensemble learning fake news large language models misinformation

RESEARCH · arXiv CS.CL · 4/17/2026

How to Fine-Tune a Reasoning Model? A Teacher-Student Cooperation Framework to Synthesize Student-Consistent SFT Data

This research proposes TESSY, a Teacher-Student Cooperation Data Synthesis framework, to address performance drops when fine-tuning reasoning models with teacher-generated data. TESSY enables the generation of synthetic sequences that inherit advanced reasoning from the teacher while maintaining stylistic consistency with the student model's distribution.

data synthesis machine learning code generation large language models

RESEARCH · arXiv CS.CL · 5/1/2026

Exploring the Limits of Pruning: Task-Specific Neurons, Model Collapse, and Recovery in Task-Specific Large Language Models

This study explores the existence of task-specific neurons in large language models, focusing on mathematical reasoning and code generation. It introduces an activation-based selectivity metric for neuron pruning, which consistently outperforms random pruning in reducing computational cost and preserving task accuracy, while preventing performance collapse.

Pruning AI optimization model collapse large language models

RESEARCH · arXiv CS.LG · 20d ago

LEAP: A closed-loop framework for perovskite precursor additive discovery

LEAP is a closed-loop framework combining a domain-specialized large language model (LLM) with active learning for iterative additive prioritization in perovskite solar cells. It extracts knowledge from literature and represents molecules for Bayesian optimization. The approach outperforms general-purpose models and has been validated experimentally.

material discovery AI in materials science perovskite solar cells large language models

DOC · DEV.to AI · 4/21/2026

Fine-Tuning a Model in 2026: A Step-by-Step Guide

Fine-tuning is a crucial step for adapting pre-trained models to specific tasks, improving performance and reducing training time. This guide defines fine-tuning, its benefits, and the difference between full and parameter-efficient fine-tuning, highlighting the role of pre-trained models.

machine learning pre-trained-models large language models fine-tuning

RESEARCH · arXiv CS.CL · 20d ago

Leveraging Large Language Models for Sentiment Analysis: Multi-Modal Analysis of Decentraland's MANA Token

This study combines sentiment analysis of Decentraland's Discord community, performed with a BERT-based large language model, with multi-modal financial data to predict the MANA token price. Results indicate that a multi-modal model incorporating sentiment, trading volume, and market capitalization significantly outperforms a price-only prediction baseline.

cryptocurrency Decentraland Price Prediction sentiment analysis

RESEARCH · arXiv CS.CL · 4/17/2026

Decoupling Scores and Text: The Politeness Principle in Peer Review

This study investigates the difficulty of interpreting peer review feedback, comparing the effectiveness of numerical scores versus text in predicting acceptance. The research reveals that score-based models are significantly more accurate (91%) than text-based models (81% even with LLMs), indicating textual information is considerably less reliable.

machine learning Natural Language Processing large language models peer review

RESEARCH · arXiv CS.CL · 4/17/2026

Can Large Language Models Detect Methodological Flaws? Evidence from Gesture Recognition for UAV-Based Rescue Operation Based on Deep Learning

This research investigates whether Large Language Models (LLMs) can identify methodological flaws, such as data leakage, in published machine learning studies. In a case study, six state-of-the-art LLMs consistently detected evaluation flaws caused by non-independent data partitioning in a gesture recognition paper.

deep learning machine learning large language models AI evaluation

RESEARCH · arXiv CS.LG · 4/24/2026

Forget, Then Recall: Learnable Compression and Selective Unfolding via Gist Sparse Attention

This paper introduces Gist Sparse Attention (GSA), an end-to-end learnable method to scale large language models to long contexts without architectural modifications. GSA compresses context into 'gist tokens' for summary, then selectively restores relevant raw chunks for detailed attention, combining compact global representations with targeted fine-grained access.

neural networks model efficiency Attention Mechanisms large language models

RESEARCH · arXiv CS.AI · 4/20/2026

Bilevel Optimization of Agent Skills via Monte Carlo Tree Search

This research introduces a bilevel optimization framework for systematically enhancing "agent skills" in large language model (LLM) agents. It uses an outer loop of Monte Carlo Tree Search to jointly optimize the structure and content of these skills, addressing a complex decision space for improved task performance.

Optimization Monte Carlo Tree Search large language models AI agents

RESEARCH · arXiv CS.CL · 4/20/2026

Applied Explainability for Large Language Models: A Comparative Study

This paper presents a comparative study of three explainability techniques (Integrated Gradients, Attention Rollout, and SHAP) on a fine-tuned DistilBERT model for sentiment classification. The study concludes that gradient-based attribution provides more stable and intuitive explanations, while attention-based methods are computationally efficient but less aligned with prediction-relevant features.

Comparative Study Natural Language Processing Explainable AI large language models

RESEARCH · arXiv CS.CL · 5/4/2026

ViLegalNLI: Natural Language Inference for Vietnamese Legal Texts

This article introduces ViLegalNLI, the first large-scale Vietnamese Natural Language Inference (NLI) dataset specifically constructed for the legal domain. It consists of 42,012 premise-hypothesis pairs derived from official statutory documents, developed using a semi-automatic framework that integrates large language models for hypothesis generation and quality validation.

Dataset Legal AI Natural Language Inference Vietnamese NLI

RESEARCH · arXiv CS.CL · 4/21/2026

Data Mixing for Large Language Models Pretraining: A Survey and Outlook

This paper provides a comprehensive survey on data mixing for Large Language Model (LLM) pretraining, a crucial factor for training efficiency and downstream generalization. It formalizes data mixture optimization as a bilevel problem and introduces a fine-grained taxonomy for existing methods.

data optimization pretraining machine learning large language models

RESEARCH · arXiv CS.LG · 4/24/2026

Absorber LLM: Harnessing Causal Synchronization for Test-Time Training

Transformers incur high computational and memory costs on long sequences, while alternatives lose long-tail dependencies. Absorber LLM proposes a self-supervised causal synchronization that absorbs historical context into the model parameters, so that a contextless model matches the original full-context model on future generations.

AI architecture Natural Language Processing Machine Learning Optimization large language models

RESEARCH · arXiv CS.LG · 22d ago

Reducing Credit Assignment Variance via Counterfactual Reasoning Paths

This research addresses poor credit assignment in reinforcement learning for multi-step reasoning with large language models, where sparse terminal rewards lead to high gradient variance and unstable training. It proposes a counterfactual comparison-based framework and Implicit Behavior Policy Optimization (IBPO) to create step-sensitive learning signals, significantly improving training stability and performance.

reinforcement learning AI Training Machine learning research large language models

RESEARCH · arXiv CS.CL · 26d ago

Distribution Corrected Offline Data Distillation for Large Language Models

This research proposes an offline reasoning distillation framework for Large Language Models (LLMs) to enhance intelligence in resource-constrained environments. It tackles the distributional drift issue in existing offline methods by correcting teacher-student discrepancies while preserving data efficiency and supervision quality.

Data Distillation Offline Distillation machine learning large language models