
Large Language Models

264 items

RESEARCH · arXiv cs.CL · 4/20/2026

Why Fine-Tuning Encourages Hallucinations and How to Fix It

Large language models often hallucinate facts, a problem exacerbated by supervised fine-tuning (SFT), which degrades pre-trained knowledge. The paper proposes a self-distillation SFT method, inspired by continual learning, that mitigates hallucinations by regularizing output-distribution drift while still acquiring new factual information.

RESEARCH · arXiv cs.AI · 4/16/2026

ReSS: Learning Reasoning Models for Tabular Data Prediction via Symbolic Scaffold

ReSS is a framework that bridges symbolic and neural reasoning models for tabular data prediction, aiming for both high accuracy and understandable reasoning. It leverages decision trees to extract symbolic scaffolds that guide an LLM to generate natural-language reasoning, which is then used to fine-tune specialized tabular reasoning LLMs.

RESEARCH · arXiv cs.AI · 4/13/2026

StaRPO: Stability-Augmented Reinforcement Policy Optimization

StaRPO is a reinforcement learning framework designed to improve the logical consistency and structural coherence of large language models in complex reasoning tasks. It explicitly incorporates stability metrics, such as the Autocorrelation Function and Path Efficiency, to evaluate the local step-to-step coherence and global goal-directedness of the reasoning process.

RESEARCH · arXiv cs.CL · 4/7/2026

SoLA: Leveraging Soft Activation Sparsity and Low-Rank Decomposition for Large Language Model Compression

SoLA is a new training-free compression method for LLMs that uses soft activation sparsity and low-rank decomposition. It identifies the components crucial for inference and compresses the remaining majority, aiming to reduce the parameter count of large language models in an efficient and accessible way.

RESEARCH · arXiv cs.AI · 5/1/2026

Think it, Run it: Autonomous ML pipeline generation via self-healing multi-agent AI

This paper proposes a unified multi-agent AI architecture to automate end-to-end machine learning (ML) pipeline generation from datasets and natural-language goals. The five-agent system integrates RAG, an explainable hybrid recommender, and an LLM-based self-healing mechanism, achieving an 84.7% success rate and improved robustness.
