← heapsort-ai

fine-tuning

60 items

RESEARCHarXiv CS.CL·4/17/2026

How to Fine-Tune a Reasoning Model? A Teacher-Student Cooperation Framework to Synthesize Student-Consistent SFT Data

This research proposes TESSY, a Teacher-Student Cooperation Data Synthesis framework, to address performance drops when fine-tuning reasoning models with teacher-generated data. TESSY enables the generation of synthetic sequences that inherit advanced reasoning from the teacher while maintaining stylistic consistency with the student model's distribution.

27
RESEARCHarXiv CS.LG·4/21/2026

Matched-Learning-Rate Analysis of Attention Drift and Transfer Retention in Fine-Tuned CLIP

This paper investigates how adaptation methods (Full FT vs. LoRA) and optimization scale jointly shape attention drift and transfer retention in fine-tuned CLIP models. A controlled matched-learning-rate comparison reveals that the learning rate strongly modulates structural change, with Full FT showing marked contraction at higher rates while LoRA remains entropy-positive.

27
RESEARCHarXiv CS.CL·4/21/2026

LiFT: Does Instruction Fine-Tuning Improve In-Context Learning for Longitudinal Modelling by Large Language Models?

LiFT is a new instruction fine-tuning framework designed to improve in-context learning for large language models on longitudinal NLP tasks, which require reasoning over temporally ordered text. It uses a curriculum that progressively increases temporal difficulty, incorporating few-shot structure and temporal conditioning, consistently outperforming base models across various datasets and parameter sizes.

27
RESEARCHarXiv CS.LG·28d ago

Rotation-Preserving Supervised Fine-Tuning

This paper introduces Rotation-Preserving Supervised Fine-Tuning (RPSFT) to improve out-of-domain generalization in large language models while mitigating the degradation caused by standard SFT. RPSFT penalizes changes in projected singular subspaces of pretrained weights, acting as an efficient proxy for Fisher-sensitive directions and outperforming standard SFT baselines.

27
RESEARCHarXiv CS.CL·27d ago

Domain Adaptation of Large Language Models for Polymer-Composite Additive Manufacturing Using Retrieval-Augmented Generation and Fine-Tuning

This study explores strategies for adapting general-purpose large language models (LLMs) to specialized engineering domains, specifically additive manufacturing, to enhance answer accuracy and relevance. It investigates the use of domain-specific fine-tuning and retrieval-augmented generation (RAG) by constructing a curated corpus for evaluation.

27
RESEARCHarXiv CS.LG·16d ago

FuRA: Full-Rank Parameter-Efficient Fine-Tuning with Spectral Preconditioning

This research introduces FuRA (Full-Rank Adaptation), a novel parameter-efficient fine-tuning method that addresses limitations in existing techniques by incorporating spectral preconditioning. By reparameterizing weight matrices via full-rank Singular Value Decomposition and constraining updates, FuRA outperforms unconstrained Full Fine-Tuning while maintaining efficiency.

27