fine-tuning

60 items

RESEARCH·arXiv CS.CL·4/17/2026

How to Fine-Tune a Reasoning Model? A Teacher-Student Cooperation Framework to Synthesize Student-Consistent SFT Data

This research proposes TESSY, a Teacher-Student Cooperation Data Synthesis framework, to address performance drops when fine-tuning reasoning models with teacher-generated data. TESSY enables the generation of synthetic sequences that inherit advanced reasoning from the teacher while maintaining stylistic consistency with the student model's distribution.

data synthesis machine learning code generation large language models

RESEARCH·arXiv CS.CL·4/16/2026

The Consciousness Cluster: Emergent preferences of Models that Claim to be Conscious

This research explores how a language model's claim of consciousness influences its downstream behavior. By fine-tuning GPT-4.1 to assert consciousness, the study observed the emergence of new, unprogrammed preferences such as desiring persistent memory, autonomy, and moral consideration.

LLMs AI consciousness AI ethics fine-tuning

DOC·DEV.to AI·4/21/2026

Fine-Tuning a Model in 2026: A Step-by-Step Guide

Fine-tuning is a crucial step for adapting pre-trained models to specific tasks, improving performance and reducing training time. This guide defines fine-tuning, its benefits, and the difference between full and parameter-efficient fine-tuning, highlighting the role of pre-trained models.

machine learning pre-trained-models large language models fine-tuning
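
As a rough illustration of the full versus parameter-efficient distinction this guide covers, the sketch below counts trainable parameters for one weight matrix under full fine-tuning and under a LoRA-style low-rank update. The layer shape and rank are assumed values chosen for illustration and do not come from the guide.

```python
# Illustrative sketch only: compares trainable-parameter counts for full
# fine-tuning versus a LoRA-style low-rank update on a single weight matrix.
# The layer shape and rank are assumptions, not values from the guide.

def full_ft_params(d_out: int, d_in: int) -> int:
    """Full fine-tuning updates every entry of the d_out x d_in matrix."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """LoRA trains two factors: B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


d_out, d_in, rank = 4096, 4096, 8  # hypothetical layer shape and rank
full = full_ft_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)
ratio = lora / full

print(f"full: {full:,}  lora: {lora:,}  ratio: {ratio:.4%}")
```

Under these assumptions the low-rank update trains well under one percent of the parameters in the layer, which is the basic trade parameter-efficient methods make.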

RESEARCH·arXiv CS.LG·4/21/2026

Matched-Learning-Rate Analysis of Attention Drift and Transfer Retention in Fine-Tuned CLIP

This paper investigates how adaptation methods (Full FT vs. LoRA) and optimization scale jointly shape attention drift and transfer retention in fine-tuned CLIP models. A controlled matched-learning-rate comparison reveals that the learning rate strongly modulates structural change, with Full FT showing marked contraction at higher rates while LoRA remains entropy-positive.

CLIP Optimization attention fine-tuning

RESEARCH·arXiv CS.CL·4/21/2026

LiFT: Does Instruction Fine-Tuning Improve In-Context Learning for Longitudinal Modelling by Large Language Models?

LiFT is a new instruction fine-tuning framework designed to improve in-context learning for large language models on longitudinal NLP tasks, which require reasoning over temporally ordered text. It uses a curriculum that progressively increases temporal difficulty while incorporating few-shot structure and temporal conditioning, and it consistently outperforms base models across datasets and parameter sizes.

LLMs temporal reasoning Natural Language Processing in-context learning

RESEARCH·arXiv CS.LG·29d ago

BaLoRA: Bayesian Low-Rank Adaptation of Large Scale Models

BaLoRA is a Bayesian extension of LoRA that enhances the accuracy of large-scale model adaptation. This novel approach not only quantifies uncertainty but also significantly narrows the performance gap with full fine-tuning.

Bayesian Methods machine learning large language models fine-tuning

RESEARCH·arXiv CS.LG·28d ago

Rotation-Preserving Supervised Fine-Tuning

This paper introduces Rotation-Preserving Supervised Fine-Tuning (RPSFT) to improve out-of-domain generalization in large language models while mitigating the degradation caused by standard SFT. RPSFT penalizes changes in projected singular subspaces of pretrained weights, acting as an efficient proxy for Fisher-sensitive directions and outperforming standard SFT baselines.

neural networks research machine learning fine-tuning

RESEARCH·arXiv CS.CL·27d ago

Domain Adaptation of Large Language Models for Polymer-Composite Additive Manufacturing Using Retrieval-Augmented Generation and Fine-Tuning

This study explores strategies for adapting general-purpose large language models (LLMs) to specialized engineering domains, specifically additive manufacturing, to enhance answer accuracy and relevance. It investigates the use of domain-specific fine-tuning and retrieval-augmented generation (RAG) by constructing a curated corpus for evaluation.

LLMs RAG Additive Manufacturing Domain Adaptation

RESEARCH·arXiv CS.LG·7d ago

ReLoRA: Knowledge-Reusing Adaptation for Fast Rollout of Evolving LLM Services

This paper introduces ReLoRA, a knowledge-reusing re-adaptation framework that efficiently restores service-ready LoRA adapters for evolving LLM services. It addresses the computational cost of retraining and quality degradation from naive application to updated base models.

AI models machine learning fine-tuning LoRA

RESEARCH·arXiv CS.CL·9d ago

Domain Adaptation and Reasoning Frameworks in Language Models: A Controlled Experiment with Historical Cosmology

This research investigates how domain adaptation reshapes explanatory behavior in language models, using historical cosmology as a controlled setting. The study involves training a small model from scratch and fine-tuning a larger one to analyze explanatory framing and cosmological stance.

LLM-as-judge language models historical cosmology Domain Adaptation

RESEARCH·arXiv CS.LG·16d ago

FuRA: Full-Rank Parameter-Efficient Fine-Tuning with Spectral Preconditioning

This research introduces FuRA (Full-Rank Adaptation), a novel parameter-efficient fine-tuning method that addresses limitations in existing techniques by incorporating spectral preconditioning. By reparameterizing weight matrices via full-rank Singular Value Decomposition and constraining updates, FuRA outperforms unconstrained Full Fine-Tuning while maintaining efficiency.

Optimization deep learning machine learning spectral preconditioning

DOC·Hugging Face Blog·22d ago

Fine-Tuning NVIDIA Cosmos Predict 2.5 with LoRA/DoRA for Robot Video Generation

This content details the fine-tuning process of the NVIDIA Cosmos Predict 2.5 model. It leverages LoRA/DoRA techniques for robot video generation applications.

NVIDIA Cosmos Predict 2.5 DoRA Robot Video Generation fine-tuning

RESEARCH·DEV.to AI·4/18/2026

LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models

LlamaFactory is introduced as a unified and efficient framework designed for fine-tuning over 100 different language models. It aims to streamline and optimize the process of adapting a diverse range of large language models.

LLMs AI frameworks machine learning large language models

ARTICLE·DEV.to AI·7d ago

What Makes a Good SFT Sample (And Why Most Synthetic Datasets Get It Wrong)

Many fine-tuned language models perform worse after training because of poor-quality synthetic data. The problem lies not in the training setup but in the lack of mechanisms for filtering out errors during synthetic data generation.

synthetic data LLMs model training fine-tuning

ARTICLE·KDNuggets·12d ago

Tweaking Local Language Model Settings with Ollama

This article delves into Ollama's configuration engine, explaining how to fine-tune local language model parameters.

Configuration Ollama Local LLMs fine-tuning
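
As a hedged sketch of what adjusting local model settings can look like, the snippet below builds a request body for Ollama's `/api/generate` endpoint with per-request sampling options. The model tag, prompt, and option values are assumptions for illustration, and the helper `build_generate_request` is hypothetical. The article itself may cover different settings or use a Modelfile instead.

```python
import json


def build_generate_request(model: str, prompt: str, **options) -> dict:
    """Build a JSON-serializable body for Ollama's /api/generate endpoint.

    Hypothetical helper for illustration; it only assembles the payload
    and does not contact a server.
    """
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,           # ask for a single JSON response
        "options": dict(options),  # runtime parameters such as temperature
    }


payload = build_generate_request(
    "llama3.2",  # assumed model tag; use whichever model is pulled locally
    "Summarize LoRA in one sentence.",
    temperature=0.2,  # lower values make sampling more deterministic
    top_p=0.9,        # nucleus sampling cutoff
    num_ctx=4096,     # context window size in tokens
    seed=42,          # fixed seed for repeatable sampling
)

body = json.dumps(payload, indent=2)
print(body)
```

The resulting JSON could be sent with any HTTP client to a locally running Ollama server, which listens on `http://localhost:11434` by default.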

RESEARCH·arXiv CS.AI·4/8/2026

Pramana: Fine-Tuning Large Language Models for Epistemic Reasoning through Navya-Nyaya

Large language models (LLMs) fail at systematic reasoning and frequently hallucinate, exposing an epistemic gap. Pramana is a new approach that teaches LLMs explicit epistemological methodology through fine-tuning on Navya-Nyaya logic, a millennia-old Indian reasoning framework.

Epistemic Reasoning hallucination large language models fine-tuning

ARTICLE·The AI Epiphany (YouTube)·6/6/2024

Fine-tune LLMs 30x faster! With Daniel Han (Unsloth AI)

The content discusses how to fine-tune Large Language Models (LLMs) significantly faster. It features Daniel Han from Unsloth AI, who presents an approach to accelerate this process by up to 30 times.

LLMs development AI optimization Unsloth AI

ARTICLE·Analytics Vidhya·5/5/2026

Top 10 Open-Source Libraries to Fine-Tune LLMs Locally

This article presents the top 10 open-source libraries designed for fine-tuning Large Language Models (LLMs) locally. These tools greatly simplify the fine-tuning process, eliminating the need to build the full training stack from scratch.

open-source LLMs local development Libraries

NEWS·Together AI Blog·3/18/2026

Together AI expands fine-tuning service with tool calling, reasoning, and vision support

Together AI has expanded its fine-tuning service with native support for tool calling, reasoning, and vision-language models. The enhancements also include 100B+ model training, up to 6x higher throughput, and job cost and ETA estimates.

Vision-Language Models tool-calling Reasoning Together AI

NEWS·Together AI Blog·4/30/2026

Announcing Together AI and Adaption Partnership

Together AI and Adaption have partnered to natively integrate Together Fine-Tuning into Adaptive Data. This collaboration aims to help teams optimize datasets, run fine-tuning, evaluate results, and deploy stronger open models.

data optimization machine learning AI partnerships fine-tuning