fine-tuning

60 items

RESEARCH · arXiv CS.CL · 4/20/2026

LLM attribution analysis across different fine-tuning strategies and model scales for automated code compliance

This paper analyzes the interpretive behaviors of LLMs for automated code compliance using perturbation-based attribution analysis, comparing different fine-tuning strategies and model scales. Results show full fine-tuning yields more focused attribution patterns, and larger models prioritize specific textual elements like numerical constraints.

model interpretability LLMs Machine learning research fine-tuning

DOC · Hugging Face Blog · 5d ago

How to Fine-Tune Nemotron 3.5 ASR for Your Language, Domain, or Accent

This guide explains how to fine-tune the Nemotron 3.5 Automatic Speech Recognition (ASR) model, adapting it to a specific language, domain, or accent to improve its performance there.

learning Nemotron 3.5 AI ASR

RESEARCH · arXiv CS.LG · 23d ago

TeamTR: Trust-Region Fine-Tuning for Multi-Agent LLM Coordination

This paper introduces TeamTR, a trust-region framework for fine-tuning multi-agent LLM systems that addresses structural failures in sequential fine-tuning. It proves that stale-occupancy evaluation incurs a penalty that grows quadratically with the number of agents, and it reports an average performance improvement of 7.1%.

Multi-agent LLMs LLM coordination Trust-region method fine-tuning

ARTICLE · DEV.to AI · 4/22/2026

Why LoRA? Understanding the representative PEFT

LoRA (Low-Rank Adaptation) is introduced as the leading PEFT method, enabling efficient adaptation of massive LLMs like Llama 3 without requiring extensive hardware resources. The post promises to delve into LoRA's mathematical intuition, the concept of "intrinsic dimension," and its game-changing impact for AI engineers.

LLMs deep learning fine-tuning PEFT

RESEARCH · arXiv CS.CL · 4/20/2026

Why Fine-Tuning Encourages Hallucinations and How to Fix It

Large language models often hallucinate facts, a problem exacerbated by supervised fine-tuning (SFT), which degrades pre-trained knowledge. This research proposes a self-distillation SFT method, inspired by continual learning, that mitigates hallucinations by regularizing output-distribution drift while still acquiring new factual information effectively.

hallucinations large language models fine-tuning Continual Learning

RESEARCH · arXiv CS.AI · 4/17/2026

GFT: From Imitation to Reward Fine-Tuning with Unbiased Group Advantages and Dynamic Coefficient Rectification

This work introduces Group Fine-Tuning (GFT), a unified post-training framework for large language models. It addresses intrinsic limitations of supervised fine-tuning (SFT), such as single-path dependency and entropy collapse, through Group Advantage Learning and Dynamic Coefficient Rectification.

LLMs reinforcement learning post-training machine learning

RESEARCH · arXiv CS.LG · 4/9/2026

TalkLoRA: Communication-Aware Mixture of Low-Rank Adaptation for Large Language Models

TalkLoRA proposes a MoELoRA framework that addresses the routing instability and expert dominance of existing methods by letting experts communicate before routing. It does this through a lightweight conversation module that facilitates information exchange among experts, producing a more robust routing signal for Large Language Models (LLMs).

LLMs MoE Communication fine-tuning

RESEARCH · arXiv CS.LG · 4/21/2026

Annotation Entropy Predicts Per-Example Learning Dynamics in LoRA Fine-Tuning

This research discovers that LoRA fine-tuning leads to 'un-learning' on contested examples, where high annotator disagreement correlates with increased loss during training. This pattern is distinct from full fine-tuning and consistently observed across multiple encoder and decoder-only models and datasets.

model training machine learning NLP fine-tuning

RESEARCH · arXiv CS.LG · 21d ago

HELLoRA: Hot Experts Layer-Level Low-Rank Adaptation for Mixture-of-Experts Models

HELLoRA proposes a novel method for fine-tuning Mixture-of-Experts (MoE) models by applying Low-Rank Adaptation (LoRA) modules only to the most frequently activated experts at each layer. This technique significantly reduces trainable parameters and improves downstream performance, attributing its success to structured regularization that maintains expert specialization.

LLMs MoE AI fine-tuning

ARTICLE · DEV.to AI · 4/26/2026

RAG vs Fine-tuning vs AI Agents: Which LLM Architecture to Choose in 2026?

This article analyzes the choice between RAG, fine-tuning, and AI agents for LLM projects, suggesting that a combination is often needed. It provides a practical guide on which architecture to prioritize based on project needs such as data source, actions, and budget.

RAG LLM architectures fine-tuning AI development

RESEARCH · arXiv CS.CL · 4/21/2026

QU-NLP at QIAS 2026: Multi-Stage QLoRA Fine-Tuning for Arabic Islamic Inheritance Reasoning

This paper presents QU-NLP's multi-stage QLoRA fine-tuning strategy for Arabic Islamic inheritance reasoning using Qwen3-4B. The model achieved a 90% MIR-E score, demonstrating competitive performance with minimal computational resources.

LLMs Legal AI Arabic AI NLP

DOC · Hugging Face Blog · 5/8/2026

MedQA: Fine-Tuning a Clinical AI on AMD ROCm — No CUDA Required

This post details the fine-tuning of a clinical AI model, MedQA, on the AMD ROCm platform. It highlights that the task can be done without CUDA, giving developers an alternative to NVIDIA-only tooling.

GPU hardware-compatibility fine-tuning medical AI

ARTICLE · DEV.to AI · 4/18/2026

I Thought Fine-Tuning Needed an ML Team. I Was Wrong.

This article highlights how user 'thumbs down' feedback provides invaluable real-world training data for AI systems, often overlooked by teams. It challenges the misconception that AI fine-tuning is always expensive and complex, proposing a simplified feedback loop suitable for product development.

User feedback fine-tuning AI development data collection

ARTICLE · DEV.to AI · 28d ago

Fine-tuning CLIP on a Niche Domain: How I Got +26pp Accuracy on Architectural Styles and What You Can Apply to Your Own Domain

This article details the process of fine-tuning OpenCLIP ViT-B/32 for architectural styles, achieving a +26 percentage point increase in accuracy. The author focuses on the critical decisions made before and after the training loop that were responsible for this significant result, rather than the training loop optimization itself.

CLIP Vision-Language Models machine learning computer vision

DOC · AWS Machine Learning Blog · 7d ago

The art and science of hyperparameter optimization on Amazon Nova Forge

This post covers hyperparameter optimization on Amazon Nova Forge, detailing how to improve domain-specific performance without degrading a model's general capabilities. It covers customization strategies, configuring training parameters such as learning rate and batch size, and avoiding common mistakes that lead to wasted training runs.

Amazon Nova Forge hyperparameter optimization learning model training

RESEARCH · DEV.to AI · 5/7/2026

Post‑training tricks cut LLM cost without losing ability

Recent work demonstrates that post-training tricks can significantly cut LLM cost and memory footprint without the performance drops that usually accompany such savings. The techniques include aligning synthetic data with a student's style and applying key-value (KV) cache optimizations.

Optimization cost reduction efficiency fine-tuning

RESEARCH · arXiv CS.LG · 4/15/2026

Disposition Distillation at Small Scale: A Three-Arc Negative Result

This paper details an attempt to distill behavioral dispositions into small language models (0.6B-2.3B parameters) through a distillation pipeline. Initially reported gains were later falsified and attributed to evaluation artifacts, yielding a negative result for the core hypothesis and leading to three subsequent arcs of investigation.

Negative Results Model Distillation Behavioral Dispositions large language models

RESEARCH · arXiv CS.LG · 4/28/2026

Parameter Efficiency Is Not Memory Efficiency: Rethinking Fine-Tuning for On-Device LLM Adaptation

This research challenges the assumption that Parameter-Efficient Fine-Tuning (PEFT) equates to memory efficiency for on-device LLMs, showing existing methods can still lead to out-of-memory errors. It introduces LARS (Low-memory Activation-Rank Subspace), a novel framework that decouples memory consumption from sequence length by constraining the activation subspace, achieving an average 33.54% memory footprint reduction.

Memory Optimization on-device AI fine-tuning PEFT

RESEARCH · arXiv CS.LG · 5/1/2026

Dynamic Adversarial Fine-Tuning Reorganizes Refusal Geometry

This research investigates the training-time mechanisms of refusal in safety-aligned language models, specifically comparing supervised fine-tuning with R2D2-style dynamic adversarial fine-tuning. Findings show R2D2 initially achieves strong refusal on HarmBench but then partially reopens, while SFT remains consistently less robust.

language models model robustness fine-tuning Adversarial Training

RESEARCH · arXiv CS.CL · 4/9/2026

LLM-Augmented Knowledge Base Construction For Root Cause Analysis

This study evaluates Large Language Model (LLM) methodologies (Fine-Tuning, RAG, and a Hybrid approach) for building a Root Cause Analysis (RCA) knowledge base from support tickets. Experiments on a real industrial dataset show that the generated knowledge base accelerates RCA tasks and improves network resilience.

RAG knowledge base fine-tuning LLM