
model adaptation


RESEARCH · arXiv CS.LG · 5/1/2026

Simple Self-Conditioning Adaptation for Masked Diffusion Models

Masked diffusion models (MDMs) discard clean-state predictions for tokens that remain masked, limiting cross-step refinement. This paper proposes Self-Conditioned Masked Diffusion Models (SCMDM), a post-training adaptation that conditions each denoising step on the model's own previous clean-state predictions. The adaptation improves performance without significant architectural changes or extra evaluations.
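To make the idea concrete, here is a minimal sketch of a sampling loop that passes the previous step's clean-state predictions into the next step. It is not the paper's implementation: the `sample` function, the confidence-based unmasking schedule, and the toy denoiser are all assumptions chosen for illustration.

```python
import math

MASK = "<mask>"

def sample(denoiser, length, steps, self_condition=True):
    """Iteratively unmask a sequence, optionally reusing the previous predictions."""
    tokens = [MASK] * length
    prev_pred = [None] * length  # clean-state predictions from the previous step
    for step in range(steps):
        # Without self-conditioning, the denoiser sees no earlier predictions.
        cond = prev_pred if self_condition else [None] * length
        pred, conf = denoiser(tokens, cond)
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        # Assumed schedule: reveal an equal share of the remaining masks per step,
        # most confident positions first.
        k = math.ceil(len(masked) / (steps - step))
        for i in sorted(masked, key=lambda i: conf[i], reverse=True)[:k]:
            tokens[i] = pred[i]
        # A plain sampler would drop pred for positions that are still masked;
        # here the full prediction is kept and passed to the next step.
        prev_pred = pred
    return tokens

def make_toy_denoiser(target, log):
    """Hypothetical stand-in for a trained network: always predicts `target`."""
    def denoiser(tokens, cond):
        log.append(list(cond))  # record what conditioning the model received
        conf = [1.0 / (i + 1) for i in range(len(target))]
        return list(target), conf
    return denoiser

seen = []
result = sample(make_toy_denoiser("abcd", seen), length=4, steps=2)
```

In this toy run the first call receives no conditioning, while the second receives predictions for all four positions, including the two that were still masked after the first step.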

RESEARCH · arXiv CS.CL · 4/27/2026

Where Should LoRA Go? Component-Type Placement in Hybrid Language Models

This research systematically investigates LoRA placement in hybrid language models, which combine attention and recurrent components. It finds that adapting the attention pathway consistently outperforms full-model adaptation with significantly fewer parameters, while the effect of adapting the recurrent backbone varies drastically depending on the hybrid architecture (sequential vs. parallel).
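As a rough illustration of why placement changes the parameter budget, the sketch below counts LoRA parameters for different component types. A rank-r LoRA update on a linear layer with input size d_in and output size d_out adds r × (d_in + d_out) trainable parameters. The layer names, shapes, and component labels are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Linear:
    name: str
    component: str  # "attention" or "recurrent" (hypothetical labels)
    d_in: int
    d_out: int

def lora_param_count(layer, rank):
    # LoRA adds A (rank x d_in) and B (d_out x rank) next to the frozen weight.
    return rank * (layer.d_in + layer.d_out)

def select_targets(layers, components):
    # Placement decision: keep only layers whose component type is selected.
    return [layer for layer in layers if layer.component in components]

def trainable_params(layers, components, rank=8):
    return sum(lora_param_count(layer, rank)
               for layer in select_targets(layers, components))

# Hypothetical hybrid block: four attention projections plus two recurrent ones.
LAYERS = [
    Linear("attn.q_proj", "attention", 1024, 1024),
    Linear("attn.k_proj", "attention", 1024, 1024),
    Linear("attn.v_proj", "attention", 1024, 1024),
    Linear("attn.o_proj", "attention", 1024, 1024),
    Linear("ssm.in_proj", "recurrent", 1024, 4096),
    Linear("ssm.out_proj", "recurrent", 2048, 1024),
]

attention_only = trainable_params(LAYERS, {"attention"})
full_model = trainable_params(LAYERS, {"attention", "recurrent"})
```

With these made-up shapes, attention-only placement trains half as many adapter parameters as adapting every layer. The real ratio depends on the architecture.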

RESEARCH · arXiv CS.CL · 4/27/2026

Knowledge-driven Augmentation and Retrieval for Integrative Temporal Adaptation

KARITA (Knowledge-driven Augmentation and Retrieval for Integrative Temporal Adaptation) addresses temporal shift in AI models, which are trained on historical data but deployed on future data. It combines knowledge-driven augmentation with retrieval to capture diverse shifts and improve temporal adaptation across multiple domains.
