machine learning

790 items

RESEARCH · arXiv CS.LG · 28d ago

Rotation-Preserving Supervised Fine-Tuning

This paper introduces Rotation-Preserving Supervised Fine-Tuning (RPSFT) to improve out-of-domain generalization in large language models while mitigating the degradation caused by standard SFT. RPSFT penalizes changes in projected singular subspaces of pretrained weights, acting as an efficient proxy for Fisher-sensitive directions and outperforming standard SFT baselines.

27
RESEARCH · arXiv CS.AI · 21d ago

Position: Let's Develop Data Probes to Fundamentally Understand How Data Affects LLM Performance

This position paper advocates for developing systematic methodologies to generate synthetic sequences, termed 'data probes,' to fundamentally understand how data characteristics affect LLM performance across various stages. The aim is to move beyond current compute-intensive empirical approaches by providing a principled way to comprehend model behavior.

27
DOC · DEV.to AI · 4/24/2026

Gradient Descent: How AI Learns

This article explains gradient descent, a fundamental learning algorithm in AI, using the analogy of a blindfolded person searching for the lowest point in a hilly landscape. It describes how a model iteratively adjusts its weights to reduce a loss function that measures prediction error, with each update acting as a step downhill in the loss landscape.
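
The downhill-stepping idea in this summary can be sketched in a few lines of Python. This is a minimal illustration, not code from the linked article: the quadratic loss, the function names `gradient_descent` and `loss_grad`, the learning rate, and the step count are all assumptions chosen for the example.

```python
def gradient_descent(grad, w0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from w0."""
    w = w0
    for _ in range(steps):
        # Move a small distance in the direction that lowers the loss.
        w = w - learning_rate * grad(w)
    return w


# Hypothetical one-parameter loss L(w) = (w - 3)^2, whose minimum is at w = 3.
def loss_grad(w):
    # Derivative of (w - 3)^2 with respect to w.
    return 2.0 * (w - 3.0)


w_final = gradient_descent(loss_grad, w0=0.0)
print(w_final)  # a value very close to 3.0
```

With this learning rate, each step shrinks the distance to the minimum by a constant factor, so the weight settles near 3. A learning rate that is too large for the loss (here, above 1.0) would make the steps overshoot and diverge instead.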

27
RESEARCH · arXiv CS.AI · 21d ago

KAN-MLP-Mixer: A comprehensive investigation of the usage of Kolmogorov-Arnold Networks (KANs) for improving IMU-based Human Activity Recognition

Kolmogorov-Arnold Networks (KANs) excel at learning complex functions on clean data but struggle with noisy, real-world datasets, unlike conventional MLPs which are noise-tolerant and efficient. This paper proposes a hybrid KAN-MLP architecture for IMU-based Human Activity Recognition, strategically combining KANs for input embedding, MLPs for intermediate feature mixing, and a specialized LarctanKAN for final classification.

27
RESEARCH · arXiv CS.LG · 12d ago

Continuity and Ordinality Matter: Constraining Time Series Tokens for Effective Time Series Analysis with Large Language Models

This paper introduces COM (Continuity and Ordinality Matter), a strategy that integrates geometric constraints into both the initialization and training stages of token-based time series large language models (TS-LLMs). The research demonstrates that preserving continuity and ordinality in time series token embeddings significantly improves model performance and generalizability.

27
RESEARCH · arXiv CS.LG · 12d ago

Mechanistic origins of catastrophic forgetting: why RL preserves circuits better than SFT?

This paper investigates the mechanistic origins of catastrophic forgetting in Large Language Models (LLMs), comparing Reinforcement Learning (RL) with Supervised Fine-Tuning (SFT). It finds that RL preserves internal computational circuits more effectively, which mitigates the forgetting of prior capabilities, whereas SFT causes greater circuit disruption.

27
RESEARCH · arXiv CS.LG · 12d ago

One Mask to Rule Them All: On Hidden Facts after Editing and How to Find Them

This paper investigates the internal mechanisms of knowledge editing methods such as ROME and MEMIT, revealing that diverse edits share a common functional structure reliant on a specific subset of weights. A binary mask over these edited weights reverses most changes by eliminating overattention in later layers, demonstrating this mechanism's necessity for successful edits.

27
RESEARCH · arXiv CS.LG · 15d ago

Truthful Online Preference Aggregation for LLM Fine-Tuning in Mobile Crowdsourcing

This paper investigates truthful online preference aggregation for fine-tuning Large Language Models (LLMs) in mobile crowdsourcing. It models the process as a dynamic Bayesian game and proposes a novel online weighted aggregation mechanism to address strategic misreporting by workers. The mechanism targets a weakness of existing approaches, which fail to identify the most accurate worker and incur linear regret.

27