machine learning

790 items

RESEARCH · arXiv CS.LG · 28d ago

Rotation-Preserving Supervised Fine-Tuning

This paper introduces Rotation-Preserving Supervised Fine-Tuning (RPSFT) to improve out-of-domain generalization in large language models while mitigating the degradation caused by standard SFT. RPSFT penalizes changes in projected singular subspaces of pretrained weights, acting as an efficient proxy for Fisher-sensitive directions and outperforming standard SFT baselines.

neural networks research machine learning fine-tuning

RESEARCH · arXiv CS.AI · 28d ago

A Cascaded Generative Approach for e-Commerce Recommendations

This research proposes a cascaded merchandising framework to enhance e-commerce storefront personalization, addressing the limitations of rigid, component-based systems. It decomposes storefront construction into generative tasks for theme and keyword creation, leveraging teacher-student fine-tuning for scalability.

personalization machine learning E-commerce Recommendation Systems

RESEARCH · arXiv CS.AI · 28d ago

The Many Faces of On-Policy Distillation: Pitfalls, Mechanisms, and Fixes

On-policy distillation (OPD) and on-policy self-distillation (OPSD) are promising post-training methods for large language models, but their effectiveness is inconsistent. This research empirically investigates where they succeed and fail, identifying sensitivity to the choice of teacher and problems with privileged information.

LLMs distillation learning machine learning

RESEARCH · arXiv CS.AI · 21d ago

Position: Let's Develop Data Probes to Fundamentally Understand How Data Affects LLM Performance

This position paper advocates for developing systematic methodologies to generate synthetic sequences, termed 'data probes,' to fundamentally understand how data characteristics affect LLM performance across various stages. The aim is to move beyond current compute-intensive empirical approaches by providing a principled way to comprehend model behavior.

research machine learning data LLM

RESEARCH · arXiv CS.AI · 21d ago

Embedding by Elicitation: Dynamic Representations for Bayesian Optimization of System Prompts

This paper introduces ReElicit, a Bayesian optimization framework based on "embedding by elicitation" for tuning system prompts in AI. It leverages LLMs to elicit an interpretable feature space and a Gaussian process surrogate to select and refine prompts based on aggregate feedback.

Bayesian Optimization Optimization System prompts machine learning

RESEARCH · arXiv CS.LG · 21d ago

Robust Basis Spline Decoupling for the Compression of Transformer Models

This work introduces a B-spline-based decoupling framework for compressing Transformer models. It generalizes existing tensor-based methods and addresses their numerical instability or limited expressiveness by exploiting the properties of B-splines.

neural networks machine learning AI compression

RESEARCH · arXiv CS.CL · 21d ago

ReacTOD: Bounded Neuro-Symbolic Agentic NLU for Zero-Shot Dialogue State Tracking

ReacTOD introduces a bounded neuro-symbolic architecture for task-oriented dialogue systems, reformulating NLU as discrete tool calls within a self-correcting ReAct loop. It improves accuracy by up to 9.3 percentage points and achieves a 93.1% self-correction rate on intercepted errors.

AI architecture machine learning dialogue systems Natural Language Understanding

RESEARCH · arXiv CS.AI · 12d ago

Orthogonal Concept Erasure for Diffusion Models

This research paper investigates the limitations of current concept erasure methods for mitigating undesired content in diffusion models. It identifies that additive parameter updates in editing-based methods cause entanglement between concept semantics and overall generative capacity, proposing a new solution to enhance precision and preservation.

Diffusion Models machine learning Concept Erasure AI safety

RESEARCH · arXiv CS.CL · 19d ago

Residual Skill Optimization for Text-to-SQL Ensembles

DivSkill-SQL introduces a residual skill optimization framework to build complementary Text-to-SQL ensembles, improving accuracy by targeting marginal contributions to Pass@K. It achieves significant accuracy gains on Spider2-Lite for Snowflake and BigQuery over existing ensemble baselines.

ensemble methods Text-to-SQL machine learning Natural Language Processing

DOC · DEV.to AI · 4/24/2026

Gradient Descent: How AI Learns

This article explains gradient descent, a fundamental learning algorithm in AI, using the analogy of a blindfolded person finding the lowest point in a hilly landscape. It describes how models iteratively adjust their weights to reduce a loss function that measures prediction error, akin to stepping downhill in the loss landscape.

ai-fundamentals machine learning Algorithms
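
The iterative weight update summarized in this item can be sketched in a few lines. This is a minimal illustration, not code from the linked article: the one-dimensional quadratic loss, the learning rate `lr`, and the names `gradient_descent`, `loss`, and `loss_grad` are all assumptions chosen for the example.

```python
def gradient_descent(grad, w0, lr=0.1, steps=100):
    """Repeatedly step against the gradient, i.e. "downhill" on the loss."""
    w = w0
    for _ in range(steps):
        w = w - lr * grad(w)  # move opposite to the slope at the current point
    return w


# Hypothetical loss with a single minimum at w = 3.0 (assumed for illustration).
def loss(w):
    return (w - 3.0) ** 2


# Its derivative: the slope the "blindfolded walker" feels underfoot.
def loss_grad(w):
    return 2.0 * (w - 3.0)


w_final = gradient_descent(loss_grad, w0=0.0)
print(w_final, loss(w_final))
```

With these assumed settings each step shrinks the distance to the minimum by a constant factor, so `w_final` ends up very close to 3.0. A learning rate that is too large for the loss would overshoot instead of converging.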

RESEARCH · arXiv CS.AI · 21d ago

KAN-MLP-Mixer: A comprehensive investigation of the usage of Kolmogorov-Arnold Networks (KANs) for improving IMU-based Human Activity Recognition

Kolmogorov-Arnold Networks (KANs) excel at learning complex functions on clean data but struggle with noisy, real-world datasets, whereas conventional MLPs are noise-tolerant and efficient. This paper proposes a hybrid KAN-MLP architecture for IMU-based Human Activity Recognition that combines KANs for input embedding, MLPs for intermediate feature mixing, and a specialized LarctanKAN for final classification.

neural networks deep learning machine learning Human Activity Recognition

RESEARCH · arXiv CS.LG · 12d ago

Balancing Multimodal Learning through Label Space Reshaping

The paper addresses modality imbalance in multimodal learning, where some modalities dominate optimization while others remain undertrained. It argues that this imbalance stems from differences in how hard it is to map each modality-specific feature space to the shared label space, and it introduces BMLR to equalize that difficulty.

multimodal learning Optimization learning machine learning

RESEARCH · arXiv CS.LG · 12d ago

Continuity and Ordinality Matter: Constraining Time Series Tokens for Effective Time Series Analysis with Large Language Models

This paper introduces COM (Continuity and Ordinality Matter), a strategy that integrates geometric constraints into both the initialization and training stages of token-based time series large language models (TS-LLMs). The research demonstrates that preserving continuity and ordinality in time series token embeddings significantly improves model performance and generalizability.

machine learning Tokenization large language models Time Series Analysis

RESEARCH · arXiv CS.LG · 12d ago

Emergent Semantic Representations in World Models through Physical Interaction without Linguistic Supervision

This research explores how world models learn semantic representations from physical exploration without linguistic input. It finds that the latent space develops spatial semantic structures mirroring physical geometry, with semantic alignment improving alongside prediction performance.

machine learning World Models embodied AI representation learning

RESEARCH · arXiv CS.LG · 12d ago

PrismFlow: Residual Dynamics for Flow Matching in Time-Series Generation

This research proposes PrismFlow, a new Flow Matching method for high-quality time-series data generation. It tackles the issue of overly smoothed approximations in single vector-field estimators by introducing Koopman-inspired dynamical experts that learn residual corrections.

Koopman Theory machine learning Flow Matching Time-series Generation

RESEARCH · arXiv CS.LG · 12d ago

Mechanistic origins of catastrophic forgetting: why RL preserves circuits better than SFT?

This paper investigates the mechanistic origins of catastrophic forgetting in Large Language Models (LLMs), comparing Reinforcement Learning (RL) with Supervised Fine-Tuning (SFT). It finds that RL preserves internal computational circuits more effectively and so mitigates the forgetting of prior capabilities, whereas SFT causes greater circuit disruption.

LLMs deep learning machine learning Catastrophic Forgetting

RESEARCH · arXiv CS.LG · 12d ago

One Mask to Rule Them All: On Hidden Facts after Editing and How to Find Them

This paper investigates the internal mechanisms of knowledge editing methods such as ROME and MEMIT, revealing that diverse edits share a common functional structure reliant on a specific subset of weights. A binary mask over these edited weights reverses most changes by eliminating overattention in later layers, demonstrating this mechanism's necessity for successful edits.

AI models MLP Weights machine learning Transformer Models

RESEARCH · DEV.to AI · 4/26/2026

Scalable Recommendation with Poisson Factorization

This research explores a method for building scalable recommendation systems using Poisson Factorization. It focuses on optimizing the factorization process to handle large datasets efficiently.

machine learning factorization methods Recommendation Systems
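
For readers unfamiliar with the model, the sketch below shows the standard Poisson factorization likelihood, in which each count is modeled as Poisson with rate equal to the dot product of a user factor and an item factor. It is an illustration under stated assumptions, not the method from the linked item: the function name `poisson_log_likelihood`, the sparse dictionary layout, and the toy factors are hypothetical. It relies on one well-known property: the rate term summed over every user-item pair factorizes, so only the nonzero observations need to be visited one by one.

```python
import math


def poisson_log_likelihood(observed, theta, beta):
    """Log-likelihood of counts y_ui ~ Poisson(theta_u . beta_i).

    observed: dict mapping (user, item) -> positive count; all other pairs are 0.
    theta, beta: lists of K-dimensional positive factor vectors.
    """
    k = len(theta[0])
    # Sum of rates over ALL pairs equals (sum_u theta_u) . (sum_i beta_i),
    # so the zero entries never have to be enumerated.
    theta_sum = [sum(t[j] for t in theta) for j in range(k)]
    beta_sum = [sum(b[j] for b in beta) for j in range(k)]
    ll = -sum(ts * bs for ts, bs in zip(theta_sum, beta_sum))
    # Only nonzero counts contribute the y * log(rate) - log(y!) terms.
    for (u, i), y in observed.items():
        rate = sum(a * b for a, b in zip(theta[u], beta[i]))
        ll += y * math.log(rate) - math.lgamma(y + 1)
    return ll


# Toy data (assumed): 2 users, 3 items, K = 2 latent factors.
theta = [[1.0, 0.5], [0.2, 0.3]]
beta = [[0.4, 0.1], [0.6, 0.9], [0.3, 0.3]]
observed = {(0, 1): 2, (1, 2): 1}
ll = poisson_log_likelihood(observed, theta, beta)
print(ll)
```

The cost of the loop grows with the number of nonzero interactions rather than with users times items, which is one reason this model is often described as suited to sparse, large-scale data.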

RESEARCH · arXiv CS.CL · 15d ago

EchoDistill: Alignment Noisy-to-Clean Self-Distillation for Robust Audio LLMs

EchoDistill is an alignment-based self-distillation framework designed to make Audio Large Language Models (ALLMs) robust to real-world noise. It leverages a frozen clean-audio teacher to guide an inference-time noisy-audio student, optimizing responses via group-relative policy optimization and token-level consistency.

robustness Audio LLMs machine learning Self-Distillation

RESEARCH · arXiv CS.LG · 15d ago

Truthful Online Preference Aggregation for LLM Fine-Tuning in Mobile Crowdsourcing

This paper investigates truthful online preference aggregation for fine-tuning Large Language Models (LLMs) in mobile crowdsourcing. It proposes an online weighted aggregation mechanism that addresses strategic misreporting by workers, modeling the process as a dynamic Bayesian game. The goal is to overcome the shortcomings of existing approaches, which fail to identify the most accurate worker and incur linear regret.

Preference Aggregation machine learning game theory Crowdsourcing