deep learning

RESEARCH · arXiv CS.LG · 11d ago

Mechanistic origins of catastrophic forgetting: why RL preserves circuits better than SFT?

This paper investigates the mechanistic origins of catastrophic forgetting in Large Language Models (LLMs), comparing Reinforcement Learning (RL) with Supervised Fine-Tuning (SFT). It finds that RL preserves internal computational circuits more effectively and so mitigates the forgetting of prior capabilities, whereas SFT disrupts those circuits to a greater degree.

RESEARCH · arXiv CS.LG · 8d ago

Gait2Hip-60: A Unified Deep Learning Benchmark for Predicting Hip Muscle Forces and Joint Moments from Multi-Cadence Gait Kinematics

This study introduces Gait2Hip-60, a deep learning framework to predict hip muscle forces and joint moments directly from multi-cadence gait kinematics. It compares LSTM, Transformer, and Mamba models, evaluating their performance on healthy adults and an external cohort of patients.

RESEARCH · arXiv CS.LG · 6d ago

Geometry-Aware Tabular Diffusion

Geometry-Aware Tabular Diffusion (GATD) improves tabular synthesis by augmenting denoisers with pairwise angles and lengths computed from column value differences. It achieves state-of-the-art performance with fewer parameters, reduces Shape and Trend error, and shows that explicit relational supervision drives the gains.

RESEARCH · arXiv CS.LG · 8d ago

Unicorn: Scaling High-Dimensional Time Series Forecasting via Universal Correlation Modeling

Unicorn is a new framework for scalable, high-dimensional time series forecasting, bridging the gap between channel-independent and channel-dependent models. It leverages a latent prototype codebook to learn universal correlation patterns, significantly outperforming state-of-the-art architectures, especially in few-shot transfer scenarios.

RESEARCH · arXiv CS.LG · 15d ago

FuRA: Full-Rank Parameter-Efficient Fine-Tuning with Spectral Preconditioning

This research introduces FuRA (Full-Rank Adaptation), a novel parameter-efficient fine-tuning method that addresses limitations in existing techniques by incorporating spectral preconditioning. By reparameterizing weight matrices via full-rank Singular Value Decomposition and constraining updates, FuRA outperforms unconstrained Full Fine-Tuning while maintaining efficiency.

RESEARCH · arXiv CS.LG · 12d ago

A Simple State Space Model Excels at Multivariate Time Series Classification

This research systematically studies structured state space models (SSMs) for time-series classification, comparing complex Mamba-based architectures with simpler diagonal SSMs (S4D). Surprisingly, S4D consistently outperforms Mamba variants in accuracy and efficiency on large-scale benchmarks, challenging the assumption that increased model complexity leads to better performance in this domain.
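For readers unfamiliar with diagonal SSMs, the core recurrence can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function name `diagonal_ssm`, the parameters `a`, `b`, `c`, `dt`, the zero-order-hold discretization, and the choice to read out the real part are all assumptions made for the example.

```python
import cmath

def diagonal_ssm(u, a, b, c, dt=1.0):
    """Run a single-input, single-output diagonal SSM over the sequence u.

    a, b, c are equal-length lists of complex numbers (hypothetical
    parameters, each a[n] nonzero). Because the state matrix is diagonal,
    every state dimension evolves independently of the others.
    """
    # Zero-order-hold discretization, applied element-wise (an assumption).
    a_bar = [cmath.exp(dt * a_n) for a_n in a]
    b_bar = [(ab - 1) / a_n * b_n for ab, a_n, b_n in zip(a_bar, a, b)]

    x = [0j] * len(a)  # hidden state, one complex value per dimension
    y = []
    for u_k in u:
        # x_k = a_bar * x_{k-1} + b_bar * u_k, element-wise
        x = [ab * x_n + bb * u_k for ab, bb, x_n in zip(a_bar, b_bar, x)]
        # Output is the real part of the weighted sum of states.
        y.append(sum(c_n * x_n for c_n, x_n in zip(c, x)).real)
    return y
```

With a single decaying state, `diagonal_ssm([1.0, 0.0], a=[-1 + 0j], b=[1 + 0j], c=[1 + 0j])` returns an impulse response that shrinks by a factor of `exp(-1)` per step.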

RESEARCH · arXiv CS.LG · 12d ago

Comparative Analysis of Liquid Neural Networks and LSTM for Sequential Pattern Recognition: Robustness, Efficiency, and Clinical Utility

Liquid Neural Networks (LNNs) model hidden state evolution as a continuous differential equation, addressing the limitations of discrete-time RNNs and LSTMs in capturing fluid temporal dynamics. This paper benchmarks LNNs against LSTMs across four sequential modalities, revealing LNNs' superior parameter efficiency and robustness, especially in native temporal domains and clinical environments.
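The idea of a hidden state governed by a differential equation can be illustrated with a generic continuous-time recurrent cell. This is a simplified sketch, not the LNN formulation benchmarked in the paper: the equation `dh/dt = -h/tau + tanh(w_in*x + w_rec*h)`, the explicit-Euler solver, and all names and default values below are assumptions chosen for illustration.

```python
import math

def ctrnn_step(h, x, w_in, w_rec, tau, dt):
    """One explicit-Euler step of dh/dt = -h/tau + tanh(w_in*x + w_rec*h)."""
    dh = -h / tau + math.tanh(w_in * x + w_rec * h)
    return h + dt * dh

def run(xs, h0=0.0, w_in=1.0, w_rec=0.5, tau=1.0, dt=0.1, substeps=10):
    """Integrate the scalar hidden state across a sequence of inputs.

    Each input is held constant for `substeps` Euler steps of size `dt`,
    so the state evolves in (approximately) continuous time between samples.
    """
    h = h0
    states = []
    for x in xs:
        for _ in range(substeps):
            h = ctrnn_step(h, x, w_in, w_rec, tau, dt)
        states.append(h)
    return states
```

A discrete-time RNN would apply one update per input; here the step size `dt` and the number of substeps are free choices, which is what lets the same cell handle irregularly sampled sequences.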

RESEARCH · DEV.to AI · 4/27/2026

Review of Deep Learning

An in-depth review of deep learning that covers its fundamentals and recent advances, with an analysis of the techniques and applications within this field of artificial intelligence.
