
deep learning

263 items

RESEARCH · arXiv CS.LG · 8d ago

DAStatFormer: A Hybrid Multibranch Transformer with Statistical Feature Integration for DAS-Based Pattern Recognitions

DAStatFormer is a hybrid multibranch Transformer proposed to overcome the challenges of high dimensionality and complex spatio-temporal patterns in Distributed Acoustic Sensing (DAS). It integrates compact statistical features from multiple domains, significantly reducing data size and enhancing event classification.

28
RESEARCH · arXiv CS.LG · 6d ago

Self-Distilled Policy Gradient

This paper introduces Self-Distilled Policy Gradient (SDPG), a novel framework that enhances sparse-reward reinforcement learning through on-policy self-distillation. SDPG integrates group-relative verifier advantages, exact full-vocabulary self-distillation, and KL regularization, demonstrating improved stability and performance over existing baselines.

28
ARTICLE · DEV.to AI · 4/22/2026

Why LoRA? Understanding the representative PEFT

LoRA (Low-Rank Adaptation) is introduced as the representative PEFT method, enabling efficient adaptation of massive LLMs like Llama 3 without requiring extensive hardware resources. The post covers LoRA's mathematical intuition, the concept of "intrinsic dimension," and its practical impact for AI engineers.

27
RESEARCH · arXiv CS.LG · 4/20/2026

Lightweight Geometric Adaptation for Training Physics-Informed Neural Networks

Physics-Informed Neural Networks (PINNs) often suffer from slow convergence and instability due to complex loss landscapes. This paper proposes a lightweight, curvature-aware optimization framework that augments existing first-order optimizers to improve convergence speed, training stability, and solution accuracy on partial differential equations (PDEs).

27
RESEARCH · arXiv CS.AI · 4/25/2026

HypEHR: Hyperbolic Modeling of Electronic Health Records for Efficient Question Answering

HypEHR is a compact Lorentzian model that uses hyperbolic geometry for Electronic Health Record (EHR) question answering, addressing the cost and hierarchical-structure limitations of LLM-based methods. It is pretrained on next-visit diagnosis prediction and alignment with medical ontologies, and achieves performance comparable to LLMs with significantly fewer parameters.

27
RESEARCH · arXiv CS.LG · 4/9/2026

A Benchmark of Classical and Deep Learning Models for Agricultural Commodity Price Forecasting on A Novel Bangladeshi Market Price Dataset

This paper presents AgriPriceBD, a new daily dataset of agricultural commodity prices from Bangladesh, extracted with the help of an LLM. It evaluates seven forecasting approaches, including classical models and deep learning architectures, with a view to income stabilization and food security.

27
RESEARCH · arXiv CS.LG · 8d ago

Automatically Differentiable Nonlinear Tensor Networks (ADNTNs) for Exponential Compression of Deep Neural Networks

This paper introduces Automatically Differentiable Nonlinear Tensor Networks (ADNTNs), a family of structured weight generators for exponential compression of Deep Neural Networks. It extends low-rank adaptation and tensor factorization by building large weight tensors through a hierarchy of small cores and nonlinear activations.

27