deep learning

263 items

RESEARCH · DEV.to AI · 5/6/2026

RAVE: A variational autoencoder for fast and high-quality neural audio synthesis

RAVE is a variational autoencoder designed for fast, high-quality neural audio synthesis. The model improves both the efficiency and the fidelity of audio generated with deep learning.

deep learning audio synthesis neural audio Variational Autoencoder

RESEARCH · DEV.to AI · 21d ago

Ensemble of Deep Convolutional Neural Networks for Learning to Detect Retinal Vessels in Fundus Images

This research paper proposes a method for detecting retinal blood vessels in fundus images using an ensemble of Deep Convolutional Neural Networks. The approach aims to improve diagnostic accuracy through advanced image analysis.

ensemble methods deep learning Convolutional Neural Networks Medical Imaging

RESEARCH · DEV.to AI · 4/20/2026

Stable Video Infinity: Generating Infinite-Length Videos with Error Recycling

Stable Video Infinity introduces a novel solution for generating infinite-length videos, overcoming the problem of accumulated errors. Its core innovation is a sophisticated Error Recycling mechanism that prevents visual degradation over time.

deep learning machine learning AI video generation

ARTICLE · DEV.to AI · 28d ago

Multi-Head Attention: Collaborate Instead of Concatenate

This article explores the multi-head attention mechanism in AI models, centered on the idea that attention heads should collaborate rather than simply be concatenated. It likely discusses an alternative approach aimed at improving attention efficiency or performance.

deep learning Attention Mechanism machine learning AI

RESEARCH · arXiv CS.LG · 14d ago

Iterative Refinement Neural Operators are Learned Fixed-Point Solvers: A Principled Approach to Spectral Bias Mitigation

This paper introduces the Iterative Refinement Neural Operator (IRNO) to mitigate spectral bias in neural operators, using a learned refinement module via fixed-point iteration. IRNO decomposes predictions into a coarse initialization followed by successive residual corrections, achieving significant error reduction across physical systems.
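
The coarse-then-correct pattern described above can be sketched in a few lines. This is a minimal illustration of fixed-point residual refinement only, not the IRNO model: the linear system, the zero initial guess, and the hand-written Jacobi-style `correct` function are all assumptions standing in for the physical system, the coarse prediction, and the learned refinement module.

```python
def matvec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def refine(A, b, coarse, correct, steps):
    """Start from a coarse guess, then repeatedly add a correction
    computed from the current residual b - A x (a fixed-point iteration)."""
    x = coarse(b)
    history = []
    for _ in range(steps):
        residual = [bi - yi for bi, yi in zip(b, matvec(A, x))]
        history.append(max(abs(r) for r in residual))
        delta = correct(residual)
        x = [xi + di for xi, di in zip(x, delta)]
    return x, history


# Hypothetical toy problem: a small diagonally dominant linear system.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]

# Assumption: a zero vector plays the role of the coarse initialization.
coarse = lambda rhs: [0.0 for _ in rhs]

# Assumption: a fixed Jacobi-style step replaces the learned refinement module.
correct = lambda r: [ri / A[i][i] for i, ri in enumerate(r)]

x, history = refine(A, b, coarse, correct, steps=25)
print(x)            # approximate solution of A x = b
print(history[:3])  # residual size shrinks from step to step
```

In a learned setting, `correct` would be a trained network applied to the residual; the loop structure stays the same.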

deep learning Neural Operators Scientific Computing Iterative Methods

RESEARCH · arXiv CS.AI · 4d ago

An interpretable and trustworthy AI framework for large-scale longitudinal structure-pain association studies using data from the Osteoarthritis Initiative (OAI)

This research develops an interpretable AI framework combining deep learning-based MRI Osteoarthritis Knee Score (MOAKS) prediction with interpretable statistical modeling to study structure-pain relationships at scale using data from the Osteoarthritis Initiative (OAI). It utilizes deep learning for MOAKS feature prediction from MRIs with uncertainty quantification, followed by a longitudinal latent class mixed model to examine associations between structural abnormalities and knee pain.

deep learning Healthcare Osteoarthritis AI

RESEARCH · arXiv CS.CL · 4d ago

Multi-Granularity Reasoning for Natural Language Inference

The paper proposes a novel Multi-Granularity Reasoning Network (MGRN) for Natural Language Inference (NLI). It addresses the limitations of existing transformer-based models by leveraging hierarchical semantic features to capture complex interactions for effective reasoning.

deep learning Natural Language Inference machine learning Natural Language Processing

RESEARCH · arXiv CS.LG · 4/6/2026

From Broad Exploration to Stable Synthesis: Entropy-Guided Optimization for Autoregressive Image Generation

The paper analyzes the interplay between Chain-of-Thought (CoT) and Reinforcement Learning (RL) in text-to-image (T2I) generation through a systematic entropy-based analysis. It finds that lower entropy in both the image tokens and the textual CoT correlates with better image quality, and proposes Entropy-Guided Group Relative Policy Optimization (EG-GRPO), a strategy that optimizes based on uncertainty.

Optimization deep learning reinforcement learning Text-to-Image Generation

DOC · DEV.to AI · 4/17/2026

Understanding Transformers Part 9: Stacking Self-Attention Layers

This article explains why self-attention values replace original positional encodings, as they integrate contextual information from all words, clarifying relationships. It then introduces stacking multiple self-attention layers, each with unique weights, to capture more complex linguistic relationships within sentences and paragraphs.
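
A rough sketch of stacking self-attention layers, each with its own weights, is shown below. It is an assumption-laden toy rather than the article's code: the helper names, the random weights, and the 3-token, 4-dimensional input are hypothetical, and real Transformer layers also include residual connections, normalization, and feed-forward blocks that are omitted here.

```python
import math
import random


def matmul(a, b):
    """Plain-Python matrix product of a (n x m) and b (m x p)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention over token rows x."""
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v), weights


def random_matrix(rng, size):
    return [[rng.uniform(-1.0, 1.0) for _ in range(size)] for _ in range(size)]


def stack(x, layers):
    """Feed the output of each layer into the next one."""
    for wq, wk, wv in layers:
        x, _ = self_attention(x, wq, wk, wv)
    return x


rng = random.Random(0)
dim = 4
x = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(3)]  # 3 tokens

# Each layer gets its own query, key, and value weights.
layers = [tuple(random_matrix(rng, dim) for _ in range(3)) for _ in range(2)]
out = stack(x, layers)
print(len(out), len(out[0]))  # same shape as the input: 3 tokens, 4 dims
```

Because every layer keeps the token count and width unchanged, layers can be stacked to any depth.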

neural networks Self-Attention deep learning NLP

RESEARCH · DEV.to AI · 4/19/2026

F-VLM: Open-Vocabulary Object Detection upon Frozen Vision and Language Models

F-VLM introduces a novel approach for open-vocabulary object detection by efficiently leveraging frozen pre-trained vision and language models. This method allows for identifying a wide range of objects without requiring specific training data for each new category.

Vision-Language Models deep learning object detection computer vision

RESEARCH · DEV.to AI · 5/2/2026

Deep convolutional recurrent autoencoders for learning low-dimensional feature dynamics of fluid systems

This content discusses the application of deep convolutional recurrent autoencoders to learn low-dimensional feature dynamics in fluid systems.

Dimensionality Reduction fluid dynamics deep learning autoencoders

RESEARCH · DEV.to AI · 5/7/2026

Stateless scheduler doubles LLM training speed

Fine-tuning large language models often faces bottlenecks from rigid GPU allocation and inefficient pipeline parallelism. A new stateless scheduler, RoundPipe, optimizes training by dynamically dispatching computation stages across a pool of GPUs, effectively doubling LLM training speed.

deep learning machine learning GPU optimization Parallelism

ARTICLE · DEV.to AI · 27d ago

Comparing AI Approaches for Trade Promotion Strategies in Automotive

The text highlights that "AI trade promotion" encompasses a range of approaches, from rule-based systems to deep learning. Automotive OEMs must evaluate these methodologies based on tradeoffs like accuracy, complexity, and data maturity, much like choosing ADAS sensor configurations.

deep learning automotive machine learning AI

RESEARCH · arXiv CS.LG · 5/7/2026

Investigating Trustworthiness of Nonparametric Deep Survival Models for Alzheimer's Disease Progression Analysis

This research investigates the trustworthiness and fairness of nonparametric deep survival models for analyzing Alzheimer's Disease (AD) progression. It addresses the lack of studies considering learned bias in existing deep learning models for AD and proposes novel fairness metrics to ensure reliable predictions.

deep learning Alzheimer's disease survival analysis medical AI

RESEARCH · arXiv CS.LG · 6d ago

Graph Mamba Survival Analysis Based on Topology-Aware ordering

This paper addresses challenges in survival analysis on Whole Slide Images (WSIs), specifically the computational bottleneck of Transformers and Mamba's sensitivity to input order and its unidirectional architecture. It proposes a novel approach to overcome Mamba's limitations in capturing topological connectivity and bidirectional spatial structures.

deep learning survival analysis sequence models computational pathology

RESEARCH · DEV.to AI · 5/9/2026

DeXpression: Deep Convolutional Neural Network for Expression Recognition

DeXpression is a deep convolutional neural network model designed for accurate facial expression recognition. It aims to enhance computer vision systems' ability to interpret human emotions from images.

facial expression recognition deep learning computer vision Convolutional Neural Networks

RESEARCH · arXiv CS.CL · 4/10/2026

Hybrid CNN-Transformer Architecture for Arabic Speech Emotion Recognition

This paper presents an Arabic Speech Emotion Recognition (SER) system based on a hybrid CNN-Transformer architecture. The model combines convolutional layers for spectral feature extraction with Transformer encoders that capture temporal dependencies, reaching 97.8% accuracy and a macro F1-score of 0.98.

CNN deep learning Transformer machine learning

RESEARCH · DEV.to AI · 13d ago

MimicMotion: High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance

MimicMotion is a research project focused on high-quality human motion video generation. It utilizes confidence-aware pose guidance to enhance the quality of the generated videos.

deep learning pose guidance AI video generation

RESEARCH · arXiv CS.LG · 4/17/2026

Awakening Dormant Experts: Counterfactual Routing to Mitigate MoE Hallucinations

Mixture-of-Experts (MoE) models are prone to hallucinations, particularly for long-tail knowledge, because static Top-k routing under-prioritizes specialist experts. Counterfactual Routing (CoR) is proposed as a training-free inference framework that uses perturbation analysis and CEI to dynamically shift computational resources, thereby awakening these dormant experts.
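
For readers unfamiliar with the static Top-k routing mentioned above, a minimal sketch follows. It illustrates the baseline only, not Counterfactual Routing: the function name `top_k_route`, the example logits, and the choice to renormalize with a softmax over the selected experts are assumptions (one common convention among several).

```python
import math


def top_k_route(logits, k):
    """Pick the k experts with the highest router logits and
    renormalize their scores with a softmax over the selected set."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    m = max(logits[i] for i in top)
    exps = {i: math.exp(logits[i] - m) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


# Hypothetical router logits for four experts on one token.
gates = top_k_route([2.0, 0.1, 1.5, -1.0], k=2)
print(gates)  # only experts 0 and 2 receive any weight
```

Experts outside the top k get no compute for that token, which is how a specialist with a consistently modest score can stay dormant.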

neural networks AI hallucinations deep learning Mixture of Experts

RESEARCH · arXiv CS.LG · 15d ago

Reading Calibrated Uncertainty from Language Model Trajectories

This research paper proposes a new method to quantify uncertainty in language models by tracing the cumulative path of per-layer MLP updates. By extracting eleven scale-invariant geometric features, a sparse linear probe is shown to outperform maximum softmax probability in evaluating uncertainty, especially with baseline miscalibration.
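
The maximum softmax probability baseline named above is simple to state in code. The sketch below covers that baseline only, not the paper's trajectory-based probe; the function name and the example logits are hypothetical.

```python
import math


def max_softmax_probability(logits):
    """Confidence score: the largest probability after a softmax over the logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    return max(exps) / sum(exps)


confident = max_softmax_probability([5.0, 0.0, 0.0])
uncertain = max_softmax_probability([1.0, 1.0, 1.0])
print(confident > uncertain)  # a peaked distribution scores higher than a flat one
```

A score like this depends entirely on the final output distribution, so it inherits any miscalibration already present in the model's logits.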

language models deep learning Uncertainty Quantification model calibration