deep learning

263 items

RESEARCH·DEV.to AI·4d ago

Aligning where to see and what to tell: image caption with region-based attention and scene factorization

This work introduces a method for image caption generation, utilizing region-based attention and scene factorization to enhance descriptive relevance and accuracy. It aims to more effectively align visual perception with textual narration.

scene understanding deep learning computer vision attention mechanisms

ARTICLE·DEV.to AI·18d ago

Understanding Transformer Architecture in 2026 (SilentRecon Deep Dive)

The "SilentRecon Deep Dive" article explores the Transformer architecture, explaining how it surpassed RNNs and LSTMs by enabling parallel processing and attention. The article credits this with scalability, faster training, deeper contextual understanding, and real-time inference, which it says made Transformers the default intelligence layer for cybersecurity and automation.

Transformer Architecture cybersecurity deep learning learning

RESEARCH·arXiv CS.AI·20d ago

Learn-by-Wire Training Control Governance: Bounded Autonomous Training Under Stress for Stability and Efficiency

This paper introduces Learn-by-Wire Guard (LBW-Guard), a bounded autonomous training-control governance layer for language models. It aims to improve training stability and efficiency by observing telemetry and applying bounded control, significantly reducing final perplexity.

language models deep learning AI training model stability

ARTICLE·DEV.to AI·4/18/2026

Statistics after the loss of innocence: New rigor in the age of AI

This article analyzes the evolution of statistics in the age of AI, advocating for a shift to hybrid systems and treating statistics as an engineering discipline. It highlights the importance of safeguarding clinical trials, robust computational infrastructure, and new regulatory guidelines like ICH E20 to manage risks.

regulatory compliance deep learning AI risk management

RESEARCH·DEV.to AI·4/15/2026

Alzheimer's Disease Diagnostics by a Deeply Supervised Adaptable 3D Convolutional Network

This content presents a methodology for diagnosing Alzheimer's Disease using a deeply supervised and adaptable 3D Convolutional Network. The research explores the use of advanced deep learning to improve accuracy in medical image diagnostics.

deep learning Convolutional Neural Networks 3D CNN AI

ARTICLE·DEV.to AI·27d ago

Lambda — Deep Dive

Lambda is a specialized AI infrastructure provider focused on GPU compute and machine learning tooling, carving a critical niche in the AI hardware landscape. Unlike generalist hyperscalers, the company's mission is to enable seamless transitions from prototypes to massive production workloads for its diverse customer base.

GPU compute deep learning cloud computing machine learning

RESEARCH·arXiv CS.LG·4/28/2026

Stochastic KV Routing: Enabling Adaptive Depth-Wise Cache Sharing

This work addresses the significant memory footprint of Key-Value (KV) caching in transformer language models, proposing optimization through the depth dimension. It introduces a method for cross-layer cache sharing, demonstrating that dropping a layer's cache can be efficient without information loss, and suggests a training approach with random cross-layer attention.

deep learning Memory Optimization large language models Transformers
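
To make the memory concern in the summary above concrete, here is a rough back-of-the-envelope sketch of how KV cache size scales and what sharing caches across layers could save. It is not the paper's method: the function name `kv_cache_bytes`, the model shape, and the "every second layer reuses a neighbour's cache" scenario are all assumptions chosen for illustration.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_value=2):
    """Approximate KV cache size in bytes.

    Each layer that keeps a cache stores one key tensor and one value
    tensor, hence the factor of 2. bytes_per_value=2 assumes 16-bit values.
    """
    per_layer = 2 * num_kv_heads * head_dim * seq_len * batch_size * bytes_per_value
    return num_layers * per_layer


# Hypothetical model shape, not taken from the paper.
full = kv_cache_bytes(num_layers=32, num_kv_heads=32, head_dim=128, seq_len=4096)

# Assumed sharing scheme: half of the layers reuse another layer's cache,
# so only 16 layers store their own keys and values.
shared = kv_cache_bytes(num_layers=16, num_kv_heads=32, head_dim=128, seq_len=4096)

print(full / 2**30, shared / 2**30)  # sizes in GiB
```

Under these assumed numbers the cache shrinks in direct proportion to the number of layers that still store their own keys and values, which is why the depth dimension is an attractive place to look for savings.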

RESEARCH·arXiv CS.LG·4/28/2026

The Spectral Lifecycle of Transformer Training: Transient Compression Waves, Persistent Spectral Gradients, and the Q/K--V Asymmetry

This systematic study of singular value spectra during transformer pretraining reports three key phenomena: transient compression waves that propagate through layers, persistent spectral gradients, and a Q/K–V functional asymmetry, in which query/key projections drive depth-dependent dynamics while value/output projections compress uniformly.

neural networks deep learning Model Analysis training dynamics

RESEARCH·arXiv CS.LG·5/1/2026

Cross-Subject Generalization for EEG Decoding: A Survey of Deep Learning Methods

This survey reviews deep learning methods for cross-subject EEG decoding, addressing the challenge of high inter-subject variability and domain shift. It categorizes current literature into methodological families like feature alignment and contrastive learning, emphasizing rigorous evaluation and theoretical considerations.

Generalization deep learning Biomedical AI EEG

RESEARCH·arXiv CS.LG·19d ago

Geometry-Lite: Interpretable Safety Probing via Layer-Wise Margin Geometry

Geometry-Lite is a novel prompt-level probe designed to interpret how safety evidence develops across layers in large language models. It analyzes layer-wise margin geometry using various readouts to understand boundary formation, improving safety detection over single-layer probes.

deep learning Probing interpretability large language models

RESEARCH·arXiv CS.LG·17d ago

Don't Collapse Your Features: Why CenterLoss Hurts OOD Detection and Multi-Scale Mahalanobis Wins

This research introduces GOEN, a novel pipeline for out-of-distribution (OOD) detection, which effectively combines multi-scale features and Mahalanobis distance. It reveals that CenterLoss surprisingly degrades OOD detection performance, with GOEN-NoCenterLoss achieving state-of-the-art results.

OOD Detection Epistemic Uncertainty Feature Engineering deep learning

RESEARCH·arXiv CS.LG·20d ago

Dimensional Balance Improves Large Scale Spatiotemporal Prediction Performance

This paper proposes a scalable, adaptive framework to improve spatiotemporal prediction by harmonizing spatial and temporal feature representations. It addresses bottlenecks in existing methods through spatial and temporal entropy measures to tackle complexity mismatch and prediction uncertainty.

model performance deep learning spatiotemporal prediction machine learning

RESEARCH·arXiv CS.LG·27d ago

$\xi$-DPO: Direct Preference Optimization via Ratio Reward Margin

This paper introduces $\xi$-DPO, a direct preference optimization method using a ratio reward margin, to address the challenge of hyperparameter tuning in SimPO. The research analyzes SimPO and reformulates the preference objective to improve interpretability across datasets with varying reward gap structures.

Preference Optimization deep learning reinforcement learning Hyperparameter Tuning

RESEARCH·arXiv CS.LG·27d ago

Interpretable EEG Microstate Discovery via Variational Deep Embedding: A Systematic Architecture Search with Multi-Quadrant Evaluation

This paper introduces the Convolutional Variational Deep Embedding (Conv-VaDE) model for EEG microstate analysis. It enhances interpretability by jointly learning topographic reconstruction and probabilistic soft clustering, enabling generative decoding of cluster prototypes into verifiable scalp topographies.

deep learning machine learning Neuroscience medical AI

RESEARCH·arXiv CS.LG·20d ago

Simply Stabilizing the Loop via Fully Looped Transformer

Looped Transformers provide a way to improve model performance by iteratively reusing blocks without increasing parameter count, but they suffer from training instability at higher loop iterations. This instability is attributed to gradient oscillation and residual explosion, leading to the proposal of the Fully Looped Transformer, which introduces a Fully Looped Architecture and Attention Injection.

neural networks AI architecture deep learning model training

DOC·AWS Machine Learning Blog·6d ago

Reducing container cold start times using SOCI index on DLAMI and DLC

This post demonstrates how to utilize the SOCI index on publicly available Deep Learning AMIs and Containers to reduce cold start times. It covers the various SOCI modes and provides guidance on efficiently implementing this tool in current workloads.

Containers SOCI deep learning cloud computing

RESEARCH·arXiv CS.LG·5d ago

LiftQuant: Continuous Bit-Width LLM via Dimensional Lifting and Projection

LiftQuant is a novel framework for continuous bit-width control in Large Language Models, addressing limitations of integer-based quantization. It employs a "lift-then-project" mechanism to achieve quasi-continuous bit-width tuning for optimal deployment.

Model Compression neural networks LLMs deep learning

DOC·DEV.to AI·4d ago

This content details the Global API service, offering access to 184 AI models with competitive pricing, such as DeepSeek V4 Flash at $0.25/M and GPT-4o. It highlights features like a 99.9% SLA, 50 free requests per minute, and never-expiring credits, alongside Pro Channel options for advanced needs.

AI models deep learning cloud services API

RESEARCH·DEV.to AI·4/10/2026

Deep Reinforcement Learning for Sepsis Treatment

This content covers the application of Deep Reinforcement Learning to the treatment of sepsis, a serious medical condition. It explores how advanced AI techniques can optimize therapeutic decisions in complex clinical settings.

Medical Treatment deep learning reinforcement learning Sepsis

RESEARCH·DEV.to AI·4/8/2026

An All-in-One Network for Dehazing and Beyond

This content explores a unified neural network designed to remove haze from images and potentially perform other image processing tasks. It covers advanced solutions in computer vision and artificial intelligence.

Image processing deep learning computer vision Dehazing