machine learning

790 items

RESEARCH · arXiv CS.AI · 13d ago

Why LLMs Fail at Causal Discovery and How Interventional Agents Escape

This research paper reveals that large language models fundamentally fail at causal discovery due to their inability to distinguish between causal graphs generating similar observational data. It introduces a "kernel obstruction theorem" to formalize this intrinsic limitation of current learning paradigms.

LLMs research Causal Discovery machine learning

RESEARCH · arXiv CS.LG · 14d ago

TSFMAudit: Data Contamination Auditing in Forecasting Time Series Foundation Models

This work introduces TSFMAudit, a novel method for auditing data contamination in Time Series Foundation Models (TSFMs) during pretraining. It detects when evaluation datasets have been unduly exposed, leading to overly optimistic performance estimates, by observing unusually efficient adaptation during fine-tuning. The study evaluates TSFMAudit on 6 TSFMs and 187 datasets, addressing a previously unstudied challenge in pretraining contamination auditing for TSFMs.

time-series-models data-auditing security machine learning

DOC · DEV.to AI · 4/24/2026

Visualizing Data using GTSNE

This content explores data visualization using GTSNE, an advanced technique for reducing the dimensionality of complex datasets. It details how to apply GTSNE to reveal intrinsic patterns and structures in high-dimensional data, facilitating interpretation and analysis.

Dimensionality Reduction machine learning AI data visualization

RESEARCH · arXiv CS.CL · 20d ago

Data Scaling as Progressive Coverage of a Predictive Contribution Spectrum

This research investigates whether real-data scaling laws are governed by a progressive coverage of a latent predictive contribution spectrum, rather than solely by token-frequency. Using a suffix-automaton and a global-KL predictive contribution spectrum, the study finds a strong correlation between the spectrum's tail slope and the data-scaling exponent of GPT learners, showing that effective truncation rank scales logarithmically.

language models data scaling machine learning predictive models

RESEARCH · arXiv CS.AI · 23d ago

Fair outputs, Biased Internals: Causal Potency and Asymmetry of Latent Bias in LLMs for High-Stakes Decisions

This research paper explores the disconnect between the fair outputs of language models and their latent internal biases in high-stakes decisions such as mortgage underwriting. It demonstrates that although LLMs may show no output bias, they retain and amplify demographic representations internally, that these representations can cause decision reversals, and that the bias is asymmetric.

LLM bias machine learning causality AI ethics

RESEARCH · arXiv CS.LG · 6d ago

Early Detection of Alzheimer's Disease Using Explainable Machine Learning on Clinical Biomarkers: A Multi-Class Classification Study Using the Alzheimer's Disease Neuroimaging Initiative (ADNI) Dataset

An XGBoost classifier was developed using clinical features from the ADNI dataset for multi-class detection of normal cognition, mild cognitive impairment, and Alzheimer's disease. The model achieved a high mean macro AUC of 0.983 and an accuracy of 0.944, with SHAP values providing feature explainability.

machine learning Alzheimer's disease Explainable AI XGBoost

RESEARCH · arXiv CS.AI · 8d ago

Universal Quantum Transformer

The Universal Quantum Transformer (UQT) is a novel quantum-native computing architecture designed to overcome classical neural networks' struggles with exact mathematical symmetries. It leverages physical properties of multi-qubit systems for precise mathematical and algebraic reasoning, demonstrating perfect learning of cyclic modular arithmetic on a compact 5-qubit substrate.

Quantum Computing neural networks AI architecture machine learning

ARTICLE · DEV.to AI · 4/19/2026

MLOps in 2026: Production Machine Learning Best Practices

This article analyzes MLOps in 2026, focusing on best practices, core concepts, and tools for production machine learning. It also covers industry growth and key statistics on mainstream adoption.

MLOps Production machine learning best practices

ARTICLE · DEV.to AI · 4/14/2026

How to Become a Certified AI Project Manager quickly

This article outlines a step-by-step approach to becoming a certified AI Project Manager quickly, a crucial role in managing AI initiatives from concept to deployment. It highlights the need to understand foundational AI concepts such as Machine Learning and NLP in order to bridge the gap between data scientists, developers, and business stakeholders.

certification AI project management machine learning Career Path

ARTICLE · DEV.to AI · 4d ago

Agent Lightning: Train ANY AI Agents with Reinforcement Learning

Agent Lightning is a framework designed to train any AI agents using Reinforcement Learning. It aims to simplify and accelerate the process of developing and optimizing intelligent agents.

reinforcement learning AI Training machine learning AI agents

RESEARCH · arXiv CS.LG · 4/16/2026

Adaptive Memory Crystallization for Autonomous AI Agent Learning in Dynamic Environments

This research introduces Adaptive Memory Crystallization (AMC), a novel memory architecture designed for autonomous AI agents to progressively consolidate experiences in dynamic environments without forgetting prior knowledge. AMC models memory as a continuous crystallization process across a three-phase hierarchy, inspired by synaptic tagging and capture theory and governed by stochastic differential equations.

reinforcement learning machine learning memory architecture AI agents

RESEARCH · arXiv CS.LG · 4/16/2026

Generalization Guarantees on Data-Driven Tuning of Gradient Descent with Langevin Updates

This paper introduces the Langevin Gradient Descent (LGD) algorithm for convex regression problems, proving that optimal hyperparameter configurations achieve the Bayes-optimal solution. The work also provides generalization guarantees for meta-learning LGD's optimal hyperparameters, with a pseudo-dimension bound of O(dh).

Meta-Learning Optimization Generalization Hyperparameter Tuning

DOC · DEV.to AI · 4d ago

Decision Trees — A Beginner Technical Guide

Decision Trees are intuitive machine learning models that mimic human decision-making processes by asking a sequence of yes/no questions. They are fundamental not only as standalone models but also as the basis for more complex algorithms in modern machine learning.

decision trees learning machine learning data science

ARTICLE · DEV.to AI · 27d ago

Everything Google announced at its Android Show, from Googlebooks to vibe-coded widgets

The article gives a technical analysis of Google's Android Show announcements, focusing on the new Google Books app and vibe-coded widgets. It details how Google Books uses a proprietary rendering engine with ML for text recognition, while vibe-coded widgets leverage NLP and computer vision via TensorFlow Lite for personalized experiences.

Android machine learning computer vision Natural Language Processing

RESEARCH · arXiv CS.LG · 5/7/2026

Structured Progressive Knowledge Activation for LLM-Driven Neural Architecture Search

This paper introduces Structured Progressive Knowledge Activation (SPARK) to address the challenge of integrating architectural knowledge in LLM-driven Neural Architecture Search (NAS). SPARK mitigates "functional entanglement" by enabling factor-conditioned editing, leading to more targeted and reliable architecture modifications.

Neural Architecture Search machine learning Knowledge Integration large language models

RESEARCH · arXiv CS.LG · 4/22/2026

Towards Understanding the Robustness of Sparse Autoencoders

This research explores the robustness implications of Sparse Autoencoders (SAEs) against jailbreak attacks on Large Language Models (LLMs). Integrating pretrained SAEs at inference time reduces jailbreak success rates by up to 5x and decreases cross-model attack transferability across various LLM families.

LLMs security machine learning

ARTICLE · DEV.to AI · 15d ago

The Meaning of 'Meaning': When AI Searches for the Line Between Remembering and Illusion

This article examines how AI 'understands meaning' compared with humans, through the lens of neuroscience, AI ethics, and constrained creativity. The core philosophical and technical question is whether AI genuinely 'understands meaning' or merely creates an illusion of continuity, unlike human memory, which involves continuous selection and interpretation.

cognitive science machine learning Neuroscience philosophy of AI

RESEARCH · arXiv CS.LG · 4/22/2026

Handling and Interpreting Missing Modalities in Patient Clinical Trajectories via Autoregressive Sequence Modeling

This work addresses the challenge of missing modalities in multimodal clinical data for diagnosis by reframing it as an autoregressive sequence modeling task. It leverages causal decoders from LLMs and a missingness-aware contrastive pre-training to outperform baselines on benchmarks like MIMIC-IV and eICU.

multimodal AI machine learning large language models healthcare AI

ARTICLE · DEV.to AI · 27d ago

Lambda — Deep Dive

Lambda is a specialized AI infrastructure provider focused on GPU compute and machine learning tooling, carving a critical niche in the AI hardware landscape. Unlike generalist hyperscalers, the company's mission is to enable seamless transitions from prototypes to massive production workloads for its diverse customer base.

GPU compute deep learning cloud computing machine learning

RESEARCH · arXiv CS.LG · 20d ago

Provably Learning Diffusion Models under the Manifold Hypothesis: Collapse and Refine

This paper provides a theoretical explanation for the efficiency of diffusion models in learning the score function for high-dimensional data supported on low-dimensional manifolds. It identifies a "collapse-and-refine" mechanism driven by the geometry of the score function, where the denoising map projects onto the data manifold and refines the intrinsic density.

Diffusion Models Theoretical AI machine learning Manifold Learning