machine learning

790 items

ARTICLE·DeepLearning.AI (YouTube)·20d ago

AI Dev 26 x SF | Eda Zhou & Mahdi Ghodsi: Building Personal AI Agents with Open Source Models

Eda Zhou and Mahdi Ghodsi present how to build personal AI agents with open-source models, covering the approaches and technologies involved in developing personalized AI solutions.

open source models machine learning AI development Personal AI

RESEARCH·DEV.to AI·4/22/2026

Convergence Analysis for Rectangular Matrix Completion Using Burer-Monteiro Factorization and Gradient Descent

This research paper focuses on the theoretical convergence analysis of algorithms for rectangular matrix completion. It specifically investigates the combination of Burer-Monteiro factorization and gradient descent methods.

Gradient Descent Optimization Matrix Completion machine learning
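
As a rough illustration of the technique named in this item, the sketch below fits a low-rank factorization M ≈ U Vᵀ to the observed entries of a rectangular matrix by plain gradient descent on the squared error. It is not the paper's algorithm or analysis: the function name `complete_matrix`, the deterministic initialization, the step size, and the toy data are all assumptions chosen for readability.

```python
def complete_matrix(M, mask, rank=1, lr=0.01, steps=2000):
    """Fit M ~= U V^T on observed entries (mask[i][j] truthy) by gradient descent."""
    m, n = len(M), len(M[0])
    # Small deterministic positive initialization (an illustrative assumption).
    U = [[0.5 + 0.01 * (i + k) for k in range(rank)] for i in range(m)]
    V = [[0.5 + 0.01 * (j + k) for k in range(rank)] for j in range(n)]
    losses = []
    for _ in range(steps):
        grad_U = [[0.0] * rank for _ in range(m)]
        grad_V = [[0.0] * rank for _ in range(n)]
        loss = 0.0
        for i in range(m):
            for j in range(n):
                if not mask[i][j]:
                    continue  # unobserved entries do not contribute to the loss
                pred = sum(U[i][k] * V[j][k] for k in range(rank))
                resid = pred - M[i][j]
                loss += 0.5 * resid * resid
                for k in range(rank):
                    grad_U[i][k] += resid * V[j][k]
                    grad_V[j][k] += resid * U[i][k]
        losses.append(loss)
        # Simultaneous update of both factors using the gradients computed above.
        for i in range(m):
            for k in range(rank):
                U[i][k] -= lr * grad_U[i][k]
        for j in range(n):
            for k in range(rank):
                V[j][k] -= lr * grad_V[j][k]
    return U, V, losses


# Toy rank-1 matrix (3 x 4) with two entries hidden from the fit.
u_true, v_true = [1.0, 2.0, 3.0], [1.0, 0.5, 2.0, 1.5]
M = [[a * b for b in v_true] for a in u_true]
mask = [[1, 1, 1, 0], [1, 1, 1, 1], [1, 0, 1, 1]]
U, V, losses = complete_matrix(M, mask)
```

The factors are rectangular-friendly by construction: U has one row per row of M and V one row per column, so no symmetry is assumed.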

RESEARCH·DEV.to AI·4/22/2026

Algorithms, Initializations, and Convergence for the Nonnegative Matrix Factorization

This work examines Nonnegative Matrix Factorization (NMF), covering its algorithms and initialization strategies and how they affect convergence. It analyzes in detail how these factors influence the performance and stability of NMF solutions.

machine learning data science Algorithms

DOC·AWS Machine Learning Blog·27d ago

Build financial document processing with Pulse AI and Amazon Bedrock

This post demonstrates how to build a document extraction and model fine-tuning pipeline for complex financial documents, combining Pulse AI's capabilities with Amazon Bedrock services. Organizations can achieve enterprise-grade accuracy and extract contextually relevant financial insights at scale.

Financial services machine learning Amazon Bedrock document processing

RESEARCH·arXiv CS.AI·4/16/2026

Optimizing Earth Observation Satellite Schedules under Unknown Operational Constraints: An Active Constraint Acquisition Approach

This paper addresses Earth Observation satellite scheduling under unknown operational constraints, which must be learned interactively from a binary oracle. The authors introduce Conservative Constraint Acquisition (CCA), a domain-specific procedure, to efficiently identify justified constraints for a simplified model.

Optimization machine learning Constraint Acquisition satellite scheduling

RESEARCH·arXiv CS.CL·5/1/2026

Selective Augmentation: Improving Universal Automatic Phonetic Transcription via G2P Bootstrapping

This research proposes Selective Augmentation, a bootstrapping method to improve universal automatic phonetic transcription (APT) by selectively transferring linguistic distinctions to address limited high-quality training data. Exemplified with the MultIPA model, the approach enhanced plosive voicing accuracy by 17.6% and introduced aspiration recognition using data augmented from a helper language like Hindi.

machine learning phonetics Data Augmentation Speech Recognition

RESEARCH·arXiv CS.LG·5/1/2026

Monitoring Neural Training with Topology: A Footprint-Predictable Collapse Index

A new topology-aware monitor, the Collapse Index (CI), is proposed to detect representational collapse early in neural training. It uses fast, incremental updates to provide a low-latency early-warning signal for interventions in LLM fine-tuning and KGE training.

neural networks monitoring topology model training

RESEARCH·arXiv CS.CL·5/1/2026

Targeted Linguistic Analysis of Sign Language Models with Minimal Translation Pairs

The paper introduces ASL-MTP, a new benchmark dataset for analyzing how well sign language models capture linguistic phenomena and utilize multi-articulator cues. It uses this dataset to conduct a targeted linguistic analysis of a state-of-the-art ASL-to-English translation model.

machine learning Sign Language AI benchmarking Natural Language Processing

RESEARCH·arXiv CS.LG·4/17/2026

Explainable Graph Neural Networks for Interbank Contagion Surveillance: A Regulatory-Aligned Framework for the U.S. Banking Sector

The ST-GAT framework provides an explainable Graph Neural Network solution for early detection of bank distress and interbank contagion surveillance in the U.S. banking sector. It models over 8,000 FDIC institutions using dynamic graphs, achieving high performance (AUPRC 0.939) and identifying key predictive factors like ROA and NPL Ratio.

Graph Neural Networks machine learning banking Explainable AI

RESEARCH·arXiv CS.LG·4/22/2026

FASE: A Fairness-Aware Spatiotemporal Event Graph Framework for Predictive Policing

FASE is a Fairness-Aware Spatiotemporal Event Graph framework designed to integrate crime prediction with fairness-constrained patrol allocation to mitigate racial disparities in predictive policing. It utilizes a spatiotemporal graph neural network and a multivariate Hawkes process to model crime incidents in Baltimore, addressing data bias through a closed-loop deployment simulator.

Predictive Policing machine learning AI ethics fairness

RESEARCH·arXiv CS.LG·23d ago

Reducing the Safety Tax in LLM Safety Alignment with On-Policy Self-Distillation

This paper introduces on-policy self-distillation (OPSA) to reduce the "safety tax" in LLM safety alignment. OPSA addresses the distributional mismatch of off-policy training by having the model generate its own rollouts and receive dense per-token KL supervision from a frozen teacher.

LLMs machine learning alignment AI safety
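
To make "dense per-token KL supervision" concrete, here is a minimal sketch that computes one KL divergence per token position between a student's and a teacher's next-token distributions, given raw logits. The function names, the KL direction (student relative to teacher), and the toy logits are assumptions for illustration; the summary does not specify which direction or implementation the paper uses.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one vocabulary-sized row."""
    peak = max(logits)
    log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_z for x in logits]


def per_token_kl(student_logits, teacher_logits):
    """Return KL(student || teacher) for each token position.

    Both arguments are lists of rows, one row of vocabulary logits per token.
    The direction of the KL is an assumption made for this sketch.
    """
    kls = []
    for s_row, t_row in zip(student_logits, teacher_logits):
        log_p = log_softmax(s_row)  # student log-probabilities
        log_q = log_softmax(t_row)  # frozen-teacher log-probabilities
        kls.append(sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q)))
    return kls


# Two token positions over a toy 3-word vocabulary.
student = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]]
teacher = [[1.0, 2.0, 3.0], [0.0, 1.0, 2.0]]
kl_values = per_token_kl(student, teacher)
```

In a training loop these per-token values would be summed or averaged into a loss on rollouts sampled from the student itself, with the teacher's parameters held fixed.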

RESEARCH·arXiv CS.CL·16d ago

Learnability-Informed Fine-Tuning of Diffusion Language Models

This research introduces LIFT, a learnability-informed fine-tuning algorithm designed to enhance the reasoning capabilities of diffusion language models. LIFT addresses the shortcomings of standard SFT by adaptively learning tokens based on their difficulty and available context during different diffusion time steps, showing improved performance over existing baselines.

Diffusion Models learning machine learning Natural Language Processing

RESEARCH·arXiv CS.LG·6d ago

Inverse Critical Experiment Design via Gradient Optimization and a Multigroup Attention-Based Neural Network Architecture

This research presents a methodology for the inverse design of critical experiments, essential for validating advanced nuclear reactor designs. It employs deep neural network surrogate modeling and nonparametric gradient optimization to generate experiment geometries that maximize neutronic similarity.

neural networks Optimization nuclear engineering machine learning

RESEARCH·arXiv CS.LG·6d ago

Unlocking Feature Learning in Gated Delta Networks at Scale

This paper derives scaling rules for Gated Delta Networks to address the computational demands of training and scaling Large Language Models. Experiments validate that these configurations enable stable learning-rate transfer across various model widths, unlike standard parametrization.

neural networks learning Hyperparameter Tuning machine learning

RESEARCH·arXiv CS.LG·6d ago

Novel Aspects of IEEE SA P3109 Arithmetic Formats for Machine Learning

The IEEE P3109 draft standard defines parameterized binary floating-point formats and associated operations, specifically designed to facilitate machine learning by allowing efficient representation in few bits. It ensures exception-free operations by explicit treatment of NaN and infinities, with results communicated via return values.

Floating-Point Arithmetic Data Representation machine learning Precision

RESEARCH·arXiv CS.LG·8d ago

DAStatFormer: A Hybrid Multibranch Transformer with Statistical Feature Integration for DAS-Based Pattern Recognitions

DAStatFormer is a hybrid multibranch Transformer proposed to overcome the challenges of high dimensionality and complex spatio-temporal patterns in Distributed Acoustic Sensing (DAS). It integrates compact statistical features from multiple domains, significantly reducing data size and enhancing event classification.

deep learning machine learning pattern recognition distributed acoustic sensing

RESEARCH·arXiv CS.CL·8d ago

Toward Robust In-Context Learning: Leveraging Out-of-distribution Proxies for Target Inaccessible Demonstration Retrieval

This paper introduces DOPA, a demonstration search framework for robust in-context learning with Large Language Models (LLMs). DOPA uses an OOD proxy to approximate inaccessible target domains and a Mahalanobis distance-based global diversity constraint for demonstration retrieval.

LLMs learning machine learning in-context learning
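
The Mahalanobis distance mentioned in this item measures how far a point lies from a distribution's mean, scaled by that distribution's covariance. The sketch below computes it for 2-D points with a hand-written 2x2 inverse. It illustrates only the distance itself, not how DOPA builds its diversity constraint; the function name `mahalanobis_2d` and the sample values are hypothetical.

```python
import math


def mahalanobis_2d(x, mean, cov):
    """Mahalanobis distance of 2-D point x from `mean` under 2x2 covariance `cov`."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # Closed-form inverse of a 2x2 matrix.
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx = [x[0] - mean[0], x[1] - mean[1]]
    # Quadratic form dx^T * inv * dx.
    quad = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
            + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    return math.sqrt(quad)


# With identity covariance the result equals the Euclidean distance.
d_identity = mahalanobis_2d((3.0, 4.0), (0.0, 0.0), [[1.0, 0.0], [0.0, 1.0]])
# A larger variance along the first axis shrinks distances along that axis.
d_scaled = mahalanobis_2d((2.0, 0.0), (0.0, 0.0), [[4.0, 0.0], [0.0, 1.0]])
```

For embedding vectors of realistic dimension, a linear-algebra library would replace the hand-written inverse.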

RESEARCH·DEV.to AI·4/12/2026

A Neural Network based Approach for Predicting Customer Churn in Cellular Network Services

This work proposes a neural network-based approach to predict customer churn in cellular network services. The objective is to identify user behavior patterns to anticipate service abandonment.

Telecommunications machine learning data science customer churn prediction

ARTICLE·DEV.to AI·4/16/2026

9 Python Libraries to Supercharge Your Feature Engineering Efficiency

This article stresses how much feature engineering affects machine learning model performance and how large-scale datasets make it harder. It introduces 9 specialized Python libraries, including NVTabular, designed to improve efficiency and leverage GPU acceleration for massive data processing.

Feature Engineering machine learning AI Python

ARTICLE·DEV.to AI·4/11/2026

Deep Learning on FPGAs: Past, Present, and Future

This article reviews the evolution of Deep Learning implementation on FPGAs, covering its historical development, current state, and future directions. It also highlights the critical importance of hardware acceleration for the advancement of artificial intelligence.

Hardware Acceleration FPGAs deep learning machine learning