machine learning

790 items

RESEARCH · arXiv CS.LG · 5/4/2026

Learning physically grounded traffic accident reconstruction from public accident reports

This paper presents a method for traffic accident reconstruction from public reports and scene measurements, formulating it as a parameterized multimodal learning problem. Researchers created the CISS-REC dataset with 6,217 real-world cases and developed a framework that outperforms baselines in reconstruction fidelity, including accident point accuracy.

machine learning Reconstruction data analysis Traffic Accidents

RESEARCH · arXiv CS.LG · 5/4/2026

Information-Theoretic Generalization Bounds for Stochastic Gradient Descent with Predictable Virtual Noise

This paper introduces predictable history-adaptive virtual perturbations to enhance information-theoretic generalization bounds for Stochastic Gradient Descent. This new approach allows perturbation covariances to dynamically depend on past SGD history, addressing limitations of existing methods that require fixed covariances.

information theory Optimization Generalization machine learning

RESEARCH · arXiv CS.CL · 4/21/2026

Data Mixing for Large Language Models Pretraining: A Survey and Outlook

This paper provides a comprehensive survey on data mixing for Large Language Model (LLM) pretraining, a crucial factor for training efficiency and downstream generalization. It formalizes data mixture optimization as a bilevel problem and introduces a fine-grained taxonomy for existing methods.

data optimization pretraining machine learning large language models
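The bilevel view of data mixture optimization mentioned in the summary above can be sketched in a generic form. The notation here is an assumption made for illustration and is not taken from the survey: $w$ is a vector of mixture weights over $K$ data domains, $\Delta^{K-1}$ is the probability simplex, $\mathcal{L}_i$ is the training loss on domain $i$, and $\mathcal{L}_{\mathrm{val}}$ is a held-out or downstream objective.

```latex
\begin{aligned}
\min_{w \in \Delta^{K-1}} \quad & \mathcal{L}_{\mathrm{val}}\bigl(\theta^{*}(w)\bigr) \\
\text{s.t.} \quad & \theta^{*}(w) \in \arg\min_{\theta} \sum_{i=1}^{K} w_i \, \mathcal{L}_i(\theta)
\end{aligned}
```

The inner problem trains the model on the weighted mixture; the outer problem picks the weights whose trained model scores best on the target objective.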

ARTICLE · MIT Tech Review AI · 28d ago

World Models: 10 Things That Matter in AI Right Now

World Models are featured in "10 Things That Matter in AI Right Now", with executive editor Niall Firth explaining their rising importance. A subscriber-only discussion by MIT Technology Review will delve into how AI may evolve to understand the world.

AI trends machine learning World Models MIT Technology Review

RESEARCH · arXiv CS.LG · 18d ago

Temporal Contrastive Transformer for Financial Crime Detection: Self-Supervised Sequence Embeddings via Predictive Contrastive Coding

The Temporal Contrastive Transformer (TCT) is a new representation learning framework designed for financial transaction sequences to detect fraud. It uses self-supervised contrastive learning to generate embeddings that capture temporal behavioral patterns, showing meaningful predictive performance, especially when combined with domain-engineered features.

Financial AI security machine learning fraud detection

RESEARCH · arXiv CS.LG · 18d ago

Double descent for least-squares interpolation on contaminated data: A simulation study

This research investigates the "double descent" phenomenon, in which overparametrized models can generalize well despite classical overfitting concerns. The study specifically explores this effect in linear regression with contaminated training data, finding that significant overparametrization enables double descent even in this contaminated-data setting.

robustness double descent machine learning overfitting

RESEARCH · arXiv CS.LG · 18d ago

Harnesses for Inference-Time Alignment over Execution Trajectories

This research investigates harness engineering as an inference-time technique for large language model (LLM) agents, focusing on improving long-term performance via task decomposition and guided execution. It quantifies how design elements like workflow granularity and guidance impact performance, revealing common failure modes such as over-decomposition and hallucinated execution.

inference LLMs machine learning Task Decomposition

RESEARCH · arXiv CS.LG · 4/24/2026

ILDR: Geometric Early Detection of Grokking

This paper proposes the Inter/Intra-class Distance Ratio (ILDR) as a novel geometric signal for the early detection of 'grokking' in neural networks. ILDR, computed from second-to-last layer representations, indicates geometric reorganization in representation space before validation accuracy improves, outperforming existing detection methods.

neural networks grokking machine learning early-detection
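As a rough illustration of an inter/intra-class distance ratio, the sketch below divides the mean distance between class centroids by the mean distance of samples to their own centroid. The paper's exact ILDR definition is not given in the summary, so this formula and the function names (`centroid`, `distance_ratio`) are assumptions, and the inputs stand in for second-to-last layer representations.

```python
import math
from collections import defaultdict
from itertools import combinations


def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def distance_ratio(features, labels):
    """Mean inter-centroid distance divided by mean intra-class spread.

    Assumed definition for illustration only; raises ZeroDivisionError
    if every sample coincides with its class centroid.
    """
    groups = defaultdict(list)
    for x, y in zip(features, labels):
        groups[y].append(x)
    if len(groups) < 2:
        raise ValueError("need at least two classes")

    centroids = {y: centroid(pts) for y, pts in groups.items()}

    # Intra-class: average distance of each sample to its own class centroid,
    # then averaged over classes.
    intra_per_class = [
        sum(math.dist(x, centroids[y]) for x in pts) / len(pts)
        for y, pts in groups.items()
    ]
    intra = sum(intra_per_class) / len(intra_per_class)

    # Inter-class: average distance over all pairs of distinct centroids.
    pair_dists = [math.dist(a, b) for a, b in combinations(centroids.values(), 2)]
    inter = sum(pair_dists) / len(pair_dists)

    return inter / intra
```

Tracked over training steps, a rising value of such a ratio would indicate classes pulling apart in representation space relative to their internal spread.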

RESEARCH · arXiv CS.LG · 4/24/2026

Reinforcing privacy reasoning in LLMs via normative simulacra from fiction

This paper proposes a novel method to enhance privacy reasoning in LLMs by extracting normative simulacra from novels. The approach involves fine-tuning LLMs via supervised learning followed by GRPO reinforcement learning, using a composite reward function to align information handling practices with user privacy expectations.

LLMs privacy security machine learning

RESEARCH · arXiv CS.CL · 5/7/2026

FMI_SU_Yotkova_Kastreva at SemEval-2026 Task 13: Lightweight Detection of LLM-Generated Code via Stylometric Signals

This paper details participation in SemEval-2026 Task 13, focusing on lightweight detection of LLM-generated code using stylometric signals. The approach employs ratio-based features, parsing engines, and language classifiers, proving computationally efficient with near-instant inference time.

security machine learning Natural Language Processing Code Analysis

RESEARCH · arXiv CS.CL · 5/7/2026

Adapt to Thrive! Adaptive Power-Mean Policy Optimization for Improved LLM Reasoning

This research introduces Adaptive Power-Mean Policy Optimization (APMPO) to improve Large Language Model (LLM) reasoning capabilities within Reinforcement Learning with Verifiable Rewards (RLVR). APMPO combines a generalized power-mean objective and feedback-adaptive clipping to enhance learning dynamics and performance, addressing limitations of static optimization schemes.

Policy optimization LLMs reinforcement learning machine learning
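For readers unfamiliar with the generalized power mean named in the summary above, here is a minimal sketch of the standard definition, M_p(x) = ((1/n) Σ x_i^p)^(1/p), with the geometric mean as the p → 0 limit. How APMPO applies it inside its objective is not stated in the summary, so the function `power_mean` below is only the textbook quantity, not the paper's method.

```python
import math


def power_mean(values, p):
    """Generalized (power) mean of positive numbers with exponent p.

    p = 1 gives the arithmetic mean, p = -1 the harmonic mean,
    and p = 0 is treated as the limiting case, the geometric mean.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if any(v <= 0 for v in values):
        raise ValueError("values must be strictly positive")
    n = len(values)
    if p == 0:
        return math.exp(sum(math.log(v) for v in values) / n)
    return (sum(v ** p for v in values) / n) ** (1.0 / p)
```

Larger `p` weights the mean toward the largest values and smaller `p` toward the smallest, which is why a single exponent can interpolate between optimistic and pessimistic aggregation.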

RESEARCH · arXiv CS.LG · 26d ago

EMA: Efficient Model Adaptation for Learning-based Systems

This paper introduces EMA, the first model adaptation system designed to help learning-based systems adapt to evolving environments with minimal operational overhead. It addresses the challenges of costly model training and extensive data collection by reducing the amount of expensive training required.

model adaptation system optimization learning machine learning

RESEARCH · arXiv CS.LG · 29d ago

Feature Repulsion and Spectral Lock-in: An Empirical Study of Two-Layer Network Grokking

This empirical study investigates Tian's (2025) feature repulsion theorem in two-layer network grokking, testing its mechanisms and spectral signatures. It observes a clear structure-mechanism dissociation, with the predicted sign rule robustly holding for similar feature pairs despite a strong activation dependence in the spectral signature.

neural networks feature learning grokking deep learning

RESEARCH · arXiv CS.CL · 26d ago

Distribution Corrected Offline Data Distillation for Large Language Models

This research proposes an offline reasoning distillation framework for Large Language Models (LLMs) to enhance intelligence in resource-constrained environments. It tackles the distributional drift issue in existing offline methods by correcting teacher-student discrepancies while preserving data efficiency and supervision quality.

Data Distillation Offline Distillation machine learning large language models

RESEARCH · arXiv CS.LG · 26d ago

Beyond Mode-Seeking RL: Trajectory-Balance Post-Training for Diffusion Language Models

This paper introduces TraFL, a novel post-training approach for diffusion language models that addresses "trajectory locking" observed in reward-maximizing methods. TraFL, a trajectory-balance objective, outperforms other methods across mathematical reasoning and code generation benchmarks.

Diffusion Models language models reinforcement learning machine learning

RESEARCH · arXiv CS.LG · 8d ago

Generative AI and Digital Ecosystem Resilience: A Proactive Lifecycle-Based Survey

This survey addresses the proliferation of adversarial synthetic content, accelerated by Generative AI, which renders traditional reactive detection methods ineffective. It proposes a paradigm shift towards proactive detection of emerging inauthentic narratives, adopting a unified, lifecycle-based taxonomy integrating socio-technical models and advanced computational methodologies.

security machine learning digital ecosystem inauthentic narratives

RESEARCH · arXiv CS.LG · 26d ago

Rethinking Molecular OOD Generalization via Target-Aware Source Selection

This research addresses challenges in robust molecular property prediction under extreme out-of-distribution (OOD) scenarios crucial for AI-driven drug discovery. It proposes SCOPE-BENCH, a new benchmark for OOD performance evaluation, and POMA, a framework for multi-source adaptation to overcome limitations of existing methods.

Out-of-Distribution Molecular AI machine learning drug discovery

RESEARCH · arXiv CS.LG · 8d ago

Hoeffding Concept Bottleneck Models with Applications to Overhead Images

Hoeffding Concept Bottleneck Models (HCBM) are introduced to offer non-linear and sparse aggregations of concept scores, enhancing the explainability and accuracy of deep learning predictions. This method leverages Hoeffding functional decomposition of gradient-boosted trees to overcome the limitations of existing linear CBMs, which suffer from a large number of concepts and potential information leakage.

deep learning machine learning computer vision Explainable AI

RESEARCH · arXiv CS.LG · 29d ago

TTCD: Transformer Integrated Temporal Causal Discovery from Non-Stationary Time Series Data

The Transformer Integrated Temporal Causal Discovery (TTCD) Framework is a novel end-to-end approach designed to learn contemporaneous and lagged causal relations from complex non-stationary time series data. This method addresses the limitations of existing techniques by integrating temporal and frequency-domain attention, providing a unified solution for challenging real-world scenarios.

Causal Discovery machine learning non-stationary data Time Series

RESEARCH · arXiv CS.AI · 29d ago

PLACO: A Multi-Stage Framework for Cost-Effective Performance in Human-AI Teams

PLACO is a multi-stage framework designed for cost-effective performance in Human-AI teams, particularly for classification tasks. The paper addresses how to combine human and model outputs, building upon prior work that used Bayes' rule.

Classification human-AI collaboration machine learning Performance optimization
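One common way to combine a human label with model probabilities via Bayes' rule, in the spirit of the prior work the summary above refers to, is to treat the human's confusion matrix as a likelihood. The sketch below is not PLACO itself. It assumes the human and the model are conditionally independent given the true class, and the names `combine_with_human` and `confusion` are hypothetical.

```python
def combine_with_human(model_probs, human_label, confusion):
    """Posterior over classes given model probabilities and one human label.

    model_probs[y]        : model's probability that the true class is y
    confusion[y][h]       : assumed P(human says h | true class is y)
    Returns a list proportional to model_probs[y] * confusion[y][human_label],
    normalized to sum to 1.
    """
    unnormalized = [
        p * confusion[y][human_label] for y, p in enumerate(model_probs)
    ]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("human label has zero likelihood under every class")
    return [u / total for u in unnormalized]
```

For example, with `model_probs = [0.6, 0.4]`, a human label of `1`, and `confusion = [[0.9, 0.1], [0.2, 0.8]]`, the human's vote shifts most of the posterior mass to class 1, since a fairly reliable human disagreeing with a weakly confident model outweighs it.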