
machine learning

790 items

RESEARCH · arXiv CS.AI · 4/15/2026

WiseOWL: A Methodology for Evaluating Ontological Descriptiveness and Semantic Correctness for Ontology Reuse and Ontology Recommendations

WiseOWL proposes a systematic methodology, with scoring and guidance, for selecting ontologies for reuse, addressing the problem of inconsistent selection criteria. It evaluates four metrics: documentation, label-definition alignment (using state-of-the-art embeddings), interconnectedness, and hierarchical balance. It reports normalized scores and actionable feedback.

27
RESEARCH · arXiv CS.LG · 4/28/2026

Conformal PM2.5 Mapping Under Spatial Covariate Shift: Satellite-Reanalysis Fusion for Africa's Green Industrial Transition

This paper introduces a satellite-reanalysis PM2.5 fusion system for air quality monitoring in Africa, employing LightGBM and conformal prediction. The system addresses challenges in geographic generalization and uncertainty quantification crucial for the continent's green industrial transition.
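
As a rough illustration of the conformal prediction idea mentioned in this summary, the sketch below shows plain split conformal regression: absolute residuals from a held-out calibration set give a quantile that is added to and subtracted from a point prediction. This is the textbook method, not necessarily the variant the paper uses to handle spatial covariate shift. The function names and any numbers passed in are hypothetical, and the point prediction stands in for the output of a model such as LightGBM.

```python
import math
from typing import Sequence, Tuple


def conformal_quantile(abs_residuals: Sequence[float], alpha: float) -> float:
    """Return the split-conformal quantile of calibration residuals.

    abs_residuals: |y_true - y_pred| on a held-out calibration set.
    alpha: target miscoverage rate (0.1 aims for roughly 90% coverage).
    """
    n = len(abs_residuals)
    # Finite-sample correction: use rank ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: interval is unbounded.
        return math.inf
    return sorted(abs_residuals)[rank - 1]


def predict_interval(point_prediction: float, q: float) -> Tuple[float, float]:
    """Symmetric prediction interval around a model's point prediction."""
    return (point_prediction - q, point_prediction + q)
```

The coverage guarantee of this basic form assumes calibration and test points are exchangeable, which is exactly the assumption that geographic shift can break.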

27
RESEARCH · arXiv CS.LG · 4/13/2026

Structured Exploration and Exploitation of Label Functions for Automated Data Annotation

This paper introduces EXPONA, an automated framework for programmatic labeling that addresses the challenges of costly and error-prone manual data annotation. EXPONA systematically explores multi-level label functions and applies reliability-aware mechanisms to generate high-quality weak labels for training AI models.

27
RESEARCH · arXiv CS.AI · 4/25/2026

Adaptive Test-Time Compute Allocation with Evolving In-Context Demonstrations

This work introduces a framework for adaptive test-time compute allocation that jointly adjusts where computation is spent and how generation is performed. The method uses a warm-up phase to identify easy queries, then concentrates further computation on unresolved queries, reshaping generation distributions with evolving in-context demonstrations.

27
RESEARCH · arXiv CS.LG · 5/5/2026

PhaseNet++: Phase-Aware Frequency-Domain Anomaly Detection for Industrial Control Systems via Phase Coherence Graphs

PhaseNet++ introduces a frequency-domain autoencoder for anomaly detection in Industrial Control Systems (ICS), addressing the often-overlooked phase spectrum in multivariate time series analysis. It uses a Phase Coherence Index to guide a graph attention network, improving detection of cyber-physical attacks.

27
RESEARCH · arXiv CS.CL · 4/10/2026

TR-EduVSum: A Turkish-Focused Dataset and Consensus Framework for Educational Video Summarization

This study presents the TR-EduVSum dataset, focused on Turkish educational videos, and proposes the AutoMUP method. The method generates gold-standard summaries automatically and reproducibly from multiple human summaries, using meaning-unit clustering and statistical consensus modeling.

27
RESEARCH · arXiv CS.LG · 4/28/2026

BiTA: Bidirectional Gated Recurrent Unit-Transformer Aggregator in a Temporal Graph Network Framework for Alert Prediction in Computer Networks

This research introduces BiTA, a Bidirectional Gated Recurrent Unit-Transformer Aggregator designed to improve proactive alert prediction in computer networks. It redesigns temporal aggregation in Temporal Graph Neural Networks to capture complex, multi-scale temporal patterns by jointly encoding bidirectional sequential dependencies and long-range contextual relations.

27
RESEARCH · arXiv CS.LG · 5/5/2026

From Euler to Dormand-Prince: ODE Solvers for Flow Matching Generative Models

This paper systematically benchmarks four classical ODE solvers (Euler, Explicit Midpoint, RK4, Dormand-Prince 5(4)) for Flow Matching generative models, implementing them from scratch in PyTorch. It quantitatively compares their efficiency on tasks ranging from 2D distributions to MNIST, showing that RK4 at 80 function evaluations achieves sample quality comparable to Euler at 200, and observes that the Jacobian eigenvalue spectrum stiffens near t=1.
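
To make the solver comparison concrete, here is a minimal sketch of fixed-step Euler and RK4 integration of a velocity field from t=0 to t=1, which is the sampling step in Flow Matching. It is not the paper's PyTorch implementation: it uses plain Python on a scalar state, and the function names and the toy field v(x, t) = x (whose exact solution at t=1 is x0 * e) are assumptions chosen so the error can be checked by hand.

```python
import math
from typing import Callable

# A velocity field maps (state x, time t) to dx/dt.
Velocity = Callable[[float, float], float]


def euler_solve(v: Velocity, x0: float, n_steps: int) -> float:
    """Integrate dx/dt = v(x, t) on [0, 1] with Euler: 1 evaluation per step."""
    x, t = x0, 0.0
    h = 1.0 / n_steps
    for _ in range(n_steps):
        x += h * v(x, t)
        t += h
    return x


def rk4_solve(v: Velocity, x0: float, n_steps: int) -> float:
    """Integrate dx/dt = v(x, t) on [0, 1] with classical RK4: 4 evaluations per step."""
    x, t = x0, 0.0
    h = 1.0 / n_steps
    for _ in range(n_steps):
        k1 = v(x, t)
        k2 = v(x + 0.5 * h * k1, t + 0.5 * h)
        k3 = v(x + 0.5 * h * k2, t + 0.5 * h)
        k4 = v(x + h * k3, t + h)
        x += (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        t += h
    return x


def toy_field(x: float, t: float) -> float:
    """Hypothetical stand-in for a learned velocity network."""
    return x


# Same budget of 80 function evaluations for both solvers.
euler_error = abs(euler_solve(toy_field, 1.0, 80) - math.e)
rk4_error = abs(rk4_solve(toy_field, 1.0, 20) - math.e)
```

On this toy problem RK4 is far more accurate than Euler at an equal evaluation budget. How that carries over to a learned velocity field depends on the model, which is what the benchmark measures.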

27
RESEARCH · arXiv CS.AI · 5/1/2026

Unsupervised Electrofacies Classification and Porosity Characterization in the Offshore Keta Basin Using Wireline Logs

This study applies an unsupervised machine learning workflow, specifically K-means clustering, for electrofacies analysis and porosity characterization in offshore basin wireline log data. The methodology identified four distinct electrofacies with moderate separation, providing a robust log-only approach for geological interpretation where core data is scarce.
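
For readers unfamiliar with the clustering step, the sketch below shows standard K-means (Lloyd's algorithm) in plain Python. It is an illustration only, not the study's workflow: the function names are hypothetical, each point stands in for a vector of log readings at one depth, and the caller supplies the starting centroids so the run is deterministic.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(
    points: Sequence[Point],
    initial_centroids: Sequence[Point],
    n_iters: int = 20,
) -> Tuple[List[List[float]], List[int]]:
    """Run Lloyd's algorithm for a fixed number of iterations.

    Returns the final centroids and the cluster index of each point.
    """
    centroids = [list(c) for c in initial_centroids]
    labels = [0] * len(points)
    for _ in range(n_iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # leave an empty cluster's centroid where it is
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels
```

In practice, log curves with different units would usually be standardized before clustering so that no single measurement dominates the distance.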

27
RESEARCH · arXiv CS.LG · 4/27/2026

Performance Anomaly Detection in Athletics: A Benchmarking System with Visual Analytics

This research presents a system for detecting suspicious performance patterns in athletics, using 1.6 million performances and eight methods including machine learning and trajectory analysis. It aims to complement traditional anti-doping programs by identifying potential violations through data analysis, with trajectory-based methods proving most effective.

27
RESEARCH · arXiv CS.LG · 5/1/2026

Simple Self-Conditioning Adaptation for Masked Diffusion Models

Masked diffusion models (MDMs) discard clean-state predictions for tokens that remain masked, limiting cross-step refinement. This paper proposes Self-Conditioned Masked Diffusion Models (SCMDM), a post-training adaptation that conditions each denoising step on the model's own previous clean-state predictions. This enhances performance without significant architectural changes or extra evaluations.

27