machine learning

790 items

RESEARCH · arXiv CS.CL · 4/20/2026

DALM: A Domain-Algebraic Language Model via Three-Phase Structured Generation

DALM (Domain-Algebraic Language Model) is proposed to address knowledge interference in LLMs by replacing unconstrained generation with structured denoising over a domain lattice. It uses a three-phase generation path (domain, relation, concept uncertainty) under algebraic constraints, and requires a domain lattice, relation typing, and a fiber partition to prevent cross-domain contamination.

27
RESEARCH · arXiv CS.AI · 4/16/2026

ReSS: Learning Reasoning Models for Tabular Data Prediction via Symbolic Scaffold

ReSS is a framework that bridges symbolic and neural reasoning models for tabular data prediction, aiming for both high accuracy and understandable reasoning. It leverages decision trees to extract symbolic scaffolds that guide an LLM to generate natural-language reasoning, which is then used to fine-tune specialized tabular reasoning LLMs.

27
RESEARCH · arXiv CS.LG · 4/17/2026

Portfolio Optimization Proxies under Label Scarcity and Regime Shifts via Bayesian and Deterministic Students under Semi-Supervised Sandwich Training

This paper proposes a machine-learning-assisted portfolio optimization framework designed for low-data environments and regime uncertainty. It uses a teacher-student pipeline in which a Conditional Value at Risk (CVaR) optimizer generates labels, and neural models are trained on both real and synthetically augmented data to overcome observation scarcity.

27
ARTICLE · DEV.to AI · 15d ago

Self-Supervised Temporal Pattern Mining for precision oncology clinical workflows across multilingual stakeholder groups

In early 2024, the author discovered a significant asymmetry in how clinical data flows across oncology workflows, characterized by temporal and linguistic mismatches. This insight led to a deep dive into self-supervised temporal pattern mining for precision oncology, with a focus on understanding how clinical workflows actually function.

27
ARTICLE · DEV.to AI · 21d ago

Inside Hoovik: Building a Real-Time Multimodal Emotion AI Pipeline

The article details the engineering challenges of building a real-time, multimodal emotion inference engine for live video meetings, which proved harder than the anticipated WebRTC issues. It explains how Hoovik's emotion recognition backend was designed with FastAPI, PyTorch, and MediaPipe to operate reliably in unstable live environments.

27
RESEARCH · arXiv CS.CL · 4/13/2026

Uncertainty Estimation for the Open-Set Text Classification systems

This paper focuses on accurate uncertainty estimation for open-set text classification (OSTC) systems, in which each text sample is either assigned to a known class or rejected as unknown. It adapts the Holistic Uncertainty Estimation (HolUE) method to the text domain to capture text and gallery uncertainties, and proposes a new OSTC benchmark.

27