AI algorithms

6 items

RESEARCH · arXiv CS.AI · 1d ago

DiBS: Diffusion-Informed Branch Selection

The paper introduces DiBS, a diffusion-model-guided approach to branch selection for solving Sudoku, a constraint satisfaction problem. A diffusion model guides the branch ordering of a symbolic solver, so the search remains complete while long-tail search behavior is mitigated.

branch selection · Diffusion Models · constraint satisfaction · Sudoku

RESEARCH · DEV.to AI · 5/1/2026

Deep Dyna-Q: Integrating Planning for Task-Completion Dialogue Policy Learning

Deep Dyna-Q integrates planning into dialogue policy learning for conversational AI systems. The focus is on optimizing task completion through spoken interaction with AI.

reinforcement learning · Natural Language Processing · AI algorithms · dialogue systems

RESEARCH · arXiv CS.CL · 5/7/2026

Free Energy-Driven Reinforcement Learning with Adaptive Advantage Shaping for Unsupervised Reasoning in LLMs

FREIA is a reinforcement learning algorithm for unsupervised reasoning in LLMs that addresses the lack of adaptability in existing methods. It uses a Free Energy-Driven Reward (FER) to balance consensus and exploration, and Adaptive Advantage Shaping (AAS) to adjust learning signals. FREIA outperforms unsupervised baselines across various reasoning tasks, particularly mathematical reasoning.

LLMs · reinforcement learning · AI algorithms · Reasoning

RESEARCH · arXiv CS.CL · 22d ago

Always Learning, Always Mixing: Efficient and Simple Data Mixing All The Time

This research introduces OP-Mix, an algorithm for efficient data mixing throughout the entire lifecycle of language model training. It addresses the challenge of combining diverse data sources for pretraining, continual learning, and adaptation, and proposes a unified online decision-making solution.

language models · learning · data mixing · machine learning

RESEARCH · arXiv CS.AI · 8d ago

Structure-Induced Information for Rerooting Levin Tree Search

This paper introduces novel rerooter designs for the $\sqrt{\text{LTS}}$ algorithm, addressing the scalability limitations of explicit subgoal generation in subgoal-based policy tree search. These designs implicitly decompose problems, enabling scalable allocation of search effort.

policy search · Optimization · tree search · machine learning

RESEARCH · DEV.to AI · 4/12/2026

LightLDA: Big Topic Models on Modest Compute Clusters

LightLDA is an algorithm for efficiently building large topic models, even on modest compute clusters. It targets scalability and the processing of large data volumes, making large-scale topic modeling more accessible.

Scalability · Topic Modeling · distributed computing · NLP