Optimization

134 items

RESEARCH · arXiv CS.AI · 8d ago

Structure-Induced Information for Rerooting Levin Tree Search

This paper introduces novel rerooter designs for the $\sqrt{\text{LTS}}$ algorithm, addressing the scalability limitations of explicit subgoal generation in subgoal-based policy tree search. These designs implicitly decompose problems, enabling scalable allocation of search effort.

policy search Optimization tree search machine learning

RESEARCH · arXiv CS.CL · 12d ago

EvoSpec: Evolving Speculative Decoding via Real-Time Vocabulary and Parameter Adaptation

EvoSpec introduces a framework for real-time evolution of draft models in speculative decoding for Large Language Models, addressing the bottleneck of large vocabulary sizes. It uses dynamic vocabulary and parameter adaptation, employing a context-aware mechanism and a lightweight online alignment strategy to improve acceptance rates and minimize distributional gaps.

Optimization machine learning large language models AI inference

RESEARCH · arXiv CS.CL · 13d ago

In-Context Optimization for Retrieval-Augmented Generation: A Gradient-Descent Perspective

This research paper explores Retrieval-Augmented Generation (RAG) through the lens of in-context optimization. It demonstrates that a single linear self-attention layer can execute a gradient-descent step on a unified linearized RAG objective, revealing an exact regime where retrieval-augmented prediction and in-context optimization align.

Optimization RAG machine learning NLP
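
The correspondence mentioned in the summary above can be illustrated on plain least-squares regression. This is a minimal sketch of the standard identity, not the paper's linearized RAG objective: one gradient-descent step from zero weights on L(w) = 1/2 Σ (w·x_i − y_i)² yields the same prediction as an unnormalized linear attention readout with keys x_i, values y_i, and the query as the test input. All function names and the toy data are hypothetical.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(p * q for p, q in zip(a, b))


def gd_step_prediction(xs, ys, x_query, lr):
    """Predict with the weights obtained after ONE gradient step from w = 0.

    For L(w) = 1/2 * sum_i (w.x_i - y_i)^2 the gradient at w = 0 is
    -sum_i y_i * x_i, so the updated weights are w1 = lr * sum_i y_i * x_i.
    """
    dim = len(x_query)
    grad = [-sum(y * x[d] for x, y in zip(xs, ys)) for d in range(dim)]
    w1 = [-lr * g for g in grad]
    return dot(w1, x_query)


def linear_attention_prediction(xs, ys, x_query, lr):
    """Linear (softmax-free) attention: keys x_i, values y_i, query x_query.

    The output is lr * sum_i y_i * (x_i . x_query); lr plays the role of a
    fixed output scaling (an assumption of this sketch).
    """
    return lr * sum(y * dot(x, x_query) for x, y in zip(xs, ys))


# Toy in-context examples (random, for illustration only).
random.seed(0)
xs = [[random.gauss(0, 1) for _ in range(3)] for _ in range(5)]
ys = [random.gauss(0, 1) for _ in range(5)]
x_query = [random.gauss(0, 1) for _ in range(3)]

gd_pred = gd_step_prediction(xs, ys, x_query, lr=0.1)
attn_pred = linear_attention_prediction(xs, ys, x_query, lr=0.1)
print(math.isclose(gd_pred, attn_pred, rel_tol=1e-9, abs_tol=1e-12))
```

The two functions compute the same sum in a different order, which is why the predictions agree up to floating-point rounding.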

RESEARCH · DEV.to AI · 4/14/2026

Graph Partitioning using Quantum Annealing on the D-Wave System

This content explores the application of quantum annealing, specifically on the D-Wave system, to solve graph partitioning problems. It delves into leveraging quantum computation for complex combinatorial optimization challenges.

Quantum Computing Optimization Graph Partitioning Quantum Annealing

ARTICLE · DEV.to AI · 29d ago

Training an LLM in Swift: Understanding Faster Matrix Multiplication

This article explores optimizing matrix multiplication, a fundamental operation in AI tasks, to accelerate LLM training using Swift. The goal is to boost calculations from gigaflops to teraflops, making language understanding and other AI tasks significantly faster and more efficient.

Optimization Matrix Multiplication Swift AI

DOC · DEV.to AI · 4/24/2026

Derivatives: Understanding Change

This content explains how derivatives are crucial in AI for optimizing model performance, by measuring the impact of weight adjustments on prediction loss. It describes how to guide the model to learn by nudging its weights in the direction that reduces loss.

neural networks Gradient Descent Optimization machine learning
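
The weight-nudging idea in the summary above can be sketched in a few lines. This is a minimal illustration under assumed values, not code from the linked article: the one-weight model, the data point (x = 2, y = 10), the learning rate, and every function name are hypothetical.

```python
def loss(w, x=2.0, y=10.0):
    """Squared error of a one-weight model whose prediction is w * x."""
    return (w * x - y) ** 2


def numerical_derivative(f, w, eps=1e-6):
    """Central-difference estimate of df/dw: how the loss changes with w."""
    return (f(w + eps) - f(w - eps)) / (2 * eps)


def gradient_descent(w=0.0, lr=0.05, steps=50):
    """Repeatedly nudge w against the derivative so the loss decreases."""
    for _ in range(steps):
        w -= lr * numerical_derivative(loss, w)
    return w


w_final = gradient_descent()
print(w_final, loss(w_final))
```

With these assumed values the loss is minimized at w = 5 (since 5 * 2 = 10), and the loop moves w toward that point from its starting value of 0.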

ARTICLE · DEV.to AI · 17d ago

MCPs Are Eating Your Context Window (And What To Do About It)

This article explores how Model Context Protocol (MCP) servers consume an AI model's context window by front-loading tool schemas, leading to high token usage. It suggests that "skills" can solve this problem by lazily loading tools, thereby optimizing cost and efficiency.

Optimization API Token usage AI agents

ARTICLE · DEV.to AI · 23d ago

We tried routing between 4 different LLMs automatically – here's what we learned

An experiment explored routing AI queries to different LLMs (DeepSeek-V4 Pro, Kimi 2.6, MiniMax 2.7, Qwen3 235B) based on task. It found no single best model, with simple YAML rules proving effective, while complex routing and cost prediction failed.

AI models Optimization LLMs routing

ARTICLE · DEV.to AI · 4/27/2026

Context Compression in .NET

This quick tip explains how to implement context compression in .NET for RAG systems, addressing the lack of a direct equivalent to tools like LLMLingua. It proposes using a smaller, cheaper worker model to pre-process retrieved documentation, extracting only essential facts to reduce cost and latency with premium AI models.

Optimization prompt-engineering RAG AI

ARTICLE · DEV.to AI · 4/24/2026

"AI-powered inventory management for small retail businesses: How to reduce stoc

This article explores how AI-powered inventory management can revolutionize small retail businesses. It details the benefits of accurately forecasting demand using multiple factors to avoid stockouts and overstocking.

AI applications Optimization business efficiency retail

RESEARCH · DEV.to AI · 4/21/2026

Multi-Objective Deep Reinforcement Learning

This content explores the field of Multi-Objective Deep Reinforcement Learning. It likely delves into techniques for training AI agents to optimize multiple performance criteria concurrently.

Optimization deep learning reinforcement learning

ARTICLE · DEV.to AI · 4/25/2026

"AI-Powered HVAC Contractor Lead Scoring & Dispatch Optimization Suite with Low-

This report details how AI-powered lead scoring and dispatch optimization can boost efficiency and conversion rates for HVAC contractors. It outlines a low-barrier implementation plan, backed by industry data and trends.

lead management HVAC Optimization AI

RESEARCH · arXiv CS.AI · 4/6/2026

Interpretable Deep Reinforcement Learning for Element-level Bridge Life-cycle Optimization

This paper applies interpretable Deep Reinforcement Learning to element-level bridge life-cycle optimization. It aims to bring transparency and efficiency to infrastructure management.

Deep Reinforcement Learning Optimization interpretable AI Civil Engineering

RESEARCH · arXiv CS.LG · 4/6/2026

Characterizing WebGPU Dispatch Overhead for LLM Inference Across Four GPU Vendors, Three Backends, and Three Browsers

This study characterizes WebGPU dispatch overhead for LLM inference across a range of GPU platforms, backends, and browsers. It shows that simple benchmarks overestimate the costs and identifies the true per-dispatch cost of the WebGPU API, highlighting why this distinction matters for effective optimization.

neural networks Optimization browsers Overhead

RESEARCH · arXiv CS.AI · 4/30/2026

Hierarchical Multi-Persona Induction from User Behavioral Logs: Learning Evidence-Grounded and Truthful Personas

This paper proposes a hierarchical framework to induce multiple evidence-grounded user personas from behavioral logs by clustering intent memories and optimizing persona quality. The method utilizes a groupwise extension of Direct Preference Optimization (DPO) and demonstrates more coherent, truthful personas, also improving future interaction prediction.

Optimization LLMs machine learning persona generation

RESEARCH · arXiv CS.AI · 5/6/2026

Accelerating battery research with an AI interface between FINALES and Kadi4Mat

This study optimizes sodium-ion coin cell formation protocols for duration efficiency and end-of-life performance, utilizing an AI interface between FINALES and Kadi4Mat. The framework employs multi-objective batched Bayesian optimization to guide experiment selection, aiming to accelerate discovery and reduce resource consumption.

Materials Science Optimization machine learning AI

ARTICLE · Together AI Blog · 4/24/2026

Accelerate RL rollouts by up to 50% with distribution-aware speculative decoding

DAS (distribution-aware speculative decoding) addresses the rollout bottleneck in RL post-training. It accelerates rollouts by up to 50% without compromising reward quality.

Optimization AI acceleration reinforcement learning machine learning

ARTICLE · Together AI Blog · 8d ago

Serving MiniMax-M3 for efficient inference: Unlocking 1M-Token Context and Multimodality Without Regrets

Together achieved efficient inference for MiniMax-M3, unlocking 1M-token context and multimodality. This was accomplished through KV-block-major sparse attention, paged MSA decode, optimized index scoring, and a Rust-based multimodal gateway.

System Design Optimization Multimodality large language models

RESEARCH · arXiv CS.AI · 4/14/2026

Linear Programming for Multi-Criteria Assessment with Cardinal and Ordinal Data: A Pessimistic Virtual Gap Analysis

This paper introduces novel linear programming-based Virtual Gap Analysis (VGA) models for multi-criteria assessment, addressing issues of subjective evaluations and data diversity. The two-step method assesses alternatives pessimistically using cardinal and ordinal data, enabling efficient ranking and elimination of unfavorable options within decision support systems.

Optimization Decision Making Linear Programming Multi-Criteria Analysis

RESEARCH · arXiv CS.AI · 4/22/2026

On Solving the Multiple Variable Gapped Longest Common Subsequence Problem

This paper tackles the Variable Gapped Longest Common Subsequence (VGLCS) problem, a generalization of LCS with flexible gap constraints, relevant to molecular sequence comparison and time-series analysis. It proposes a root-based state graph search framework combined with an iterative beam search strategy to manage combinatorial explosion and find high-quality solutions.

search algorithms Optimization Algorithms Time Series Analysis