Optimization

134 items

RESEARCH · arXiv CS.LG · 4/27/2026

Focus Session: Hardware and Software Techniques for Accelerating Multimodal Foundation Models

This research presents a multi-layered methodology to accelerate multimodal foundation models (MFMs) through hardware and software co-design. It employs optimization techniques like hierarchy-aware mixed-precision quantization, structural pruning, speculative decoding, and model cascading to reduce computational and memory requirements.

Optimization multimodal AI AI acceleration Foundation Models

RESEARCH · arXiv CS.LG · 4/17/2026

TOPCELL: Topology Optimization of Standard Cell via LLMs

TOPCELL is a novel framework that uses Large Language Models (LLMs) to optimize transistor topology in standard cell design, overcoming the limitations of traditional exhaustive search methods. By reformulating topology exploration as a generative task and fine-tuning with GRPO, it significantly improves the discovery of routable, physically aware layouts for advanced technology nodes.

Optimization LLMs chip design Generative AI

ARTICLE · DEV.to AI · 29d ago

When I started running models locally, I thought quantization meant squeezing more into RAM. Turns o

The article advises against defaulting to Q4_K_M for local LLM inference, emphasizing that optimal performance comes from testing quantization levels tailored to specific workflows. It suggests that aggressive quantization like Q3_K_S can significantly cut latency with imperceptible quality loss for many tasks, though context length presents a trade-off.

Optimization LLMs quantization hardware

RESEARCH · arXiv CS.LG · 4/20/2026

Mapping High-Performance Regions in Battery Scheduling across Data Uncertainty, Battery Design, and Planning Horizons

This study presents a triadic analysis of battery scheduling under multi-stage model predictive control, investigating the interplay between data characteristics, forecast uncertainty, planning horizon, and battery C-rate. It identifies an "effective horizon" for the optimal look-ahead length, which reduces computational cost and provides practical guidance for industrial storage operations.

Optimization Battery Technology energy management Predictive Analytics

RESEARCH · arXiv CS.AI · 4/20/2026

Bilevel Optimization of Agent Skills via Monte Carlo Tree Search

This research introduces a bilevel optimization framework for systematically enhancing "agent skills" in large language model (LLM) agents. It uses an outer loop of Monte Carlo Tree Search to jointly optimize the structure and content of these skills, addressing a complex decision space for improved task performance.

Optimization Monte Carlo Tree Search large language models AI agents

RESEARCH · arXiv CS.LG · 5/4/2026

Information-Theoretic Generalization Bounds for Stochastic Gradient Descent with Predictable Virtual Noise

This paper introduces predictable history-adaptive virtual perturbations to enhance information-theoretic generalization bounds for Stochastic Gradient Descent. This new approach allows perturbation covariances to dynamically depend on past SGD history, addressing limitations of existing methods that require fixed covariances.

information theory Optimization Generalization machine learning

RESEARCH · arXiv CS.LG · 4/21/2026

Matched-Learning-Rate Analysis of Attention Drift and Transfer Retention in Fine-Tuned CLIP

This paper investigates how adaptation methods (Full FT vs. LoRA) and optimization scale jointly shape attention drift and transfer retention in fine-tuned CLIP models. A controlled matched-learning-rate comparison reveals that the learning rate strongly modulates structural change, with Full FT showing marked contraction at higher rates while LoRA remains entropy-positive.

CLIP Optimization attention Fine-tuning

RESEARCH · arXiv CS.AI · 25d ago

Mixed Integer Goal Programming for Personalized Meal Optimization with User-Defined Serving Granularity

This paper proposes Mixed Integer Goal Programming (MIGP) for personalized meal optimization. The methodology addresses limitations of existing formulations by using integer variables for practical serving counts and goal programming deviations for soft nutrient targets, allowing user-defined serving granularity.

nutrition Optimization meal planning operations research
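
The formulation summarized above can be illustrated with a toy instance. The sketch below is not the paper's model: the foods, nutrient values, targets, weights, and the brute-force search (standing in for a MILP solver) are all hypothetical. It only shows the two ingredients the summary names, namely integer serving counts scaled by a user-chosen granularity and deviation terms that turn nutrient targets into soft goals.

```python
from itertools import product

# Hypothetical data: nutrients per full serving.
FOODS = {
    "rice":    {"kcal": 200.0, "protein": 4.0},
    "chicken": {"kcal": 160.0, "protein": 30.0},
    "beans":   {"kcal": 120.0, "protein": 8.0},
}
# Hypothetical soft targets and weights (weights normalize each deviation).
TARGETS = {"kcal": 700.0, "protein": 50.0}
WEIGHTS = {"kcal": 1.0 / 700.0, "protein": 1.0 / 50.0}


def solve_meal(granularity, max_servings=3.0):
    """Minimize weighted under/over deviations from the nutrient targets.

    Each food gets an integer step count n; the amount served is
    n * granularity (e.g. granularity=0.5 allows half servings).
    Exhaustive enumeration is used only because the instance is tiny.
    """
    max_steps = int(round(max_servings / granularity))
    names = list(FOODS)
    best = None
    for counts in product(range(max_steps + 1), repeat=len(names)):
        servings = {f: n * granularity for f, n in zip(names, counts)}
        cost = 0.0
        for nutrient, target in TARGETS.items():
            total = sum(FOODS[f][nutrient] * s for f, s in servings.items())
            under = max(0.0, target - total)  # shortfall deviation
            over = max(0.0, total - target)   # excess deviation
            cost += WEIGHTS[nutrient] * (under + over)
        if best is None or cost < best[0]:
            best = (cost, servings)
    return best
```

Calling `solve_meal(1.0)` and `solve_meal(0.5)` compares whole-serving and half-serving granularity. Because every whole-serving plan is also a half-serving plan, the finer grid can only match or lower the weighted deviation, at the price of a larger search space.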

RESEARCH · arXiv CS.LG · 22d ago

Mirror Descent-Type Algorithms for the Variational Inequality Problem with Functional Constraints

This paper proposes mirror descent-type algorithms for variational inequality problems with functional constraints. The algorithms are analyzed and shown to attain an optimal convergence rate for problems with bounded, monotone operators and Lipschitz convex functional constraints.

Optimization machine learning variational inequalities mathematics

RESEARCH · arXiv CS.AI · 8d ago

Agents on a Tree: Pathwise Coordination for Multi-Objective Molecular Optimization

The paper introduces ATOM, a multi-agent framework for multi-objective molecular optimization employing a tree-structured search. Agents coordinate along different paths of the tree to maintain and compare alternative molecular evolution trajectories, supported by a global memory.

Optimization Molecular Optimization machine learning AI

RESEARCH · arXiv CS.LG · 5/7/2026

Lookahead Drifting Model

This paper proposes a "lookahead drifting model" for distribution mapping, which enhances image generation performance via one-step neural functional evaluation. The model computes a set of drifting terms sequentially at each training iteration, utilizing positive samples and model outputs to capture higher-order gradient information.

neural networks Optimization deep learning machine learning

RESEARCH · arXiv CS.AI · 8d ago

Position Paper: Post-Solve Robustness in Decision Engines: Feasible Regions and Smoothness Under Perturbations

This paper introduces a missing layer in optimization pipelines to address the post-solve robustness gap in Mixed-Integer Linear Programming (MILP) decision engines. It formalizes an epsilon-near-optimal feasible neighborhood and solution smoothness to assess how far a solved incumbent can be trusted under parameter perturbations.

robustness Optimization Perturbations Decision Engines

RESEARCH · arXiv CS.LG · 8d ago

Foundation-Preserving Adaptation via Generalized Rayleigh-Quotient Optimization

This paper introduces Foundation Preserving LoRA (FoLoRA), an optimization framework that addresses the degradation of non-target capabilities during fine-tuning of foundation models. It uses a generalized Rayleigh quotient to balance task utility against a forgetting penalty, guiding updates so that pretraining knowledge is preserved.

Finetuning neural networks Optimization machine learning
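
For readers unfamiliar with the term, a generalized Rayleigh quotient has the standard form below. How FoLoRA builds the two quadratic forms is not stated in the summary, so the symbols here are assumptions: $A$ is a hypothetical symmetric matrix scoring task utility, $B$ a hypothetical symmetric positive definite matrix scoring the forgetting penalty, and $w$ an update direction.

```latex
% Ratio of utility gained to forgetting incurred along direction w.
\[
  R(w) = \frac{w^{\top} A\, w}{w^{\top} B\, w}, \qquad B \succ 0 .
\]
% The gradient is 2 (A w - R(w) B w) / (w^T B w), so stationary points
% solve a generalized eigenproblem:
\[
  \nabla R(w) = 0 \;\Longleftrightarrow\; A w = R(w)\, B w .
\]
% Hence the best achievable trade-off is the largest generalized eigenvalue:
\[
  \max_{w \neq 0} R(w) = \lambda_{\max}\!\left(B^{-1} A\right).
\]
```

In this generic form, maximizing the quotient favors directions that gain utility per unit of penalty, which is one way a single objective can trade off the two terms.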

RESEARCH · arXiv CS.LG · 18d ago

DualOptim+: Bridging Shared and Decoupled Optimizer States for Better Machine Unlearning in Large Language Models

DualOptim+ is a novel optimization framework designed to improve machine unlearning in large language models by bridging shared and decoupled optimizer states. It uses base states for common representations and delta states for objective-specific residuals, also offering a quantized 8-bit variant to reduce memory overhead without compromising performance.

Optimization learning machine unlearning large language models

RESEARCH · arXiv CS.LG · 27d ago

Plan Before You Trade: Inference-Time Optimization for RL Trading Agents

This paper introduces FPILOT, a plug-in inference-time optimization framework for reinforcement learning trading agents. Before a trade is executed, it uses predicted price trajectories to optimize the policy at inference time, and it is compatible with any pre-trained agent.

Optimization financial trading reinforcement learning AI in finance

RESEARCH · arXiv CS.LG · 28d ago

Vertex-Softmax: Tight Transformer Verification via Exact Softmax Optimization

The paper introduces Vertex-Softmax, a novel method for certified verification of transformer attention by exactly optimizing the softmax function. It proves that the exact optimum is attained at a vertex of the constraint box, yielding a tighter sound bound.

Optimization machine learning Verification AI
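
The vertex property mentioned above is easy to see in the simplest case, a single softmax output over a box of logits. The sketch below covers only that case and is not the paper's algorithm, whose objective may be more general; the function names are hypothetical. It relies on a standard fact: softmax output i increases with logit i and decreases with every other logit, so its extremes over a box sit at corners.

```python
import itertools
import math


def softmax(z):
    """Numerically stable softmax of a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def extreme_vertex(lower, upper, i):
    """Corner of the box that maximizes softmax(z)[i].

    Logit i is pushed to its upper bound and all others to their
    lower bounds, following the monotonicity of softmax.
    """
    return [upper[k] if k == i else lower[k] for k in range(len(lower))]


def vertex_bounds(lower, upper, i):
    """Min and max of softmax(z)[i] over all 2^n corners of the box.

    Exhaustive enumeration is exponential in n; it is used here only
    to check the closed-form corner on small examples.
    """
    corners = itertools.product(*zip(lower, upper))
    values = [softmax(list(c))[i] for c in corners]
    return min(values), max(values)
```

For a box such as `lower = [-1.0, 0.0, -0.5]` and `upper = [0.5, 2.0, 1.0]`, the maximum returned by `vertex_bounds` coincides with the value at `extreme_vertex`, and points sampled inside the box stay within the two bounds.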

RESEARCH · arXiv CS.LG · 28d ago

QuIDE: Mastering the Quantized Intelligence Trade-off via Active Optimization

QuIDE introduces a unified metric, the Intelligence Index I, to evaluate the efficiency of quantized neural networks by collapsing the compression-accuracy-latency trade-off. Experiments across various settings identify task-dependent optimal quantization (4-bit or 8-bit), providing a reproducible evaluation protocol and a fitness function for mixed-precision search.

neural networks Optimization machine learning AI Efficiency

RESEARCH · arXiv CS.AI · 21d ago

Embedding by Elicitation: Dynamic Representations for Bayesian Optimization of System Prompts

This paper introduces ReElicit, a Bayesian optimization framework based on "embedding by elicitation" for tuning system prompts in AI. It leverages LLMs to elicit an interpretable feature space and a Gaussian process surrogate to select and refine prompts based on aggregate feedback.

Bayesian Optimization Optimization System prompts machine learning

RESEARCH · arXiv CS.LG · 12d ago

Balancing Multimodal Learning through Label Space Reshaping

The paper addresses modality imbalance in multimodal learning, where some modalities dominate optimization while others remain undertrained. It attributes this discrepancy to differing mapping difficulties between each modality-specific feature space and the shared label space, and introduces BMLR to equalize that difficulty.

multimodal learning Optimization learning machine learning

RESEARCH · arXiv CS.LG · 16d ago

FuRA: Full-Rank Parameter-Efficient Fine-Tuning with Spectral Preconditioning

This research introduces FuRA (Full-Rank Adaptation), a novel parameter-efficient fine-tuning method that addresses limitations in existing techniques by incorporating spectral preconditioning. By reparameterizing weight matrices via full-rank Singular Value Decomposition and constraining updates, FuRA outperforms unconstrained Full Fine-Tuning while maintaining efficiency.

Optimization deep learning machine learning spectral preconditioning