Optimization

RESEARCH · arXiv CS.LG · 4/27/2026

Focus Session: Hardware and Software Techniques for Accelerating Multimodal Foundation Models

This research presents a multi-layered methodology to accelerate multimodal foundation models (MFMs) through hardware and software co-design. It employs optimization techniques like hierarchy-aware mixed-precision quantization, structural pruning, speculative decoding, and model cascading to reduce computational and memory requirements.

RESEARCH · arXiv CS.LG · 4/17/2026

TOPCELL: Topology Optimization of Standard Cell via LLMs

TOPCELL is a framework that uses Large Language Models (LLMs) to optimize transistor topology in standard cell design, overcoming the limitations of traditional exhaustive search methods. By reformulating topology exploration as a generative task and fine-tuning with GRPO, it significantly improves the discovery of routable, physically aware layouts for advanced technology nodes.

ARTICLE · DEV.to AI · 29d ago

When I started running models locally, I thought quantization meant squeezing more into RAM. Turns o

The article advises against defaulting to Q4_K_M for local LLM inference, emphasizing that optimal performance comes from testing quantization levels tailored to specific workflows. It suggests that aggressive quantization like Q3_K_S can significantly cut latency with imperceptible quality loss for many tasks, though context length presents a trade-off.
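The latency-versus-quality trade-off described above can be illustrated with a toy experiment. The sketch below is a minimal example under stated assumptions: it uses plain symmetric uniform quantization on synthetic Gaussian "weights", which is not the actual scheme behind Q4_K_M or Q3_K_S, and the helper names `quantize_roundtrip` and `rms_error` are hypothetical. It only shows how round-trip error grows as the bit width shrinks; it says nothing about task-level quality, which the article argues must be tested per workflow.

```python
import random


def quantize_roundtrip(values, bits):
    """Quantize one block to signed `bits`-bit integers, then dequantize.

    Illustrative symmetric uniform quantizer with a single scale per block
    (an assumption for this sketch, not a real GGUF quantization format).
    """
    qmax = 2 ** (bits - 1) - 1              # 3 for 3-bit, 7 for 4-bit, 127 for 8-bit
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    restored = []
    for v in values:
        q = round(v / scale)                # nearest integer level
        q = max(-qmax, min(qmax, q))        # clamp to the representable range
        restored.append(q * scale)
    return restored


def rms_error(original, restored):
    """Root-mean-square difference between two equal-length sequences."""
    total = sum((x - y) ** 2 for x, y in zip(original, restored))
    return (total / len(original)) ** 0.5


random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(4096)]  # synthetic stand-in data

errors = {
    bits: rms_error(weights, quantize_roundtrip(weights, bits))
    for bits in (3, 4, 8)
}
for bits, err in errors.items():
    print(f"{bits}-bit round-trip RMS error: {err:.4f}")
```

A real comparison would replace the synthetic weights and RMS error with the actual model files and a task-specific quality check, plus a latency measurement at the context lengths the workflow needs.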

RESEARCH · arXiv CS.LG · 4/20/2026

Mapping High-Performance Regions in Battery Scheduling across Data Uncertainty, Battery Design, and Planning Horizons

This study presents a triadic analysis of battery scheduling under multi-stage model predictive control, investigating the interplay among data characteristics, forecast uncertainty, planning horizon, and battery C-rate. It identifies an "effective horizon" for optimal look-ahead length, enabling reduced computational costs and providing practical guidance for industrial storage operations.

RESEARCH · arXiv CS.LG · 5/4/2026

Information-Theoretic Generalization Bounds for Stochastic Gradient Descent with Predictable Virtual Noise

This paper introduces predictable history-adaptive virtual perturbations to enhance information-theoretic generalization bounds for Stochastic Gradient Descent. This new approach allows perturbation covariances to dynamically depend on past SGD history, addressing limitations of existing methods that require fixed covariances.

RESEARCH · arXiv CS.LG · 4/21/2026

Matched-Learning-Rate Analysis of Attention Drift and Transfer Retention in Fine-Tuned CLIP

This paper investigates how adaptation methods (Full FT vs. LoRA) and optimization scale jointly shape attention drift and transfer retention in fine-tuned CLIP models. A controlled matched-learning-rate comparison reveals that the learning rate strongly modulates structural change, with Full FT showing marked contraction at higher rates while LoRA remains entropy-positive.
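For readers unfamiliar with the two adaptation methods being compared, the standard LoRA parameterization can be written as below. This is the general formulation from the original LoRA work, not a detail taken from this paper; the symbols $W_0$, $B$, $A$, $r$, and $\alpha$ are the usual conventions and are assumptions here.

```latex
% Full fine-tuning updates every entry of the pretrained weight W_0 directly:
%   W = W_0 + \Delta W, \quad \Delta W \in \mathbb{R}^{d \times k}
% LoRA freezes W_0 and learns a low-rank update instead:
W = W_0 + \frac{\alpha}{r}\, B A,
\qquad B \in \mathbb{R}^{d \times r},\;
A \in \mathbb{R}^{r \times k},\;
r \ll \min(d, k)
```

Because the scaling $\alpha / r$ multiplies the learned update, the same nominal learning rate can produce different effective step sizes under the two methods, which is one reason a matched-learning-rate comparison is informative.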

RESEARCH · arXiv CS.LG · 5/7/2026

Lookahead Drifting Model

This paper proposes a "lookahead drifting model" for distribution mapping, which enhances image generation performance via one-step neural functional evaluation. The model computes a set of drifting terms sequentially at each training iteration, utilizing positive samples and model outputs to capture higher-order gradient information.

RESEARCH · arXiv CS.AI · 8d ago

Position Paper: Post-Solve Robustness in Decision Engines: Feasible Regions and Smoothness Under Perturbations

This paper argues that optimization pipelines lack a layer for post-solve robustness and proposes one for Mixed-Integer Linear Programming (MILP) decision engines. It formalizes an epsilon-near-optimal feasible neighborhood and a notion of solution smoothness to assess how far a solved incumbent can be trusted under parameter perturbations.
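One common way to write an epsilon-near-optimal feasible set for a minimization MILP is sketched below. The notation ($c$, $A$, $b$, $p$, $\mathcal{X}$, $z^*$, $\mathcal{N}_\varepsilon$) is assumed for illustration, and the paper's own definitions may differ.

```latex
% Assumed MILP in minimization form, with p integer variables out of n:
\mathcal{X} = \{\, x \in \mathbb{Z}^{p} \times \mathbb{R}^{n-p} : A x \le b \,\},
\qquad z^{*} = \min_{x \in \mathcal{X}} c^{\top} x

% Feasible points whose objective value is within \varepsilon of the optimum:
\mathcal{N}_{\varepsilon} = \{\, x \in \mathcal{X} : c^{\top} x \le z^{*} + \varepsilon \,\},
\qquad \varepsilon \ge 0
```

With $\varepsilon = 0$ this set reduces to the optimal solutions themselves, and it grows monotonically as $\varepsilon$ increases.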

RESEARCH · arXiv CS.LG · 18d ago

DualOptim+: Bridging Shared and Decoupled Optimizer States for Better Machine Unlearning in Large Language Models

DualOptim+ is a novel optimization framework designed to improve machine unlearning in large language models by bridging shared and decoupled optimizer states. It uses base states for common representations and delta states for objective-specific residuals, also offering a quantized 8-bit variant to reduce memory overhead without compromising performance.

RESEARCH · arXiv CS.LG · 28d ago

QuIDE: Mastering the Quantized Intelligence Trade-off via Active Optimization

QuIDE introduces a unified metric, the Intelligence Index I, to evaluate the efficiency of quantized neural networks by collapsing the compression-accuracy-latency trade-off. Experiments across various settings identify task-dependent optimal quantization (4-bit or 8-bit), providing a reproducible evaluation protocol and a fitness function for mixed-precision search.

RESEARCH · arXiv CS.LG · 16d ago

FuRA: Full-Rank Parameter-Efficient Fine-Tuning with Spectral Preconditioning

This research introduces FuRA (Full-Rank Adaptation), a novel parameter-efficient fine-tuning method that addresses limitations in existing techniques by incorporating spectral preconditioning. By reparameterizing weight matrices via full-rank Singular Value Decomposition and constraining updates, FuRA outperforms unconstrained Full Fine-Tuning while maintaining efficiency.
