
Computational Efficiency

10 items

ARTICLE · ↑ trending · Reddit r/MachineLearning · 4/22/2026

I built a new category of AI called a Reductive Inference Model (RIM) that answers by elimination instead of generation — AMA [P]

POEM (Process Of Elimination Master) is a new AI architecture that answers questions by progressively eliminating impossibilities rather than generating possibilities, and it operates independently of LLMs. The author reports 88% accuracy and says the model is 95.5x faster and 100x smaller than TinyLlama 1.1B.

49
RESEARCH · arXiv CS.LG · 4/6/2026

Not All Denoising Steps Are Equal: Model Scheduling for Faster Masked Diffusion Language Models

This work explores model scheduling to accelerate Masked Diffusion Language Models (MDLMs), replacing the full model with a smaller one at certain denoising steps. The research shows that the early and late steps are more robust to this substitution, allowing a reduction of up to 17% in FLOPs with minimal degradation in generative perplexity.
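To make the scheduling idea concrete, here is a minimal sketch of a per-step model schedule and the FLOPs arithmetic behind it. The function names, the step counts, and the assumption that the small model costs half as much per step are all hypothetical, chosen only for illustration; they are not taken from the paper.

```python
def build_schedule(num_steps, early, late):
    """Pick a model per denoising step.

    Hypothetical rule: the small model handles the first `early` and the
    last `late` steps, and the full model handles everything in between.
    """
    return [
        "small" if (i < early or i >= num_steps - late) else "full"
        for i in range(num_steps)
    ]


def relative_flops(schedule, small_cost=0.5):
    """Cost relative to running the full model (cost 1.0) at every step.

    `small_cost` is an assumed per-step cost ratio, not a measured value.
    """
    total = sum(small_cost if model == "small" else 1.0 for model in schedule)
    return total / len(schedule)


# Illustrative numbers only: 10 steps, small model on 2 early and 2 late steps.
schedule = build_schedule(num_steps=10, early=2, late=2)
saving = 1.0 - relative_flops(schedule, small_cost=0.5)
print(schedule)
print(f"relative FLOPs saving: {saving:.0%}")
```

The saving depends only on how many steps are swapped and how cheap the small model is, which is why the choice of which steps tolerate the swap is the interesting part.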

28
RESEARCH · arXiv CS.CL · 4/13/2026

WAND: Windowed Attention and Knowledge Distillation for Efficient Autoregressive Text-to-Speech Models

WAND is a framework for adapting pretrained autoregressive text-to-speech (AR-TTS) models to run with constant computational and memory complexity. It splits attention into global and local sliding-window components, and uses curriculum learning and knowledge distillation to preserve high-fidelity speech synthesis while substantially reducing KV cache memory.

27
RESEARCH · arXiv CS.LG · 4/14/2026

Efficient Matrix Implementation for Rotary Position Embedding

This research proposes RoME, a novel and computationally efficient reformulation of Rotary Position Embedding (RoPE), a core component in modern Transformer architectures. By replacing vector-level operations with unified matrix transformations, RoME significantly reduces computational overhead and improves hardware utilization.

27
RESEARCH · arXiv CS.LG · 5/5/2026

From Euler to Dormand-Prince: ODE Solvers for Flow Matching Generative Models

This research paper systematically benchmarks four classical ODE solvers (Euler, Explicit Midpoint, RK4, Dormand-Prince 5(4)) for Flow Matching generative models, implementing them from scratch in PyTorch. It quantitatively compares their efficiency on tasks from 2D distributions to MNIST, showing RK4 at 80 function evaluations achieves sample quality comparable to Euler at 200, and observes Jacobian eigenvalue spectrum stiffening near t=1.
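The trade-off between step count and function evaluations can be sketched on a toy problem. The snippet below is an illustrative assumption, not the paper's benchmark: it swaps the learned velocity field for the scalar ODE dx/dt = x (exact solution e at t=1) and compares Euler at 200 function evaluations with RK4 at 80. All names here are hypothetical.

```python
import math


def velocity(t, x):
    # Toy stand-in for a learned velocity field; with x(0) = 1 the exact
    # solution is x(t) = exp(t).
    return x


def euler(f, x0, steps):
    """Explicit Euler from t=0 to t=1; one function evaluation per step."""
    x, t, h, nfe = x0, 0.0, 1.0 / steps, 0
    for _ in range(steps):
        x += h * f(t, x)
        nfe += 1
        t += h
    return x, nfe


def rk4(f, x0, steps):
    """Classical fourth-order Runge-Kutta; four function evaluations per step."""
    x, t, h, nfe = x0, 0.0, 1.0 / steps, 0
    for _ in range(steps):
        k1 = f(t, x)
        k2 = f(t + h / 2, x + h / 2 * k1)
        k3 = f(t + h / 2, x + h / 2 * k2)
        k4 = f(t + h, x + h * k3)
        x += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        nfe += 4
        t += h
    return x, nfe


exact = math.e
x_euler, nfe_euler = euler(velocity, 1.0, steps=200)  # 200 evaluations
x_rk4, nfe_rk4 = rk4(velocity, 1.0, steps=20)         # 80 evaluations
err_euler = abs(x_euler - exact)
err_rk4 = abs(x_rk4 - exact)
print(f"Euler: NFE={nfe_euler}, error={err_euler:.2e}")
print(f"RK4:   NFE={nfe_rk4}, error={err_rk4:.2e}")
```

On this smooth toy ODE, RK4 reaches a smaller error with fewer evaluations. How that carries over to sample quality on real generative tasks is what the paper measures, and this sketch does not reproduce those results.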

27
RESEARCH · arXiv CS.LG · 29d ago

Toeplitz MLP Mixers are Low Complexity, Information-Rich Sequence Models

The Toeplitz MLP Mixer (TMM) is a new transformer-like architecture that replaces attention with triangular-masked Toeplitz matrix multiplication, significantly reducing computational complexity to O(dn log n) time and O(dn) space. TMMs demonstrate superior training efficiency and better input information retention compared to traditional transformers, despite their simpler design.
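For intuition on the O(n log n) figure: multiplying by a lower-triangular (causally masked) Toeplitz matrix is the same as a causal convolution, which can be computed with FFTs. The sketch below checks that equivalence for a single channel in pure Python. It is a generic illustration under that assumption, with hypothetical function names, and is not the paper's implementation.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def toeplitz_dense(w, x):
    """O(n^2) reference: y[i] = sum over j <= i of w[i - j] * x[j]."""
    n = len(x)
    return [sum(w[i - j] * x[j] for j in range(i + 1)) for i in range(n)]


def toeplitz_fft(w, x):
    """Same product in O(n log n) via zero-padded FFT convolution."""
    n = len(x)
    size = 1
    while size < 2 * n:  # pad so circular convolution equals linear convolution
        size *= 2
    fw = fft(list(w) + [0.0] * (size - n))
    fx = fft(list(x) + [0.0] * (size - n))
    y = fft([p * q for p, q in zip(fw, fx)], invert=True)
    return [y[i].real / size for i in range(n)]  # keep only the causal part


w = [0.5, -1.0, 2.0, 0.25, 1.5]   # one weight per relative offset i - j
x = [1.0, 2.0, -3.0, 4.0, 0.5]    # one channel of a length-5 sequence
y_dense = toeplitz_dense(w, x)
y_fft = toeplitz_fft(w, x)
print(y_dense)
print(y_fft)
```

Only n weights per channel are stored instead of an n x n matrix, which is consistent with the O(dn) space figure when repeated across d channels.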

27