
LLM inference

11 items

DOC · DEV.to AI · 26d ago

Laravel Horizon in Production: Configuring AI Queue Workloads That Actually Hold

This guide addresses the challenges of configuring Laravel Horizon for AI inference workloads in production, where standard queue job defaults fail due to the extended processing times of LLMs. It explains how to prevent silent timeouts and job failures that occur when Horizon's default settings are not adapted for long-running AI tasks.

RESEARCH · arXiv CS.LG · 4/6/2026

Characterizing WebGPU Dispatch Overhead for LLM Inference Across Four GPU Vendors, Three Backends, and Three Browsers

This study characterizes WebGPU dispatch overhead for LLM inference across a range of GPU platforms, backends, and browsers. It shows that simple benchmarks overestimate the cost and identifies the true per-dispatch cost of the WebGPU API, highlighting that this distinction is needed for effective optimization.
