← heapsort-ai

Transformer

10 items

ARTICLE↑ trendingReddit r/MachineLearning·4/23/2026

Optimizing Transformer model size & inference beyond FP16 + ONNX (pruning/graph opt didn’t help much) [P]

The user is optimizing a Transformer model for size and inference speed, having plateaued after FP16 conversion and ONNX optimization, with pruning yielding limited gains. They are seeking advice on advanced techniques like low-rank factorization, aggressive quantization (INT8/INT4), knowledge distillation, or hardware-specific optimizations to achieve further real-world improvements.

50
ARTICLEDEV.to AI·4/10/2026

"Attention Is All You Need" Paper tahun 2017 yang mengubah dunia kecerdasan buatan, dijelaskan tanpa perlu latar belakang teknis.

O artigo explora a importância do paper 'Attention Is All You Need' de 2017, que revolucionou a IA ao introduzir a arquitetura Transformer, base de modelos como ChatGPT. Ele detalha como essa inovação superou as limitações das redes neurais recorrentes, permitindo que computadores compreendam e gerem linguagem humana com maior eficiência.

28