← heapsort
NEWS↑ trending48

INT3 compression+fused metal kernels [R]

Reddit r/MachineLearning · April 22, 2026

A solo founder developed INT3 model compression and a 2-bit KV cache with custom fused Metal kernels for Mac (M-series). Qwen 7B is available in preview, and further optimizations and GPU support are planned.
