Reddit r/LocalLLaMA · 4/15/2026
Qwen3.5-35B running well on RTX 4060 Ti 16GB at 60 tok/s
The author shares a llama.cpp setup for running the Qwen3.5-35B-A3B-UD-Q4_K_L model on an RTX 4060 Ti 16GB, reaching 40-60 tokens/s with a 64k context. The post includes the detailed `models.ini` configuration and the server start command needed to reproduce this performance.