Run Qwen3.5-397B-A17B with vLLM and 8xR9700
Reddit r/LocalLLaMA · April 11, 2026
This post describes how to run the Qwen3.5-397B-A17B-MXFP4 model efficiently with vLLM on RDNA4 GPUs, such as an 8xR9700 setup. It provides a Dockerfile with Triton patches, along with instructions for downloading the model and launching the inference container.