
Run Qwen3.5-397B-A17B with vLLM and 8xR9700

Reddit r/LocalLLaMA · April 11, 2026

This post describes an optimized setup for running the Qwen3.5-397B-A17B-MXFP4 model with vLLM on RDNA4 GPUs, using an 8xR9700 machine as the example. It provides a Dockerfile with Triton patches, plus instructions for downloading the model and launching the inference container.
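
As a rough illustration of the launch step, the sketch below assembles a `docker run` command that starts `vllm serve` across eight GPUs. It is not the post's actual command. The image tag `vllm-rdna4:local`, the host path `/data/models`, the `/models` mount point, and the helper name `build_launch_command` are all hypothetical, and the sketch assumes the image has no entrypoint of its own. The device and group flags are the usual ones for exposing AMD GPUs to a container, and `--tensor-parallel-size` is vLLM's option for sharding a model across GPUs.

```python
import shlex


def build_launch_command(image, model_dir, model_name, num_gpus=8, port=8000):
    """Return a `docker run` argv list that starts `vllm serve` in a ROCm container.

    All names passed in are placeholders chosen for illustration.
    """
    # Path of the model as seen from inside the container (assumed mount point).
    container_model_path = f"/models/{model_name}"
    return [
        "docker", "run", "--rm",
        # Expose the AMD GPU kernel driver and render nodes to the container.
        "--device=/dev/kfd", "--device=/dev/dri",
        "--group-add", "video",
        # Share host IPC so worker processes can use shared memory.
        "--ipc=host",
        # Mount the downloaded weights and publish the API port.
        "-v", f"{model_dir}:/models",
        "-p", f"{port}:{port}",
        image,
        # Command run inside the container: one tensor-parallel shard per GPU.
        "vllm", "serve", container_model_path,
        "--tensor-parallel-size", str(num_gpus),
        "--port", str(port),
    ]


cmd = build_launch_command(
    image="vllm-rdna4:local",          # hypothetical image tag
    model_dir="/data/models",          # hypothetical host directory
    model_name="Qwen3.5-397B-A17B-MXFP4",
)
print(shlex.join(cmd))
```

Building the command as a list and printing it with `shlex.join` makes it easy to inspect before running it, or to pass straight to `subprocess.run(cmd)`.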

Docker GPU MXFP4 Qwen vLLM
