Run Qwen3.5-397B-A17B with vLLM and 8xR9700

Reddit r/LocalLLaMA · April 11, 2026

This post describes an optimized setup for running the Qwen3.5-397B-A17B-MXFP4 model with vLLM on RDNA4 GPUs, for example a machine with eight R9700 cards. It provides a Dockerfile with Triton patches, plus instructions for downloading the model and launching the inference container.
