Hi, I was wondering if there is a simple way to run Qwen3 235B Q5_K_M using vLLM on RunPod.
I have two main issues: 1) the Qwen3 235B GGUF repo contains multiple quantizations (e.g., Q6_K, Q5_K_M, Q5_0), and I don't know how to select just one; 2) my understanding from vLLM's documentation is that I have to combine the GGUF files before serving them.
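
To make issue 1 concrete, here is a rough sketch of the kind of selection I mean. It assumes the repo is hosted on the Hugging Face Hub and that each file name contains its quantization tag; `repo_id`, `local_dir`, and the file names in the example listing are placeholders, not the real repo layout. `snapshot_download` with `allow_patterns` is the `huggingface_hub` call I believe applies here, but I have not confirmed this works end to end.

```python
from fnmatch import fnmatchcase

# Hypothetical listing, only to illustrate the filter; the real repo may differ.
EXAMPLE_LISTING = [
    "Q5_0/model-Q5_0-00001-of-00002.gguf",
    "Q5_K_M/model-Q5_K_M-00001-of-00002.gguf",
    "Q5_K_M/model-Q5_K_M-00002-of-00002.gguf",
    "Q6_K/model-Q6_K-00001-of-00002.gguf",
]


def select_quant_files(filenames, quant):
    """Return the sorted .gguf files whose path contains the quantization tag."""
    pattern = f"*{quant}*.gguf"
    return sorted(name for name in filenames if fnmatchcase(name, pattern))


def download_quant(repo_id, quant, local_dir):
    """Download only the files of one quantization from a Hub repo (needs network)."""
    # Imported lazily so select_quant_files works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,                    # placeholder: the GGUF repo's id
        allow_patterns=[f"*{quant}*.gguf"],  # same glob as select_quant_files
        local_dir=local_dir,
    )
```

Calling `select_quant_files(EXAMPLE_LISTING, "Q5_K_M")` keeps only the two Q5_K_M shards. For issue 2, my guess is that the downloaded shards would then be merged with llama.cpp's `llama-gguf-split --merge`, given the first shard and an output path, but that is an assumption on my part.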