SGLang DeepSeek-V3-0324
I have been trying to run DeepSeek-V3-0324 on a RunPod Instant Cluster with 2 nodes of 8 x H100 each, so far without success. The goal is to get the model running multi-node and multi-GPU.
I downloaded the model from Hugging Face onto a persistent volume, and I attach that volume to my Instant Cluster before launching. After launching, I run the PyTorch demo script from https://docs.runpod.io/instant-clusters/pytorch to confirm that the network is working (it is).
I then follow the instructions to get DeepSeek-V3-0324 running according to: https://github.com/sgl-project/sglang/tree/main/benchmark/deepseek_v3
Instead of following the absolute default instructions and doing:
In its place, I run the following command on each node:
The issue is that this hangs. When I watch nvidia-smi while the model should be loading, memory usage on each GPU climbs to just under 1 GB and then stops increasing.
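Memory stalling at roughly 1 GB per GPU may mean the weights never start loading and the processes are still waiting on each other during distributed setup, though that is only a guess. One cheap thing to rule out is whether the second node can open a TCP connection to the rendezvous address on the first node. Below is a minimal sketch of such a check using only the Python standard library. The function name `can_reach` and the host and port values are hypothetical placeholders, not anything taken from SGLang or the RunPod docs; substitute whatever address and port the launch command actually uses. It has to be run while the first node's server process is already started and listening.

```python
import socket
import sys


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Usage (placeholder values): python check_port.py 10.0.0.1 5000
    if len(sys.argv) != 3:
        sys.exit("usage: python check_port.py HOST PORT")
    host, port = sys.argv[1], int(sys.argv[2])
    status = "reachable" if can_reach(host, port) else "NOT reachable"
    print(f"{host}:{port} is {status}")
```

A successful connection only shows that the port is open from the other node. It does not prove that the GPU collective traffic itself goes over a working interface, so a pass here narrows the problem down rather than solving it.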
Any help would be greatly appreciated.
