Runpod · 13mo ago
Thibaud

vLLM seems not to use the GPU

I'm using vLLM, and on the graph, when I launch some requests, only CPU usage increases. If I open a terminal and run nvidia-smi, I don't see any process either. Settings line: --model NousResearch/Meta-Llama-3-8B-Instruct --max-model-len 8192 --port 8000 --dtype half --enable-chunked-prefill true --max-num-batched-tokens 6144 --gpu-memory-utilization 0.97
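For context, those flags correspond to a launch command along these lines (a sketch, assuming the OpenAI-compatible server entrypoint; the flags themselves are taken verbatim from the message above):

    python -m vllm.entrypoints.openai.api_server \
        --model NousResearch/Meta-Llama-3-8B-Instruct \
        --max-model-len 8192 --port 8000 --dtype half \
        --enable-chunked-prefill true \
        --max-num-batched-tokens 6144 \
        --gpu-memory-utilization 0.97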
36 Replies
Unknown User · 13mo ago · (message not public)
Thibaud (OP) · 13mo ago
I tried on 4 different pods. As for the CUDA version, I don't know where I can set it.
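For what it's worth, the CUDA version a pod exposes comes from the host driver rather than anything set inside the pod. Two quick ways to see what the container actually gets (nvidia-smi reports the driver's maximum supported CUDA version, torch.version.cuda the toolkit PyTorch was built against):

    nvidia-smi | head -n 4
    python -c "import torch; print(torch.version.cuda, torch.cuda.is_available())"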
Unknown User · 13mo ago · (message not public)
Thibaud (OP) · 13mo ago
I'm trying a pod, not serverless. I don't see where in Pods I can filter by CUDA version.
Unknown User · 13mo ago · (message not public)
Thibaud (OP) · 13mo ago
Thanks!
Unknown User · 13mo ago · (message not public)
Thibaud (OP) · 13mo ago
I used an A40, so CUDA 12.4. I'll try with an RTX 6000 (12.5) to check if I see a difference.
Thibaud (OP) · 13mo ago
I don't understand why I don't see any processes here.
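One possible explanation (an assumption about the setup, not a confirmed diagnosis): inside a container, nvidia-smi often shows an empty process list even while the GPU is busy, because the container's PID namespace hides the processes from the driver's view. The utilisation and memory counters still update, so they are the more reliable signal:

    nvidia-smi --query-gpu=utilization.gpu,memory.used,memory.total --format=csv
    watch -n 1 nvidia-smi   # or watch it live while sending requests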
Unknown User · 13mo ago · (message not public)
TanegashimaGunsmith
Hi, was this issue ever solved? I have the same problem with the latest PyTorch and CUDA as well. I also reset my pod, etc., but CPU is at 100%, GPU utilisation is low, and no processes show up in nvidia-smi.
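A minimal sanity check that PyTorch in the pod can reach the GPU at all (a sketch; if this allocates memory and nvidia-smi's memory counter moves, the device itself is fine and the bottleneck is elsewhere):

    python - <<'EOF'
    import torch
    print(torch.cuda.is_available(), torch.cuda.get_device_name(0))
    x = torch.rand(4096, 4096, device="cuda")   # force a real allocation on the GPU
    torch.cuda.synchronize()
    print(torch.cuda.memory_allocated() // 2**20, "MiB allocated")
    EOF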
Unknown User · 10mo ago · (message not public)
TanegashimaGunsmith
What do you mean by the official vLLM? I'm installing it via pip. I have a pod.
Unknown User · 10mo ago · (message not public)
TanegashimaGunsmith
I already reset the pod; it doesn't seem to be that. Image: runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04
Unknown User · 10mo ago · (message not public)
TanegashimaGunsmith
pyenv virtualenv, then pip install vllm. And yes, I deleted the pod and started a new one. Or do you mean a pod with a different PyTorch version?
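Roughly the install path described, for reference (a sketch; the Python version is illustrative). One thing worth knowing: in a fresh virtualenv, pip install vllm pulls in its own CUDA-enabled torch wheel, so the environment may end up with a different PyTorch than the pod image ships:

    pyenv install 3.11.9
    pyenv virtualenv 3.11.9 vllm-env
    pyenv activate vllm-env
    pip install vllm   # installs its own torch wheel as a dependency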
Unknown User · 10mo ago · (message not public)
TanegashimaGunsmith
Yes, after deleting I still have this problem
Unknown User · 10mo ago · (message not public)
TanegashimaGunsmith
Yes, the same issue with a new pod. 100% CPU and ~50% GPU, or 100% GPU? I don't see a process there either, nor for LMDeploy, but it does show up in nvtop.
Unknown User · 10mo ago · (message not public)
TanegashimaGunsmith
No, tokens per second are very low for me (10-12 tps). Thanks for the video!
Unknown User · 10mo ago · (message not public)
TanegashimaGunsmith
An A100 GPU and Qwen2.5 32B Instruct. I'm starting to think it may have something to do with the JSON output I'm generating.
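The JSON suspicion is plausible: guided/structured JSON output in vLLM is enforced on the CPU side (the grammar backend compiles the schema and masks logits per token there, at least in versions from this period), which can peg the CPU while GPU utilisation stays low. A sketch of the kind of request that exercises it, via the OpenAI-compatible API (guided_json is a vLLM extension field; the schema here is illustrative):

    curl http://localhost:8000/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{"model": "Qwen/Qwen2.5-32B-Instruct",
           "messages": [{"role": "user", "content": "Describe a cat as JSON."}],
           "guided_json": {"type": "object",
                           "properties": {"name": {"type": "string"}}}}'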
Unknown User · 10mo ago · (message not public)
TanegashimaGunsmith
I have deleted and recreated it, and it's the same.
Unknown User · 10mo ago · (message not public)
TanegashimaGunsmith
I tried it on an Ada 6000 or so, and it had regular performance, which is why I'm surprised. On a different hosting provider.
Unknown User · 10mo ago · (message not public)
TanegashimaGunsmith
I didn't use an A6000 on RunPod, but on Hetzner. In the meantime I have set up LMDeploy, and it seems to fully utilise the A100 on RunPod. So it might just be vLLM that's broken.
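For reference, the LMDeploy setup that ended up saturating the A100 would look roughly like this (a sketch; the port is illustrative):

    pip install lmdeploy
    lmdeploy serve api_server Qwen/Qwen2.5-32B-Instruct --server-port 23333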
Unknown User · 10mo ago · (message not public)
TanegashimaGunsmith
Huh, okay. There are issues on GitHub as well, where people are stuck at 100% CPU but low GPU utilisation. No solutions there either.
Unknown User · 10mo ago · (message not public)
TanegashimaGunsmith
In the vLLM GitHub. At least LMDeploy seems to work for now. Thank you!
Unknown User · 10mo ago · (message not public)
