vLLM does not seem to use the GPU
I'm using vLLM, and on the monitoring graph, when I launch some requests, only CPU usage increases.
If I open a terminal and run nvidia-smi, I don't see any process either.
Settings line:
--model NousResearch/Meta-Llama-3-8B-Instruct --max-model-len 8192 --port 8000 --dtype half --enable-chunked-prefill true --max-num-batched-tokens 6144 --gpu-memory-utilization 0.97
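For reference, here is a minimal sketch of a check I could run in the same Python environment where vLLM is launched (assuming a standard PyTorch/CUDA install, which vLLM depends on) to confirm whether the GPU is visible at all:

```python
# Minimal diagnostic sketch (not part of my setup): check whether PyTorch,
# in the environment used to launch vLLM, can see the GPU at all.
import torch

print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device count:", torch.cuda.device_count())
    print("Device 0:", torch.cuda.get_device_name(0))
```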


