Ergin Bilgin
RunPod
•Created by Ergin Bilgin on 11/1/2024 in #⚡|serverless
Llama-3.1-Nemotron-70B-Instruct in Serverless
Hello there,
I've been trying to deploy Nvidia's Llama-3.1-Nemotron-70B-Instruct on serverless using the vLLM template, but I could not get it to work no matter what.
I'm trying to deploy it on an endpoint with 2 x H100 GPUs, but in most of my attempts I don't even see the weights being downloaded. Requests start, and after a few minutes the worker terminates.
In this scenario I get the error:
Unrecognized model in nvidia/Llama-3.1-Nemotron-70B-Instruct. Should have a model_type key in its config.json, or contain one of the following strings in its name: albert, align, altclip, audio-spectrogram-transformer, autoformer, bark, bart, (and the list goes on)
Even weirder, when I deploy the exact same configuration again, it sometimes downloads the weights and then fails with a different error each time. It's not consistent.
In fact, I tried a few other popular 70B models but couldn't get any of them to work.
Has anybody tried and managed to run 70B models in serverless so far?
3 replies
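The error quoted in this post says the loader found no model_type key in config.json. One possible cause (an assumption, not something confirmed in this thread) is that the worker's cached config.json is missing or was only partially downloaded. A minimal Python sketch for checking a local copy follows; the function name read_model_type and the example path are hypothetical.

```python
import json
from pathlib import Path


def read_model_type(config_path):
    """Return the model_type value from a local config.json, or None.

    None is returned when the file is missing, is not valid JSON
    (for example a truncated download), or has no model_type key.
    """
    path = Path(config_path)
    if not path.is_file():
        return None
    try:
        config = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return None
    if not isinstance(config, dict):
        return None
    return config.get("model_type")


if __name__ == "__main__":
    # Hypothetical location; point this at wherever the worker stores the model files.
    print(read_model_type("/path/to/model/config.json"))
```

If this returns None for the cached file, the download itself is the first thing to look at, rather than the GPU configuration.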
RunPod
•Created by Ergin Bilgin on 1/5/2024 in #⛅|pods-clusters
How can I clean up storage in my network volume?
Hello, I'm using the Stable Diffusion template with a network volume. I noticed that even though I delete files in Jupyter, the space is not freed on my volume. I suspect the files go to a trash folder instead of being removed completely. I searched a lot but could not find the trash folder. Does anybody know where I can find it, or any other way of cleaning up my storage space properly?
4 replies
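One way to track down where the space went is to total up every top-level entry on the volume, hidden ones included. Jupyter's file browser may move deleted files into a hidden trash directory instead of removing them; on a separately mounted volume this is often a directory named like .Trash-0 at the volume root, though that is an assumption about this setup. A minimal Python sketch follows; the function names and the /workspace mount point are assumptions for illustration.

```python
import os


def dir_size(path):
    """Sum the sizes in bytes of all regular files under path, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            try:
                if not os.path.islink(file_path):
                    total += os.path.getsize(file_path)
            except OSError:
                # The file vanished or is unreadable; leave it out of the total.
                pass
    return total


def largest_entries(volume_root, top=10):
    """Return (name, bytes) pairs for the largest top-level entries, hidden ones included."""
    sizes = []
    with os.scandir(volume_root) as entries:
        for entry in entries:
            if entry.is_symlink():
                continue
            if entry.is_dir():
                size = dir_size(entry.path)
            else:
                size = entry.stat().st_size
            sizes.append((entry.name, size))
    sizes.sort(key=lambda item: item[1], reverse=True)
    return sizes[:top]


if __name__ == "__main__":
    # Assumed mount point of the network volume; adjust to your pod.
    for name, size in largest_entries("/workspace"):
        print(f"{size / 1e9:8.2f} GB  {name}")
```

If a hidden trash directory shows up near the top of the list, deleting its contents from a terminal should release the space.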