Runpod · 9mo ago
Erez

I am trying to deploy a "meta-llama/Llama-3.1-8B-Instruct" model on Serverless vLLM

I deploy it with the maximum possible GPU memory.
After setup, I try to run the "hello world" sample request (sketched below), but the request gets stuck in the queue and the worker fails with "[error] worker exited with exit code 1", with no other error or message in the log.
Is it even possible to run this model?
What is the problem, and can it be resolved?
(For the record, I did manage to run a much smaller model using the same procedure.)
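
For reference, this is roughly the request I am sending for the "hello world" test. The endpoint ID and API key are placeholders, and the input schema assumes the standard Runpod Serverless vLLM worker format:

import os
import requests

# Placeholders -- replace with your own endpoint ID and API key.
ENDPOINT_ID = "your-endpoint-id"
API_KEY = os.environ.get("RUNPOD_API_KEY", "your-api-key")

# Synchronous test request against the serverless endpoint.
resp = requests.post(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"input": {"prompt": "Hello world", "sampling_params": {"max_tokens": 64}}},
    timeout=120,
)
print(resp.status_code, resp.json())

(Using /run instead of /runsync would return a job ID to poll via /status, but for this test I just want the synchronous reply.)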