A40 Throttled very regularly!

SSH info via CLI
Cannot get a single endpoint to start
All 16GB VRAM workers are throttled in EU-RO-1

worker-vllm: Always stops after 60 seconds of streaming
api.runpod.ai/v2. This has the benefit of exposing the job_id and allowing more control, but I would like to do this with the OpenAI API... My requests always get queued whenever I call the API, and the queue keeps growing. How do I cancel all jobs?
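One way to clear a backed-up endpoint is the native (non-OpenAI) API, which exposes per-job cancel and a queue purge. A minimal stdlib sketch, assuming the `/cancel/{job_id}` and `/purge-queue` paths of the RunPod serverless API (verify against current docs); `my-endpoint-id` and `MY_API_KEY` are placeholders:

```python
import urllib.request

API_BASE = "https://api.runpod.ai/v2"

def build_cancel_request(endpoint_id: str, job_id: str, api_key: str) -> urllib.request.Request:
    # Cancel one queued or in-flight job by its job_id.
    return urllib.request.Request(
        f"{API_BASE}/{endpoint_id}/cancel/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="POST",
    )

def build_purge_queue_request(endpoint_id: str, api_key: str) -> urllib.request.Request:
    # Drop every job still waiting in the endpoint's queue;
    # jobs that already started running are not affected.
    return urllib.request.Request(
        f"{API_BASE}/{endpoint_id}/purge-queue",
        headers={"Authorization": f"Bearer {api_key}"},
        method="POST",
    )

if __name__ == "__main__":
    req = build_purge_queue_request("my-endpoint-id", "MY_API_KEY")
    # urllib.request.urlopen(req)  # uncomment to actually send the purge
    print(req.full_url)
```

The OpenAI-compatible route only covers chat/completions, so queue management still has to go through these native endpoints.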
I want to deploy a serverless endpoint using Unsloth
--trust-remote-code
Pass trust_remote_code=True to LLM, or use the --trust-remote-code flag in the CLI.
...Is there any "reserve for long" and "get it cheaper" payment option?
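For worker-vllm deployments this is usually set through environment variables on the endpoint rather than CLI flags. A minimal sketch, assuming the image reads MODEL_NAME and TRUST_REMOTE_CODE from the environment (variable names should be checked against the worker-vllm README before relying on them):

```shell
# Hypothetical local run of the worker-vllm image; on RunPod you would set
# the same variables in the endpoint's environment-variable settings.
docker run --gpus all \
  -e MODEL_NAME="some-org/some-model" \
  -e TRUST_REMOTE_CODE="1" \
  runpod/worker-vllm:latest
```

Models whose repos ship custom modeling code will fail to load without this, which matches the traceback people see on first start.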
llvmpipe is being used instead of the GPU
1 s delay between execution completing and the "Finished" message

Serverless is Broken

EU-RO-1 region serverless H100 GPU not available...
Workers wrongfully reported as "idle"

"Throttled" and re-"Initializing" workers everywhere today
How to run Flux + LoRA on a 24 GB GPU through code
Queue waiting 5+ minutes with dozens of idle workers
Serverless H200?
using compression encoding for serverless requests
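Whether the serverless gateway actually honors Content-Encoding: gzip on request bodies is worth verifying before depending on it; the client side, at least, is plain HTTP. A stdlib sketch that compresses a JSON payload before sending (the endpoint URL is a placeholder):

```python
import gzip
import json
import urllib.request

def build_compressed_request(url: str, payload: dict, api_key: str) -> urllib.request.Request:
    # Gzip the JSON body and label it so the server knows how to decode it.
    body = gzip.compress(json.dumps(payload).encode("utf-8"))
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Content-Encoding": "gzip",  # assumes the gateway accepts this; verify
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

if __name__ == "__main__":
    req = build_compressed_request(
        "https://api.runpod.ai/v2/my-endpoint-id/run",  # placeholder endpoint id
        {"input": {"prompt": "hello " * 1000}},
        "MY_API_KEY",
    )
    print(len(req.data))  # compressed size, much smaller than the raw JSON
```

If the gateway rejects the compressed body, the fallback is the usual one: send uncompressed JSON and rely on response-side compression via Accept-Encoding instead.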