TTL for vLLM endpoint
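If this is about how long a queued request should live (rather than the worker idle timeout, which is set on the endpoint itself), a minimal sketch follows. The "policy", "ttl", and "executionTimeout" field names and the millisecond units are assumptions to verify against the current RunPod serverless docs; endpoint ID and API key are placeholders.

    import requests

    ENDPOINT_ID = "YOUR_ENDPOINT_ID"   # placeholder
    API_KEY = "YOUR_RUNPOD_API_KEY"    # placeholder

    # Submit a job with an execution policy; "ttl" (milliseconds) is assumed to
    # bound how long the job may wait in the queue before being discarded.
    payload = {
        "input": {"prompt": "Hello"},
        "policy": {"executionTimeout": 600_000, "ttl": 3_600_000},
    }
    resp = requests.post(
        f"https://api.runpod.ai/v2/{ENDPOINT_ID}/run",
        json=payload,
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=30,
    )
    print(resp.json())
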
Terminating local vLLM process while loading safetensor checkpoints
Can we set public-read with rp_upload?
ExtraArgs={'ACL': 'public-read'}
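A minimal sketch of the usual workaround, assuming the bucket is S3-compatible: since it is not obvious that rp_upload forwards ExtraArgs, upload directly with boto3 and pass the ACL there. The endpoint URL, credentials, bucket, and key below are placeholders; rp_upload itself reads its bucket settings from environment variables per the RunPod docs.

    import boto3

    # Placeholder credentials and bucket for an S3-compatible store (assumption).
    s3 = boto3.client(
        "s3",
        endpoint_url="https://s3.example.com",
        aws_access_key_id="ACCESS_KEY",
        aws_secret_access_key="SECRET_KEY",
    )

    # upload_file accepts ExtraArgs, so the object can be made public on upload.
    s3.upload_file(
        "result.png",
        "my-bucket",
        "outputs/result.png",
        ExtraArgs={"ACL": "public-read"},
    )
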
Error starting container - CPU worker
Training Flux Schnell on serverless
Training flux-schnell model
Creation of an unhealthy worker on startup; the worker runs out of memory on startup.
Streaming support in local mode
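A hedged sketch of a streaming handler with the runpod-python SDK: yielding from the handler produces chunks for the /stream route, and return_aggregate_stream is assumed to make /run and /runsync still return the combined output. Check both against the current SDK docs.

    import runpod

    def handler(job):
        # A generator handler: each yield becomes one streamed chunk.
        prompt = job["input"].get("prompt", "")
        for token in prompt.split():
            yield {"token": token}

    runpod.serverless.start({
        "handler": handler,
        # Assumption: aggregates streamed chunks into a single final output.
        "return_aggregate_stream": True,
    })

Locally this can be exercised by starting the handler with the SDK's local API flag (e.g. python handler.py --rp_serve_api, an assumption) and posting to the local routes, though streaming behavior in local mode may differ from the hosted endpoint.
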
Creating an endpoint through runpodctl
Jobs in queue for a long time, even when there is a worker available
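One quick check, sketched below: the endpoint's /health route reports queued jobs against worker states, which helps distinguish a scheduling problem from a worker that never became ready. The exact response fields are assumptions; endpoint ID and API key are placeholders.

    import requests

    ENDPOINT_ID = "YOUR_ENDPOINT_ID"
    API_KEY = "YOUR_RUNPOD_API_KEY"

    health = requests.get(
        f"https://api.runpod.ai/v2/{ENDPOINT_ID}/health",
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=30,
    ).json()

    # Expected shape (assumption): {"jobs": {...}, "workers": {"idle": n, "running": n, ...}}
    print(health)
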

Status "IN_QUEUE": what could the issue be?
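A minimal polling sketch for a job that stays in IN_QUEUE, using the /status route; the job ID comes from the earlier /run response, and all identifiers below are placeholders.

    import time
    import requests

    ENDPOINT_ID = "YOUR_ENDPOINT_ID"
    API_KEY = "YOUR_RUNPOD_API_KEY"
    JOB_ID = "YOUR_JOB_ID"  # returned by the /run call

    headers = {"Authorization": f"Bearer {API_KEY}"}
    while True:
        job = requests.get(
            f"https://api.runpod.ai/v2/{ENDPOINT_ID}/status/{JOB_ID}",
            headers=headers,
            timeout=30,
        ).json()
        print(job.get("status"))
        # IN_QUEUE means no worker has picked the job up yet; if it persists
        # while workers are idle, check max workers and GPU availability.
        if job.get("status") in {"COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"}:
            break
        time.sleep(5)
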
Getting slow workers randomly
Worker hv5rbk09kzckc9 takes around 11-12 seconds to execute the exact same Comfy workflow on the same GPU, whereas the other worker, lgmvs58602xe61, takes 2-3 seconds.
When we get a slow worker, it is slower in every respect: GPU inference takes 5x longer, and Comfy import times take 7-8x longer than on a normal worker.
Collecting logs using the API
Problems with serverless trying to use instances that are not initialized

Upgrading vLLM to v0.6.0
Active workers or Flex workers? - Stable Diffusion
I shouldn't be paying for this

Offloading multiple models
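A common pattern, sketched with PyTorch as an assumption about the stack: keep every model resident in CPU RAM and move only the one currently needed onto the GPU, so several models can share one worker without exhausting VRAM.

    import torch

    class ModelPool:
        """Keep models on CPU and page only the active one onto the GPU."""

        def __init__(self, models):
            # models: dict mapping a name to an already-loaded nn.Module on CPU.
            self.models = models
            self.active = None

        def get(self, name):
            if self.active and self.active != name:
                # Offload the previously active model back to CPU RAM.
                self.models[self.active].to("cpu")
                torch.cuda.empty_cache()
            self.models[name].to("cuda")
            self.active = name
            return self.models[name]

The trade-off is the transfer time on every switch, which is usually still cheaper than reloading weights from disk or running multiple always-on workers.
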
Increase Max Workers
generativelabs/runpod-worker-a1111 broken