Runpod

We're a community of enthusiasts, engineers, and enterprises, all sharing insights on AI, Machine Learning and GPUs!

⚡|serverless

⛅|pods

🔧|api-opensource

📡|instant-clusters

🗂|hub

value not in list on serverless

I have network storage set up with ComfyUI that I use to deploy pods, and now I want to use that storage with serverless. I followed the guide at https://github.com/runpod-workers/worker-comfyui/blob/main/docs/customization.md, tried method 2, created an endpoint with runpod/comfyui-worker:5.5.0-base, and attached the network storage to that endpoint. When I try a simple workflow (flux1-dev) on serverless that works perfectly when connected to a pod, I get an error "value not in list...
Solution:
For the record: in my network storage, the models from the ComfyUI setup used to run a pod are saved under /(workspace)/ComfyUI/models/.., but the serverless worker looks under /(runpod-volume)/models/..; putting the models there fixed the "value not in list" error on serverless. It was a matter of not reading the docs carefully enough on my side; it's mentioned in the note at the very bottom of https://github.com/runpod-workers/worker-comfyui/blob/main/docs/customization.md: "Note: When a Network Volume is correctly attached, ComfyUI running inside the worker container will automatically detect and load models from the standard directories (/workspace/models/...) within that volume....
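
A quick way to check which layout a worker actually sees is to list the candidate model directories from inside the serverless worker (or a pod attached to the same volume). A minimal sketch; the mount points are the ones described in this thread, and the file extensions are just common model formats:

```python
import os

# Mount points below come from the thread: pods see the network volume at
# /workspace, while the serverless worker sees the same volume at /runpod-volume
# and only scans /runpod-volume/models/... for models.
CANDIDATE_MODEL_DIRS = [
    "/runpod-volume/models",          # where the serverless worker looks
    "/runpod-volume/ComfyUI/models",  # pod-style layout that serverless ignores
    "/workspace/ComfyUI/models",      # same volume as seen from a pod
]

for root in CANDIDATE_MODEL_DIRS:
    if not os.path.isdir(root):
        print(f"{root}: not found")
        continue
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if name.endswith((".safetensors", ".ckpt", ".gguf")):
                print(os.path.join(dirpath, name))
```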

Finish task with error: CUDA error: no kernel image is available for execution on the device

I often get this error on my requests, and I am more than sure that this is a Runpod problem. How can I fix it?

Runpod underwater?

Hey everyone, first message here, prompted by what I've been seeing on the website for a while. I'm launching a new project soon and I've been tracking the availability of some GPUs in Europe. It suddenly went through the floor! EU-RO-1 disappeared altogether, for instance...

Failing requests

Hey, all of my serverless endpoint requests are failing. I’ve tried every available GPU config (24GB, 24GB PRO, 32GB PRO, 80GB PRO, 141GB) and they all show “High Supply”, but nothing is processing; the entire service is effectively down. This has been going on for a while now and there’s zero communication. If there’s an outage or scaling issue, please just say so, so we can stop waiting and plan accordingly. Can someone from the team confirm what’s happening and whether there’s an ETA on a fix?...

A deposit error caused me to lose money, any Runpod staff providing support?

I deposited USDT from Crypto.com to Runpod — the transaction was completed on Crypto.com’s side, but I still haven’t received it on Runpod.

So what's the deal with all the issues and the charges for failed Docker fetches?

This has been going on for over two days now; where is the statement? I'm kind of losing all the trust I had built up with Runpod. The biggest issue is the charges on failed Docker fetches, which you have to notice and stop manually; that's an absolute no-go. Issues:...

Having trouble using ModelPatchLoader in ComfyUI Serverless

I've been trying to get my workflow to run on ComfyUI and I'm getting stuck on an error it returns. I'm not using any custom nodes, just base-level nodes included with ComfyUI. My workflow contains the ModelPatchLoader node, which looks for models in the model_patches folder. I've added the folder and model to my network volume and verified that they are there. But I keep getting this response from the endpoint: {'delayTime': 618, 'error': 'Workflow validation failed:\n• Node 39 (errors): [{\'type\': \'value_not_in_list\', \'message\': \'Value not in list\', \'details\': "name: \'uso-flux1-projector-v1.safetensors\' not in []", \'extra_info\': {\'input_name\': \'name\', \'input_config\': [[], {}], \'received_value\': \'uso-flux1-projector-v1.safetensors\'}}]\n• Node 39 (dependent_outputs): [\'9\']\n• Node 39 (class_type): ModelPatchLoader', 'executionTime': 323, 'id': 'sync-445ef416-ddcf-4a2c-bba7-fd6bc9e192a3-u1', 'status': 'FAILED', 'workerId': 'aehi12zofrz99h'} I'm wondering if anyone has any insights into this?...
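
No solution is recorded in this thread, but given the path fix in the first thread above, the empty list in the error ("not in []") means ComfyUI found no files where it looks for model_patches. A minimal check, assuming the serverless worker expects the folder at /runpod-volume/models/model_patches rather than a pod-style /workspace/ComfyUI/models/model_patches:

```python
import os

# Assumption: the serverless worker maps the network volume to /runpod-volume
# and resolves ComfyUI model folders under /runpod-volume/models/, so
# ModelPatchLoader would scan /runpod-volume/models/model_patches/.
patch_dir = "/runpod-volume/models/model_patches"

if os.path.isdir(patch_dir):
    # Should list uso-flux1-projector-v1.safetensors if the layout is right.
    print(sorted(os.listdir(patch_dir)))
else:
    print(f"{patch_dir} does not exist from the worker's point of view")
```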

I have some questions about Serverless scaling and account limits

I have some questions about Serverless scaling and account limits:
1. What is the current maximum workersMax limit for Serverless endpoints?
2. Is there an overall account-level quota that limits total concurrent workers across all endpoints?
3. If so, is it possible to request a higher limit for production workloads?...

GPU Detection Failure Across 20–50% of Workers — Months of Unresolved Issues

Hey. This is becoming ridiculous. I’ve been having recurring issues with your platform for months now, and things are only getting worse. I’ve already sent emails, opened multiple Discord threads, and every time it ends the same way — someone acknowledges there’s a problem (“ah yes, we have an issue”), and then support completely disappears. No follow-up, no fix....

Question about Serverless max workers, account quota limits, and using multiple accounts

Hi RunPod 👋 I have some questions about Serverless scaling and account limits:
1. What is the current maximum workersMax limit for Serverless endpoints?...

How to use Python package for public endpoints?

https://github.com/runpod/runpod-python Does anyone know if we can use this runpod-python package to connect to public endpoints too? I'm specifically looking to connect to the wan 2.2 endpoint. The example in the API playground for Python just returns the id and status....
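
There is no recorded answer in the thread, but for regular serverless endpoints the usual runpod-python pattern is to submit with endpoint.run() and then poll for the output; the playground sample stops after the submit step, which is why it only shows the id and status. A sketch under the assumption that the public wan 2.2 endpoint accepts the same pattern with its own endpoint ID; the ID and payload fields below are placeholders:

```python
import runpod

runpod.api_key = "YOUR_RUNPOD_API_KEY"

# Placeholder ID: use the endpoint ID shown in the API playground for wan 2.2.
endpoint = runpod.Endpoint("ENDPOINT_ID")

# .run() is asynchronous: it returns immediately with just an id and a status,
# which matches what the playground example prints. Poll for the result:
job = endpoint.run({"input": {"prompt": "a red fox running through snow"}})
print(job.status())             # IN_QUEUE / IN_PROGRESS / COMPLETED ...
print(job.output(timeout=600))  # blocks until the job finishes or times out

# Or submit and wait in one call:
# result = endpoint.run_sync({"input": {"prompt": "..."}}, timeout=600)
```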

What are some good alternatives to Runpod Serverless?

Because Runpod's own serverless hasn't been working for two days already

Finish task with error: CUDA error: no kernel image is available for execution on the device

I now get this error on all of my workers:
3czrvanpdpzxz3 [error] [2025-10-14 19:48:25] ERROR [Task Queue] Finish task with error: CUDA error: no kernel image is available for execution on the device
Also, it started after increasing the GPU size (it was 24 GB, now 32 GB). ...
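
No fix is recorded in the thread, but this error usually means the CUDA binaries in the worker image (typically the PyTorch build) were not compiled for the GPU architecture the worker landed on, which would fit the timing here if the 32 GB tier schedules a newer GPU generation than the 24 GB tier did. A quick diagnostic sketch, assuming the worker image uses PyTorch:

```python
import torch

# Compare the compute capability of the GPU the worker landed on with the
# architectures the installed PyTorch wheel was compiled for. If the device's
# sm_XX is missing from the arch list, "no kernel image is available" is the
# expected failure, and the usual fix is rebuilding the image with a newer
# PyTorch/CUDA wheel that supports that architecture.
major, minor = torch.cuda.get_device_capability(0)
print(f"device: {torch.cuda.get_device_name(0)} (sm_{major}{minor})")
print("compiled arch list:", torch.cuda.get_arch_list())
```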

Throttling on multiple endpoints and failed workers

All of our endpoints with RTX 4090 workers are fully throttled, some with over 100 workers. There is no incident report or any update here or on the status page. Workers consistently come up and get stuck loading the image, and to top it all off, they sit in the executing state and charge the account.

Ongoing Throttling Issues with Multiple Serverless Endpoints

Hey guys, hi! I'm having ongoing throttling issues with several serverless endpoints on Runpod (thfv8pa98n0zmx, 3uo2k0k7717auu, 9o42o47k1v1wn); they've been stuck for two days now and it's disrupting work. Which section/channel should I post a detailed support request in to get a quick response?

Serverless throttled

Hi! Since yesterday I can't run my serverless endpoint - I'm constantly being throttled or given unhealthy workers. Can we do something to make it work?
Solution:
I believe I've spoken to all of you in a mixture of other threads and in the general channel - but sharing this for visibility: Throughout this week we've been running emergency maintenance and the users most affected are those running serverless workloads with popular GPUs. Where we may have a surplus of a specific GPU, we have to delist the machines that host the GPUs (where it's up to 8 GPUs per machine) to perform work on them. We are obligated to perform this maintenance across the fleet and only ask for your patience until it's done and we can disclose the reason....

Hugging Face cached models don't seem to be working

I added the repo, but there are no models inside the container and no logs.

vLLM jobs not processing: "deferring container creation"

We just noticed there are 2000+ jobs waiting in our queue and no jobs in progress. I'm getting super frustrated with Serverless. In the logs I see this message: "deferring container creation: waiting for models to complete: [meta-llama/llama-3.3-70b-instruct]". I just terminated a few workers hoping they would start back up and work again, but can someone help me figure out how to resolve this? Why are my workers not processing jobs? (This had been working mostly OK for a couple of weeks with no changes.)...