Too many failed requests
Runpod • 2y ago • 6 replies
norefreshing
Hello. I've tried to run casperhansen/mixtral-instruct-awq (https://huggingface.co/casperhansen/mixtral-instruct-awq) on A100 80 GB and A100 SXM 80 GB GPUs, sending 10 requests per second with this script: https://github.com/vllm-project/vllm/blob/main/benchmarks/benchmark_serving.py. However, most of the requests failed with an "Aborted request" log from vLLM. This issue didn't occur on another platform with the same GPU and the same code, so I'm not sure whether the problem is with vLLM or with RunPod's internal processing. Could anyone provide guidance on what the cause might be?
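For context, a minimal sketch of the kind of load pattern described above, assuming a vLLM server exposing the OpenAI-compatible completions endpoint on localhost:8000; the URL, payload, rate, and duration are illustrative, not the actual benchmark_serving.py code:

```python
import asyncio
import aiohttp

# Illustrative only: fire requests at a fixed rate against a vLLM server
# and count failures, roughly what the benchmark does at 10 req/s.
# Endpoint path and payload assume vLLM's OpenAI-compatible completions API.
URL = "http://localhost:8000/v1/completions"
PAYLOAD = {
    "model": "casperhansen/mixtral-instruct-awq",
    "prompt": "Hello",
    "max_tokens": 128,
}

async def send_one(session, results):
    try:
        async with session.post(URL, json=PAYLOAD) as resp:
            await resp.read()
            results.append(resp.status)
    except aiohttp.ClientError as exc:
        results.append(f"error: {exc}")

async def main(rate_per_sec=10, duration_sec=30):
    results = []
    async with aiohttp.ClientSession() as session:
        tasks = []
        for _ in range(rate_per_sec * duration_sec):
            tasks.append(asyncio.create_task(send_one(session, results)))
            await asyncio.sleep(1 / rate_per_sec)  # ~10 requests per second
        await asyncio.gather(*tasks)
    failed = [r for r in results if r != 200]
    print(f"{len(failed)} of {len(results)} requests failed")

if __name__ == "__main__":
    asyncio.run(main())
```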
Solution
Why are you using GPU Cloud for this? If you want to handle many concurrent requests, you need to use Serverless, not GPU Cloud.
https://github.com/runpod-workers/worker-vllm
GitHub - runpod-workers/worker-vllm: The RunPod worker template for serving our large language model endpoints. Powered by vLLM.
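For illustration, a minimal sketch of calling a deployed worker-vllm endpoint through RunPod's serverless /runsync API; the endpoint ID is a placeholder and the "input" schema shown is an assumption, so check the worker-vllm README for the version you deploy:

```python
import os
import requests

# Placeholder: replace with the ID of your deployed worker-vllm endpoint.
ENDPOINT_ID = "your-endpoint-id"
API_KEY = os.environ["RUNPOD_API_KEY"]

# /runsync blocks until the job finishes. The "input" body below (a plain
# prompt plus sampling params) is an assumed schema, not guaranteed to match
# every worker-vllm release.
resp = requests.post(
    f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "input": {
            "prompt": "Hello",
            "sampling_params": {"max_tokens": 128},
        }
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json())
```

The serverless endpoint queues and scales workers per request, which is why it holds up better under many concurrent requests than a single GPU Cloud pod.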