Created by Himanshu Kotkar on 5/14/2025 in #⚡|serverless
🚨 Inconsistent Execution Time Across Workers for Same Input on L40S (48GB Pro) – Need Help
Hi everyone,
I'm facing a strange issue with my RunPod serverless endpoint, which runs LatentSync on L40S 48GB Pro GPUs with 10 workers.
The same input request takes vastly different amounts of time to execute depending on which worker picks it up:
- Some workers complete the task in 10–15 minutes
- Others take up to 1 hour for the exact same input
This inconsistency is severely impacting performance and reliability.
I've verified that:
- The input is exactly the same
- No extra processes or resource-heavy tasks are running
- The model and environment are the same across all workers
Has anyone experienced this before? Could it be a hardware-related issue, resource throttling, or something at the container level?
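To help narrow down whether this is throttling or a slow host, here is a minimal sketch of how each worker could log its run time together with GPU clocks, temperature, power draw, and throttle reasons. This is not the code currently running on the endpoint. `timed`, `gpu_snapshot`, and `parse_gpu_csv` are hypothetical helper names, and reading the worker ID from the `RUNPOD_POD_ID` environment variable is an assumption about what the worker exposes. The `nvidia-smi` query fields may also vary by driver version.

```python
import os
import shutil
import subprocess
import time

# Fields passed to `nvidia-smi --query-gpu`; adjust for your driver version.
GPU_FIELDS = ["name", "clocks.sm", "clocks.mem", "temperature.gpu",
              "power.draw", "clocks_throttle_reasons.active"]


def parse_gpu_csv(line, fields=GPU_FIELDS):
    """Turn one CSV line of nvidia-smi output into a {field: value} dict."""
    values = [v.strip() for v in line.split(",")]
    return dict(zip(fields, values))


def gpu_snapshot():
    """Return the current state of GPU 0, or {} if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return {}
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={','.join(GPU_FIELDS)}",
         "--format=csv,noheader", "--id=0"],
        capture_output=True, text=True, check=False,
    )
    if out.returncode != 0 or not out.stdout.strip():
        return {}
    return parse_gpu_csv(out.stdout.strip().splitlines()[0])


def timed(fn, *args, **kwargs):
    """Run fn and return (result, report) with timing and GPU state."""
    before = gpu_snapshot()
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    report = {
        # Assumption: the worker exposes its ID in this environment variable.
        "worker": os.environ.get("RUNPOD_POD_ID", "unknown"),
        "seconds": round(elapsed, 2),
        "gpu_before": before,
        "gpu_after": gpu_snapshot(),
    }
    return result, report
```

Wrapping the inference call as `result, report = timed(run_inference, job_input)` (where `run_inference` stands in for the actual LatentSync call) and printing `report` would make it possible to compare slow and fast workers side by side. If the slow workers show lower SM clocks or a non-zero throttle mask, that would point to the hardware or host rather than the container.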
Would really appreciate any insights or help from the community or the RunPod team!
Thanks in advance!