Issues in SE region causing a large number of jobs to be retried

The issues in the screenshot are causing 10% of my jobs to be retried in the SE region. Please fix this; it's not happening in the CA region.
digigoblin (OP), 2y ago
Obviously I am referring to the "Connection timeout" errors, which cause the job results to fail to be returned, and not the single exception among them.
Madiator2011, 2y ago
@digigoblin DO YOU MIND SUBMITTING A TICKET ON THE WEBSITE? IT'S EASIER TO ESCALATE
digigoblin (OP), 2y ago
No need to shout but sure 😁
Madiator2011, 2y ago
Oops, sorry for the caps
digigoblin (OP), 2y ago
Ticket number is 4208
Madiator2011, 2y ago
done
digigoblin (OP), 2y ago
Thank you
digigoblin (OP), 2y ago
You probably didn't try and send 1000 jobs today
digigoblin (OP), 2y ago
I said 10% are retried, NOT ALL 🤦‍♂️
digigoblin (OP), 2y ago
They are retried; they don't fail
digigoblin (OP), 2y ago
RunPod needs to check it out. I switched to CA in the meantime and it works fine without any issues.
digigoblin (OP), 2y ago
I was using CA but then switched to SE because my jobs were failing. It turned out that was because my own Redis server was running out of memory (OOM), so it wasn't a RunPod issue. I upgraded my ElastiCache instance on AWS from cache.t3.medium to cache.m4.large and now it's fine.
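For anyone seeing a similar symptom, here is a minimal sketch of checking whether a Redis node is close to its memory limit before blaming the region. It assumes the redis-py client is installed; the `memory_pressure` and `check_node` helpers, the host argument, and the 0.8 warning threshold are hypothetical and not taken from this thread.

```python
def memory_pressure(info, warn_ratio=0.8):
    """Summarize the dict returned by Redis INFO memory.

    warn_ratio is an arbitrary example threshold, not a recommended value.
    """
    used = int(info.get("used_memory", 0))
    limit = int(info.get("maxmemory", 0))  # 0 means no explicit limit is configured
    if limit <= 0:
        return {"used_bytes": used, "limit_bytes": None, "ratio": None, "warn": False}
    ratio = used / limit
    return {"used_bytes": used, "limit_bytes": limit, "ratio": ratio, "warn": ratio >= warn_ratio}


def check_node(host, port=6379):
    """Query one Redis node and report its memory pressure (host is a placeholder)."""
    import redis  # redis-py, imported lazily so memory_pressure works without it

    client = redis.Redis(host=host, port=port, socket_timeout=5)
    return memory_pressure(client.info("memory"))
```

Calling `check_node` against each node of the cache would show whether `used_memory` is approaching `maxmemory`, which is one way to tell a cache-side problem apart from a provider-side one.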
digigoblin (OP), 2y ago
Because it's a cluster, not a single instance