Runpod · 3mo ago
Hleb J

Is US-KS-2 dead?

For about 3 days in a row, pods on US-KS-2 have been working awfully. I'm running PyTorch 2.4 and 2.8, and it doesn't matter which. It takes around 10 minutes to load Jupyter through port 8888.
19 Replies
Dj · 3mo ago
This datacenter looks fine. Can you tell me about the issue you're having, or share a Pod ID so I can take a look?
Pseudoface · 3mo ago
I'm having the same issue, several days in a row. The pod works, but it's extremely slow.
Dj · 3mo ago
I cannot help without a Pod or worker ID in this datacenter. All I can say otherwise is that this datacenter has no known incidents and looks fine in the combined metrics.
Hleb J (OP) · 3mo ago
8hceyd0mo3l78z, for example, or bwwak89cnrnba5. Just try to start both.
Dj · 3mo ago
That helps, one sec.
This template in this datacenter works for me. Jupyter will only start if an environment variable called JUPYTER_PASSWORD is defined when the Pod starts. I will note that I had to turn off my adblocker after going to the Jupyter URL.
@Hleb J I can't see whether you have env variables defined, for privacy reasons, but that's the only thing I can think of.
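For anyone who wants to rule this out from inside a pod, here is a minimal sketch of the check described above. It only looks at the environment of the current process; the helper name `jupyter_password_is_set` is hypothetical, and treating an empty or whitespace-only value as "not defined" is an assumption, not something stated in the reply.

```python
import os


def jupyter_password_is_set(env=None):
    """Return True if JUPYTER_PASSWORD is present and non-empty.

    `env` defaults to the current process environment. Treating an
    empty or whitespace-only value as missing is an assumption made
    for this sketch.
    """
    env = os.environ if env is None else env
    return bool(env.get("JUPYTER_PASSWORD", "").strip())


# Report the state of the current environment without printing the value.
print("JUPYTER_PASSWORD set:", jupyter_password_is_set())
```

Run it from the pod's web terminal. If it reports `False`, the variable was probably not defined when the pod started.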
Hleb J (OP) · 3mo ago
The variable is defined. The problem is not just about Jupyter; it's the speed of loading data in the datacenter. Starting 8hce… took 13 minutes to load the template. A pod on EUR-IS-1 with the same template works well: quick start, all folders accessible. But US-KS-2 in the other tab says “the loading screen taking a long time…”. With the web terminal the folders are accessible, but everything is slow. For example, starting the Comfy server on EUR-IS-1 takes less than 10 minutes, and 5 minutes later all models are loaded and everything works perfectly. On US-KS-2, Comfy with the same nodes takes nearly 40 minutes just to start, and after an hour of waiting for the models to load I terminated the pod. Sometimes it works normally, but then the problem returns. This is definitely not a problem with the adblocker or my internet; it's an internal problem of US-KS-2.
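One way to put a number on a comparison like this is to time a read of the same large file on each pod. The sketch below is only an illustration: `read_throughput` is a hypothetical helper, the file path in the usage comment is a placeholder, and a second read of the same file may be served from the operating system's cache and look much faster than the storage really is.

```python
import time


def read_throughput(path, chunk_size=1024 * 1024):
    """Read the file at `path` once, in chunks.

    Returns (bytes_read, seconds_elapsed, mebibytes_per_second).
    """
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero elapsed time on very small files.
    rate = (total / (1024 * 1024)) / elapsed if elapsed > 0 else float("inf")
    return total, elapsed, rate


# Usage (placeholder path; point it at a large model file on the volume):
# total, seconds, rate = read_throughput("/path/to/some/large/model/file")
# print(f"{total} bytes in {seconds:.1f} s = {rate:.1f} MiB/s")
```

Running it against the same file in two datacenters gives support a concrete figure to go with the Pod ID.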
Tenofas · 3mo ago
Yes, same here... it looks like it's only a US-KS-2 problem. I am connecting from Italy...
redparis · 3mo ago
Spinning up one right now just to test: qwhsqs5v5whcda
And according to the logs it's just hanging.
Dj · 3mo ago
We've identified the problem. It's not something I would've been able to discover on my own, so thank you all for reporting issues :fbslightsmile:
I'll be back when I have details from our Site Reliability Team.
redparis · 3mo ago
Thanks...I'm going to shut down qwhsqs5v5whcda then
Yldcherry · 3mo ago
Same here, but also on EU-SE-1 this morning. Locks up solid
Dj · 3mo ago
US-KS-2 should be good to go.
@redparis, @Tenofas, @Hleb J The datacenter made a change to their network configuration which caused the problems you saw. We've rolled it back.
Tenofas · 3mo ago
I am testing it right now... it does not look fixed. It is still slow, very slow.... Do we have to do something on our side? Clear the cache or something else?
Dj · 3mo ago
I don't have the exact timeline, but I am aware that we are currently working with the datacenter on a repair.
minjunes · 3mo ago
Same here, not US-KS-2 has frequent spikes (every 10-15 minutes) where network stalls and is unusable. Tested on A100s and H100s
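Periodic stalls like these are easier to report with timestamps attached. Below is a rough sketch, under stated assumptions, of how one could log TCP connect times from a pod and pick out when stalls begin. The function names, the 2-second threshold, and any host or port you probe are hypothetical choices; only `socket.create_connection` and `time.monotonic` come from the Python standard library.

```python
import socket
import time


def connect_latency(host, port, timeout=5.0):
    """Return the TCP connect time in seconds, or None on failure or timeout."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers timeouts, refused connections, and DNS failures.
        return None


def find_stalls(samples, threshold=2.0):
    """Return the timestamps at which a stall begins.

    `samples` is a list of (timestamp_seconds, latency_or_None) pairs in
    time order. A sample counts as stalled if the probe failed (None) or
    took longer than `threshold` seconds (an arbitrary cutoff).
    """
    stalls = []
    in_stall = False
    for ts, latency in samples:
        stalled = latency is None or latency > threshold
        if stalled and not in_stall:
            stalls.append(ts)
        in_stall = stalled
    return stalls


def stall_intervals(stall_starts):
    """Return the gaps in seconds between consecutive stall starts."""
    return [b - a for a, b in zip(stall_starts, stall_starts[1:])]


# Usage idea: call connect_latency(...) once a minute against a host you
# control, append (time.time(), latency) to a list, then run find_stalls.
```

If the gaps returned by `stall_intervals` cluster around 600 to 900 seconds, that would match the "every 10-15 minutes" pattern and is worth including in a report alongside the Pod ID.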
redparis · 3mo ago
Going to have to delete my network storage there...this is just kind of ridiculous that it can go on for this long without a fix. Wasted money
minjunes · 3mo ago
Same
This is actually the second time for me. I moved once from another region because the network was so shit.
Tenofas · 3mo ago
In the end I moved to a different datacenter a few days ago... it was fine on the new datacenter until yesterday. Today even EUR-IS-1 is slowing down... 😕
Aron · 3mo ago
Don't move datacenters.. it happens sometimes, but it will recover.
