Runpod•4mo ago
crystal

Slow startup times

Has anyone experienced really variable startup times? Loading ComfyUI today took 45+ minutes when it usually takes 1-2 minutes. Also, JupyterLab in general has been laggy and unresponsive. Yesterday it was working just fine, so I'm not sure if it's just me.
187 Replies
crystal
crystalOP•4mo ago
EU-RO-1
siytek
siytek•4mo ago
@crystal not just you, I am also experiencing exactly the same issues with EU-RO-1; I just joined here to see if anyone else was having trouble. Jupyter is very laggy and keeps 'sticking', echoing back some seconds later. Everything seems very slow or unusable when usually it's very responsive. I also noticed the pod's local storage get to 107% at one stage, despite the fact that I don't do anything outside of /workspace. As usual I am running runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04 with network storage; I never usually have any issues with it.
Poddy
Poddy•4mo ago
@crystal
Escalated To Zendesk
The thread has been escalated to Zendesk!
skyrrr
skyrrr•4mo ago
same here
baldo
baldo•4mo ago
Same, EU-RO-1 as well. Disk I/O seems to be extremely slow, I think? python3 -m venv venv took about a minute to finish and libraries are taking forever to load:
Testing torch import...
torch imported successfully in 70.69s
torch version: 2.7.1+cu128
Testing transformers import...
transformers imported successfully in 99.36s
transformers version: 4.52.4
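For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of an import timer. The script behind the output above was not posted, so the function name `time_import` and the structure are assumptions, not the original code.

```python
import importlib
import time


def time_import(module_name: str) -> float:
    """Import a module by name and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Standard-library modules keep the sketch runnable anywhere; swap in
    # "torch" and "transformers" to mirror the test described in the thread.
    for name in ("json", "decimal"):
        print(f"{name} imported in {time_import(name):.2f}s")
```

Only the first import in a fresh process touches the disk; repeated imports are served from `sys.modules`, so run it once per new interpreter when comparing a network volume against container storage.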
octimot
octimot•4mo ago
Same for us: we are running multiple pods with network storage on EU-RO-1 and they are extremely slow. ComfyUI takes 10 to 30 minutes to start, we constantly get Cloudflare timeouts, and the instance is pretty much inaccessible. I opened a ticket yesterday (https://contact.runpod.io/hc/en-us/requests/19551) and am looking to see if I can provide more relevant info. Hope this gets solved! 🤓
VK
VK•4mo ago
Facing the same issue
octimot
octimot•4mo ago
The container volumes work at expected speeds though; it seems that it's only network volume related.
octimot
octimot•4mo ago
😅
octimot
octimot•4mo ago
happy to check for you
octimot
octimot•4mo ago
yea!!! ComfyUI is pretty much unusable, especially if you have a large install with multiple custom nodes
exotic_designs
exotic_designs•4mo ago
Far too slow on an RTX 4000 Ada in EU-RO-1; I'm unable to even start ComfyUI.
octimot
octimot•4mo ago
It seems that this is also affecting serverless workers using a different network storage on EU-RO-1. Our serverless workers are no longer starting up due to timeouts.
octimot
octimot•4mo ago
That would mean moving around 1 TB of data from different network volumes and reinstalling multiple containers
exotic_designs
exotic_designs•4mo ago
I am using a permanently mounted disk, but it gets stuck
skyrrr
skyrrr•4mo ago
How is it possible that nothing has happened? It's been almost 24 hours.
octimot
octimot•4mo ago
I don't think Runpod is aware of this since no incident has been logged: https://uptime.runpod.io/
But there's clearly an issue here.
octimot
octimot•4mo ago
I'd be happy to provide more info if needed
octimot
octimot•4mo ago
Anything that's needed by staff for debugging on our side
yhlong00000
yhlong00000•4mo ago
Is the issue still going on? Can someone provide me some pod IDs? I will check with the infra team.
exotic_designs
exotic_designs•4mo ago
@yhlong00000 here is mine: zafzxwm66rcvy8. Same issue.
octimot
octimot•4mo ago
Yes, having issues with: 4n17vfvlfgxuwj
Also, serverless worker: 1kxd62zyws4zq0
The pod 4n17vfvlfgxuwj took around 20 minutes to boot ComfyUI from /workspace/ComfyUI, but I could not connect to it after it booted. Trying to deploy another one to see how it behaves.
yhlong00000
yhlong00000•4mo ago
I've just run a speedtest on the machine; the network is good. Can you give me a screenshot of what you are doing that is slow?
octimot
octimot•4mo ago
E.g. serverless worker n85qs8g9swe0ed is currently trying to initialize ComfyUI from /workspace/ComfyUI (network storage)
skyrrr
skyrrr•4mo ago
same here, my ComfyUI backend is loading forever and not booting
octimot
octimot•4mo ago
Again, the issue is with network volumes
skyrrr
skyrrr•4mo ago
I use network storage
yhlong00000
yhlong00000•4mo ago
So the network volume speed is slow?
octimot
octimot•4mo ago
Yes!! 😄
yhlong00000
yhlong00000•4mo ago
got it, let me run some tests
exotic_designs
exotic_designs•4mo ago
On my end the network seems good, but it's not booting ComfyUI; it takes too much time to boot. Seems like the issue is in the volume!
octimot
octimot•4mo ago
I'm connected to two separate teams running services on different network volumes on EU-RO-1 and they both fail, on both normal pods and serverless workers. The speed on the temp volume on the container works well; it's just the network volumes that are getting hit, e.g. pod: enzp9x316vp4yo
octimot
octimot•4mo ago
and serverless worker 1kxd62zyws4zq0 is currently stalling
yhlong00000
yhlong00000•4mo ago
Ran a test, seems a bit slower than normal. I've pinged the infra team to take a look; it's the weekend so the response might be a bit slow. If you need a quicker workaround, you could temporarily switch to another region and copy the files over. I know it's not ideal, but it might help for now. Appreciate your patience!
octimot
octimot•4mo ago
We're actually running a live production which needs the serverless to work in the next hour. Transferring 400 GB between regions is pretty much impossible. Will try to find a different solution.
skyrrr
skyrrr•4mo ago
I feel for you man, it's bad timing.
octimot
octimot•4mo ago
Yep, thanks, but this would take around 3 days to complete. We'll re-route it to our local machines. Thanks for looking into this! Fingers crossed that someone will look into it soooooon 🤞
yhlong00000
yhlong00000•4mo ago
The network speed actually looks pretty solid, 400GB should finish transferring in about an hour.
octimot
octimot•4mo ago
Yea, but the network volume that I'm transferring from is the one affected. I get less than 100Mb/s, and it's patchy.
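The two estimates above (about an hour versus days) depend heavily on the sustained rate and on whether a figure is in megabits or megabytes per second, which the thread does not make clear. A minimal sketch of the arithmetic, with `transfer_hours` as a hypothetical helper name:

```python
def transfer_hours(size_gb: float, rate: float, unit: str = "MB/s") -> float:
    """Estimate hours needed to move size_gb (decimal GB) at a sustained rate.

    unit is "MB/s" (megabytes per second) or "Mb/s" (megabits per second).
    """
    if unit not in ("MB/s", "Mb/s"):
        raise ValueError("unit must be 'MB/s' or 'Mb/s'")
    # 8 bits per byte: a rate in megabits is one eighth of that in megabytes.
    megabytes_per_second = rate if unit == "MB/s" else rate / 8
    seconds = size_gb * 1000 / megabytes_per_second
    return seconds / 3600


if __name__ == "__main__":
    for unit in ("MB/s", "Mb/s"):
        print(f"400 GB at 100 {unit}: {transfer_hours(400, 100, unit):.1f} h")
```

Under these assumptions, 400 GB takes roughly 1.1 hours at a steady 100 MB/s and roughly 8.9 hours at 100 Mb/s. A three-day transfer would correspond to an average of only about 1.5 MB/s, so a patchy rate matters far more than the peak figure.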
yhlong00000
yhlong00000•4mo ago
yeah, I am getting about 180-190 Mib/s
octimot
octimot•4mo ago
Exactly, but it goes up and down!! 🤓 I've benchmarked with both small and large files:
dd if=/dev/zero of=/workspace/slowtest bs=1M count=1024 oflag=direct
dd if=/dev/zero of=/workspace/testfile bs=1G count=1 oflag=dsync
octimot
octimot•4mo ago
Different results: 50, 100, 180. I tried it multiple times and it goes up and down, but never more than 180-190Mb/s
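For pods where `dd` is awkward to script, a rough Python equivalent of a sequential write test might look like the sketch below. It is not the same measurement as `dd ... oflag=direct`: it writes through the page cache and then calls `os.fsync`, so treat the numbers as comparable only between runs of this same script. The function name and defaults are assumptions.

```python
import os
import tempfile
import time


def write_throughput_mb_s(directory: str, total_mb: int = 64, block_mb: int = 1) -> float:
    """Write total_mb MiB of zeros into directory and return decimal MB/s."""
    block = bytes(block_mb * 1024 * 1024)  # zero-filled buffer
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as handle:
            for _ in range(total_mb // block_mb):
                handle.write(block)
            handle.flush()
            os.fsync(handle.fileno())  # force data out before stopping the clock
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # clean up the temporary test file
    return total_mb * 1024 * 1024 / elapsed / 1e6


if __name__ == "__main__":
    # Point this at /workspace on a pod to test the network volume.
    print(f"{write_throughput_mb_s(tempfile.gettempdir()):.0f} MB/s")
```

Running it several times in a row, as described above, shows the spread rather than a single figure.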
pavelhamburh
pavelhamburh•4mo ago
This is what I see until I get timed out while trying to run Comfy with a Network Volume. A Romanian 5090. Will this be resolved?
octimot
octimot•4mo ago
Same here. ComfyUI takes super long to load, and then if it loads, you cannot connect to it.
R1ckyH
R1ckyH•4mo ago
Can you try to SSH in to the network volume? I am stuck and it is very laggy when using it.
Vincent
Vincent•4mo ago
same here, dismal load times
R1ckyH
R1ckyH•4mo ago
Hi, can you help me do a test? I want to confirm whether our problem is the same or not. Are you also using a network volume?
Machado
Machado•4mo ago
same problem here
Vincent
Vincent•4mo ago
yep, using a network volume
R1ckyH
R1ckyH•4mo ago
Hi, still here? Two things you can help me with:
1. Can you SSH into the pod, cd /workspace, and try running some Linux commands to see if it lags or not?
2. Can you make a folder with many files in root, then copy them to /workspace with cp -rv, and see whether the copy lags after some files, goes back to normal, then lags again?
If our issue is the same, then I think we need to tag the admin to notice this issue.
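The second test above can be made measurable by timing each file copy instead of watching `cp -rv` output. A minimal sketch, assuming a flat folder of files; `copy_with_latency` and the one-second stall threshold are hypothetical choices, not something anyone in the thread ran:

```python
import os
import shutil
import time


def copy_with_latency(src_dir: str, dst_dir: str, stall_seconds: float = 1.0):
    """Copy regular files from src_dir to dst_dir, timing each copy.

    Returns (latencies, stalled): a dict of seconds per file name and the
    list of file names whose copy took longer than stall_seconds.
    """
    os.makedirs(dst_dir, exist_ok=True)
    latencies = {}
    for name in sorted(os.listdir(src_dir)):
        src = os.path.join(src_dir, name)
        if not os.path.isfile(src):
            continue  # the sketch does not recurse into subfolders
        start = time.perf_counter()
        shutil.copy2(src, os.path.join(dst_dir, name))
        latencies[name] = time.perf_counter() - start
    stalled = [name for name, secs in latencies.items() if secs > stall_seconds]
    return latencies, stalled
```

Calling it with a source folder in the container's root filesystem and a destination under /workspace would show whether stalls come in bursts (a few slow files, then normal, then slow again) or whether every copy is uniformly slow.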
octimot
octimot•4mo ago
We have been tagging everyone and opening tickets since yesterday, but no real intervention yet
haris
haris•4mo ago
raised this internally, we'll be looking into it, no eta on a fix
octimot
octimot•4mo ago
We had to move our infrastructure to a different provider since EU-RO-1 network volumes do not work properly. Cool! I wish this was reported as an incident so we could track it correctly from our side, and stop losing cash on booting up pods and serverless workers.
octimot
octimot•4mo ago
Luckily we noticed these workers running and shut them down manually... i.e. the requests were triggering workers that were timing out
KINGLIFER
KINGLIFER•4mo ago
How? I hope it isn't some long-ass thing we have to do. It should be a one-click transfer, but I can tell with this brand it won't be. Horrible service. Horrible. Do we have to pay to have pods up to do this transfer? What did you end up with?
octimot
octimot•4mo ago
Running most operations locally and with other providers now until the problem is fixed... We tried to do a transfer to EUR-IS-1, but it would take days since the network volume simply cannot transfer fast enough. This affects multiple teams and projects on our side, unfortunately. Yes
meshki
meshki•4mo ago
:(
meshki
meshki•4mo ago
glad i'm not the only one whose work relies on this
KINGLIFER
KINGLIFER•4mo ago
There should be zero charge to switch locations. I assumed this was an established company. I no longer feel comfortable referring this service. Why would you charge someone to change locations, especially when one does not have the GPU or network?!?!
moss
moss•4mo ago
The same issue is still occurring on EU-RO-1. Since this is incurring additional charges, we would greatly appreciate your prompt assistance.
brennen_runpod
brennen_runpod•4mo ago
Hey all - apologies for the delay, the team was able to track down the congestion on EU-RO-1's storage cluster and resolved it at 00:37 UTC. We've been monitoring for the past hour - at this time, performance should be restored to normal levels.
VK
VK•4mo ago
It's still not working @brennen_runpod
VK
VK•4mo ago
Page not found
octimot
octimot•4mo ago
@brennen_runpod It still doesn't work properly on our side
root@90eb9f3820be:/# dd if=/dev/zero of=/workspace/slowtest bs=1M count=1024 oflag=direct
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 7.3823 s, 145 MB/s
This is the average speed I get. ComfyUI still takes a long time to load all modules and I cannot connect to the interface at all... Is the team still on this rn? As a note, this particular network volume is 540GB; another, considerably smaller volume that belongs to another team I'm part of works.
Ben Lau
Ben Lau•4mo ago
My result on a 1GB write is much faster, but it is still slow to start ComfyUI. Moreover, it throws 524 errors randomly.
octimot
octimot•4mo ago
@Ben Lau ComfyUI imports many small files (most of them only a few KB), so I feel that it's more relevant to test with a smaller block size. This was not a problem a few days ago BTW.
Ben Lau
Ben Lau•4mo ago
Let's try for small files
mkdir -p /workspace/disks
$ time for ((i = 1; i <= 1000; i++)); do dd if=/dev/random of="/workspace/disks/file$i.data" bs=1k count=1 2>/dev/null; done
real 0m48.135s
user 0m0.367s
sys 0m0.670s
time cat /workspace/disks/* >/dev/null 2>&1

real 0m19.392s
user 0m0.000s
sys 0m0.040s
I encountered a similar problem starting on 26th June, but it became more severe yesterday.
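The shell loop above works out to about 48 ms per 1 KB file written (48.135 s for 1000 files). A rough Python version of the same small-file test is sketched below for anyone who wants numbers without the cost of spawning `dd` once per file; the function name and defaults are assumptions, and the read pass may be served from the page cache rather than the volume.

```python
import os
import tempfile
import time


def small_file_benchmark(directory: str, count: int = 1000, size: int = 1024):
    """Write, then read back, `count` files of `size` random bytes each.

    Returns (write_seconds, read_seconds). The files live in a temporary
    subfolder of `directory` that is removed afterwards.
    """
    with tempfile.TemporaryDirectory(dir=directory) as workdir:
        paths = [os.path.join(workdir, f"file{i}.data") for i in range(1, count + 1)]

        start = time.perf_counter()
        for path in paths:
            with open(path, "wb") as handle:
                handle.write(os.urandom(size))
        write_seconds = time.perf_counter() - start

        start = time.perf_counter()
        for path in paths:
            with open(path, "rb") as handle:
                handle.read()
        read_seconds = time.perf_counter() - start
    return write_seconds, read_seconds


if __name__ == "__main__":
    # Point this at /workspace on a pod to test the network volume.
    w, r = small_file_benchmark(tempfile.gettempdir())
    print(f"write: {w:.3f}s  read: {r:.3f}s")
```

Comparing the result for /workspace against a folder on the container disk separates per-file latency on the network volume from general pod slowness.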
skyrrr
skyrrr•4mo ago
Same for me, it's still terribly slow
VK
VK•4mo ago
Problem is not just that, I lost my $10 balance in all this mess @brennen_runpod
octimot
octimot•4mo ago
The loading is one thing, but the overall slowness is a big issue: ComfyUI doesn't load in the browser, and if it does, models load very slowly, files cannot be uploaded or downloaded, etc.
siytek
siytek•4mo ago
Working better for me but still not entirely right, Jupyter is still a bit laggy
VK
VK•4mo ago
Hey runpod team, are you gonna fix this or not??
siytek
siytek•4mo ago
I spoke too soon, just moved from an A2000 to a 4090 and it's very slow to load. Burning through credit here with no output! The 4090 pod is giving me:
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 4.98087 s, 216 MB/s
I had about 600 MB/s on the A2000. As others have said, it's just the network storage; for root I get:
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 1.61319 s, 666 MB/s
Please let us know an ETA to get this resolved. If you are not going to sort this soon then I want to delete my network volume, as I am currently paying for 450GB that I can't use.
VieraLimon
VieraLimon•4mo ago
I can't even open JupyterLab or FluxGym on my pods х_х I lost 20 min waiting
Will
Will•4mo ago
I've been unable to use Runpod the entire weekend. My entire weekend is just wasted, not being able to get a second of work done. The first-time user experience isn't that great, I tell ya
VK
VK•4mo ago
Exactly same here, entire weekend wasted, no work done, also lost $10 worth balance in all this stupid mess
slz
slz•4mo ago
Is it working for you guys? Still very slow here...
octimot
octimot•4mo ago
Still slow for us too. It seems to work better every now and then, but it's super variable and unreliable. E.g. we've been struggling to upload a 6MB mp4 for 10 mins now.
Will
Will•4mo ago
nope
R1ckyH
R1ckyH•4mo ago
Waiting for the Runpod team to resolve the problem. Can someone who is using the service raise a support ticket through email?
ben
ben•4mo ago
Finally saw this thread. I spent all day trying to move to a new network storage 🤦🏻‍♂️ A heads up from the team would've been useful.
Madiator2011
Madiator2011•4mo ago
?
Madiator2011
Madiator2011•4mo ago
What exactly is slow: uploads or downloads? And from where, local or remote?
octimot
octimot•4mo ago
This sounds like you folks are not even bothering at this point. Sorry, but it's really frustrating to move infrastructure during the weekend to boot up our productions here. Currently spent a lot of cash to make things work and it's impossible
Madiator2011
Madiator2011•4mo ago
It's just me, as I'm OOO till tomorrow. Though even as support I can't do much, as it's the infrastructure and reliability team's responsibility.
octimot
octimot•4mo ago
Just to make it clear: the problem started Friday and it affects the EU-RO-1 network volumes. We're currently experiencing I/O speeds between 10 and 180 MB/s.
Madiator2011
Madiator2011•4mo ago
I mean EU-RO-1 is often heavily used, mostly because of CPU pods
octimot
octimot•4mo ago
So your recommendation is to...?
Madiator2011
Madiator2011•4mo ago
Usually I would say change region or submit a ticket so we can forward it to the team
octimot
octimot•4mo ago
There are multiple of us who opened multiple tickets from different teams / accounts. Please read the thread starting from above. One of your team members acknowledged the issue and then they said it was fixed.
Madiator2011
Madiator2011•4mo ago
Discord is not the main support platform though
octimot
octimot•4mo ago
See this. That is why I personally submitted a ticket Friday afternoon CET.
pavelhamburh
pavelhamburh•4mo ago
To move to another provider. Imagine being literally down for the whole weekend, of all days
octimot
octimot•4mo ago
We've booted instances locally and with other providers. The question is whether this will be taken care of or not: https://contact.runpod.io/hc/en-us/requests/19551?page=1
Madiator2011
Madiator2011•4mo ago
I do not have access to my work device right now, so I'm unable to check
octimot
octimot•4mo ago
Well, this is unproductive then, or? 😅 Sorry, but you're the only Runpod rep online now
Madiator2011
Madiator2011•4mo ago
I mean I will be checking on Monday, but my friend works on weekend tickets. I'm only tech support; issues like drive slowdowns need to go to the eng team, as I do not have high-level access.
octimot
octimot•4mo ago
Alright! Don't mean to throw blame, sorry, I know it's not your personal fault, but we need someone from Runpod to communicate and provide support even during the weekends because this is affecting our projects
pavelhamburh
pavelhamburh•4mo ago
I'm spending $400 per month on Runpod and all we get when it's down for 3 days is... nothing, actually
Madiator2011
Madiator2011•4mo ago
I mean all are valid things.
hypopo
hypopo•4mo ago
Please check other regions too when you have the time. I have had the issue with 4 pods in 2 regions since yesterday. As someone mentioned before, the pod startup time is one issue, but the biggest one I see is performance (training time has doubled!), and lastly the laggy Jupyter. Example: I'm training Flux. Before I had 4s/it; since yesterday it's 8s/it!
Madiator2011
Madiator2011•4mo ago
let me guess Fluxgym?
hypopo
hypopo•4mo ago
Just Flux dev. It was EU-RO and EU-IS, as I remember.
hypopo
hypopo•4mo ago
But cannot check as I've deleted the pods.
hypopo
hypopo•4mo ago
OneTrainer with CLI (Onetrainer CLI 1.1 is the template name)
ben
ben•4mo ago
Do you guys have any recommendations on what region to move to? I don't want to end up in another one with issues.
hypopo
hypopo•4mo ago
And that was with an RTX 5090 on Secure Cloud, but I guess the issue is general with any GPU.
octimot
octimot•4mo ago
We had the issue with A100 PCIe, A100 SXM, RTX PRO 6000, 5090, 4090 (this on serverless) so it's definitely GPU independent imo
Madiator2011
Madiator2011•4mo ago
OK, I did what I could and sent a message on internal chat.
octimot
octimot•4mo ago
Cool, thanks!
Madiator2011
Madiator2011•4mo ago
Also tried it myself and I'm seeing it too
octimot
octimot•4mo ago
I wish there was a strategy to run without the network volume, which would save a lot of headaches, but for us it would be impossible to manage the Python venv updates via image. And the small Python modules are definitely the I/O bottleneck here.
Will
Will•4mo ago
I'm kinda curious why all these big tech/AI companies, who get the most traffic on weekends when people are free, have all their staff off lol. Civitai too. The site goes to hell every Friday to Monday 🤣 all the staff are off. Makes no sense to me
Madiator2011
Madiator2011•4mo ago
I have huge hopes for S3 API
octimot
octimot•4mo ago
But isn't EU-RO-1 deployed on S3 too?
Madiator2011
Madiator2011•4mo ago
they are test region
octimot
octimot•4mo ago
😅 is this the reason why things are not working, then?
Madiator2011
Madiator2011•4mo ago
Nope, I don't think so, but the S3 API would help to move data between regions
hypopo
hypopo•4mo ago
For info, and if I remember correctly: yesterday the pod could start, slowly, but it started. But training time was double the usual, really double! It was on EU-IS. Today on EU-RO, I had to cancel the pod setup after waiting 10 minutes; it usually takes just one minute. So it seems that some regions are slower than others, but the problem is general, across all GPUs and regions. Note, if this can help: pod deployment through SSH, on-demand plan, and not using a network volume.
octimot
octimot•4mo ago
Just getting this constantly. Different pods on EU-RO-1. Every 10-15 mins.
Madiator2011
Madiator2011•4mo ago
Problem should be solved pls check
octimot
octimot•4mo ago
It still doesn't work. Now a bunch of serverless workers started to fail again, and we have to switch back to local. We get a lot of file-not-found errors, as if the network volume keeps disconnecting:
{
  "error_type": "<class 'FileNotFoundError'>",
  "error_message": "[Errno 2] No such file or directory: '/runpod-volume/ComfyUI/temp/ComfyUI_temp_pcefp_00010_.png'",
  "error_traceback": "Traceback (most recent call last):\n File \"/usr/local/lib/python3.10/dist-packages/runpod/serverless/modules/rp_job.py\", line 134, in run_job\n handler_return = handler(job)\n File \"/rp_handler.py\", line 187, in handler\n with open(image_path, 'rb') as image_file:\nFileNotFoundError: [Errno 2] No such file or directory: '/runpod-volume/ComfyUI/temp/ComfyUI_temp_pcefp_00010_.png'\n",
  "hostname": "pe2mc92j5rwzsd-644113f9",
  "worker_id": "pe2mc92j5rwzsd",
  "runpod_version": "1.6.2"
}
Really a nightmare tbh. We'll probably switch to another provider completely next week. Can't justify this to the folks that are relying on our productions. And now the serverless workers are just eating through funds like crazy.
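If the file is only late to appear on the volume rather than missing for good (the thread does not establish which), a handler could retry the read before failing the job. A minimal sketch with a hypothetical helper, `read_bytes_with_retry`; it is plain Python and not part of the runpod SDK:

```python
import time


def read_bytes_with_retry(path: str, attempts: int = 5, base_delay: float = 0.5) -> bytes:
    """Read a file, retrying on OSError with exponential backoff.

    FileNotFoundError is a subclass of OSError, so a file that is briefly
    invisible on a flaky mount gets several chances before the error is
    re-raised to the caller.
    """
    for attempt in range(attempts):
        try:
            with open(path, "rb") as handle:
                return handle.read()
        except OSError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the original error
            time.sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    raise ValueError("attempts must be at least 1")
```

In a handler like the one in the traceback, this would stand in for the bare `open(image_path, 'rb')` call. It only papers over short gaps; it does nothing about the underlying volume problem and adds billed worker time while it waits.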
Madiator2011
Madiator2011•4mo ago
Why do you use such an old version of the SDK?
onkelstony
onkelstony•4mo ago
Same issue. Spent like 2 hours trying to get something to run in EU-RO-1 - it's just not working. Looks like storage issues again...
pavelhamburh
pavelhamburh•4mo ago
Now working for me also. I am able to start Comfy in the terminal, but it's not getting loaded in the new tab; it just shows a loading animation and after some time it throws an error (RO network volume)
Madiator2011
Madiator2011•4mo ago
Tried deploying a new pod? You might want to share the Comfy logs
crystal
crystalOP•4mo ago
Still pretty slow for me on startup. Also, ComfyUI has been stuck in loading once it does start.
VK
VK•4mo ago
So is it working right now??
Ben Lau
Ben Lau•4mo ago
user@980e5214e44e:/workspace$ mkdir -p /workspace/disks
time for ((i = 1; i <= 1000; i++)); do dd if=/dev/random of="/workspace/disks/file$i.data" bs=1k count=1 2>/dev/null; done

real 0m22.966s
user 0m0.381s
sys 0m0.715s

user@980e5214e44e:/workspace$ time cat /workspace/disks/* >/dev/null 2>&1

real 0m7.208s
user 0m0.010s
sys 0m0.033s
It is much better than yesterday. @Jason @brennen_runpod @Elder Papa Madiator It works! Thank you for your effort to resolve the issue.
octimot
octimot•4mo ago
The load times seem significantly faster here too. Will report back as the teams are starting their days. Thank you!
slz
slz•4mo ago
It works here too! Thanks a lot @brennen_runpod
hypopo
hypopo•4mo ago
Same on EU RO 1, it came back to the normal figures for performance.
pavelhamburh
pavelhamburh•4mo ago
Same problem again, Comfy loads forever. Worked just 10 minutes ago
octimot
octimot•4mo ago
I had the same issue a few minutes ago. It does feel like either the issue is still there, or that it's now a different, network-related issue that wasn't observed earlier. The network volume speed seems really good now, but HTTP loading still stalls every now and then, which also leads to some components of the ComfyUI interface not loading (e.g. CSS files). Our manual fix is to see what's not loading using the browser dev tools, reload those items individually, and then refresh the main interface.
pavelhamburh
pavelhamburh•4mo ago
yeah I am not a dev so I have no idea how to do all that, I just need to generate some images man @Elder Papa Madiator
pavelhamburh
pavelhamburh•4mo ago
incognito tab
ben
ben•4mo ago
I am also suffering from VERY slow starts on EU-RO-1 still. I migrated my network storage to US-TX-3, which loads perfectly fine (except it quickly runs out of pods lol).
ben
ben•4mo ago
Yeah, I start it on the console and it takes over 10 minutes to get running. Just a few seconds on the other region.
pavelhamburh
pavelhamburh•4mo ago
How do you migrate a volume? Mine is like 300 GB
Machado
Machado•4mo ago
Also wanna know
Ben Lau
Ben Lau•4mo ago
Well, I have had this issue since day one. Somehow I have accepted it as normal behaviour... just wait a few minutes to get it loaded.
crystal
crystalOP•4mo ago
Is anyone seeing slow startup times again or is that just me? Have been trying to launch for the past few hours
crystal
crystalOP•4mo ago
It's EU-RO-1, in ComfyUI