CPU Instances on 64 / 128 vCPUs FAIL
I can deploy my app on all instances except the 64 and 128 vCPU ones. Both of these run on an AMD EPYC 9754 128-Core Processor. When the app tries to run, it gets stuck in QUEUE with the error pasted below. It then just loops between "start container" and "failed to create shim task: the file python was not found: unknown". Any ideas what is causing this and how to resolve it? There is a similar issue reported in the pods section, but I am using serverless and getting the same problem.

ERROR from instance:
error creating container: container: create: Post "http://%2Fvar%2Frun%2Fdocker.sock/v1.43/containers/03f5da1a67e9f72498f779b9923cb7927a703cc84d173fa038041e72a7caac9b/start": context deadline exceeded
14 Replies
I know RunPod's focus is on GPU instances, but these must be their most profitable CPU instances. I've not experienced their support yet 🤞
Yeah, it seems 32 vCPU/128GB is the biggest CPU instance until this issue is resolved. Too bad for my thread/RAM-heavy app. 128 vCPU/256GB would be a much better fit. It limits the payloads I can process 😦 oh well
It is a video conversion tool, ArtisanASCII. It takes a video as input and converts the frames into ASCII characters, which form an ASCII art video. It all depends on the scale factor from the original: the closer to a 1 to 1 scale factor (pixel to character), the more the RAM usage goes to the moon. With all payloads I use 100% of all available threads to make it quicker. I've only been able to test on machines with 128GB RAM and cannot finish a 1 to 1 scaled 1 minute video without running out of RAM. I was hoping to see what 256GB could do. I know with 128 cores it would have been VERY fast!
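To make the scale factor point concrete, here is a rough back-of-the-envelope sketch. It is not how ArtisanASCII actually works internally: the function name, the idea that every converted frame is held in memory at once, and the `bytes_per_cell` cost are all assumptions for illustration. It only shows that the character count grows with the square of the scale factor.

```python
def estimate_frame_buffer_bytes(width, height, fps, seconds, scale, bytes_per_cell):
    """Estimate memory needed to hold every converted frame at once.

    Hypothetical model: each source pixel maps to `scale` characters per axis,
    and each character cell costs `bytes_per_cell` bytes (an assumed figure,
    not measured from the real tool).
    """
    cols = max(1, int(width * scale))   # characters per row
    rows = max(1, int(height * scale))  # rows per frame
    frames = int(fps * seconds)         # total frames in the clip
    return cols * rows * frames * bytes_per_cell


if __name__ == "__main__":
    # Assumed input: a 1 minute 1920x1080 clip at 30 fps, 64 bytes per cell.
    for scale in (0.25, 0.5, 1.0):
        total = estimate_frame_buffer_bytes(1920, 1080, 30, 60, scale, 64)
        print(f"scale {scale}: {total / 2**30:.1f} GiB")
```

Under this model, halving the scale factor cuts the estimate to a quarter, which is why going from a reduced scale to 1 to 1 hurts so much. The real per-character cost depends on how frames are stored, so treat the numbers as an order-of-magnitude guide only.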
Could use disk instead of RAM but it would take at least 2 - 4 forevers to complete. LOL
128 vCPUs is not the same as 128 cores
For me it means the same thing: 128 threads. Well, not really; I am more after the 256GB of RAM than the 128 threads, but I would use them all... you know, if the 128 vCPU systems actually worked.
It's definitely NOT the same thing; you can have more than 128 threads with 128 cores.
Please excuse my ignorance, it seems like you have a lot more knowledge on the subject than I do. On my home server I have 20 physical cores. In my code I spin up 20 threads and top shows near 100% usage of all cores. How can I get more than that out of the hardware? I would appreciate your thoughts on this.
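For context on the cores versus threads question: with SMT (hyper-threading) enabled, a physical core usually exposes two hardware threads, and a cloud vCPU typically corresponds to one hardware thread rather than a whole core. A minimal sketch for checking what a process can actually use, rather than hard-coding a thread count, might look like this. The helper names `cpu_summary` and `pick_worker_count` are made up for illustration; `os.cpu_count()` and `os.sched_getaffinity()` are standard library calls, the latter only on Linux.

```python
import os


def cpu_summary():
    """Report logical CPUs on the machine and CPUs this process may run on."""
    logical = os.cpu_count()  # logical CPUs (hardware threads); may be None
    try:
        # Linux only: the set of CPUs this process is allowed to run on.
        usable = len(os.sched_getaffinity(0))
    except AttributeError:
        # Not available on this platform; fall back to the logical count.
        usable = logical
    return {"logical": logical, "usable": usable}


def pick_worker_count(reserve=0):
    """Choose a worker count from the usable CPUs, never less than one."""
    usable = cpu_summary()["usable"] or 1
    return max(1, usable - reserve)


if __name__ == "__main__":
    print(cpu_summary())
    print("workers:", pick_worker_count())
```

On a 20 core server with SMT enabled, the logical count would likely be 40 rather than 20. Note that the affinity set does not reflect container CPU quotas set through cgroups, so inside a container the usable figure can still be higher than what the instance is really allotted.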