Start and stop multiple pods
I have a product that lets users submit video editing requests, each requiring anywhere from 0 to 8 minutes of RTX 4090 GPU processing to complete. To handle multiple concurrent requests, I want to implement a system that starts and stops a group of GPU pods all running the same Docker image, so that spikes in demand can still be absorbed.

However, in my experience, when a pod is stopped, the GPU attached to it may no longer be available when I try to restart the pod later. That would obviously be a problem: if no GPU is available when a request needs the pod to come back up, the request cannot be processed. Is there any way to work around this issue of GPUs being unassigned, so I can turn pods on and off and make this system feasible?

I saw the serverless option, which seems like it would work for this product, but the cost does not seem feasible. Thank you!
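For context, the control loop I have in mind looks roughly like this. It is only a minimal sketch of the scaling decision; `Pod`, `scale`, and the start/stop lists are my own placeholder names, not any real provider API — the actual start/stop calls would go wherever the returned pod IDs are consumed:

```python
from dataclasses import dataclass

@dataclass
class Pod:
    pod_id: str
    running: bool = False

def desired_running(pending_requests: int, max_pods: int) -> int:
    # One pod per pending request, capped at the size of the pool.
    return min(pending_requests, max_pods)

def scale(pods: list[Pod], pending_requests: int) -> tuple[list[str], list[str]]:
    """Return (pod_ids_to_start, pod_ids_to_stop) for the current queue depth."""
    target = desired_running(pending_requests, len(pods))
    running = [p for p in pods if p.running]
    stopped = [p for p in pods if not p.running]
    to_start: list[str] = []
    to_stop: list[str] = []
    if len(running) < target:
        # Not enough workers: bring stopped pods back up.
        to_start = [p.pod_id for p in stopped[: target - len(running)]]
    elif len(running) > target:
        # Demand dropped: stop the surplus pods.
        to_stop = [p.pod_id for p in running[target:]]
    return to_start, to_stop
```

The whole plan hinges on the "bring stopped pods back up" step, which is exactly where the GPU can turn out to be unassigned.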