Hello! My team and I are moving to RunPod and preparing to scale up, and I wanted to share what we are doing in the hope that someone knows a better way to handle it.
We currently have one universal Dockerfile that prepares the folder structure and creates symlinks to the models and custom_nodes we keep on an S3 volume that the serverless instances have access to.
This S3 volume holds all the models and custom_nodes folders, and on boot each GPU worker installs the dependencies for every custom node.
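In case it helps anyone, the boot-time linking step can be sketched like this (a minimal sketch: the paths and directory names here are placeholders using temp dirs so it runs anywhere; the real script would point at the actual volume mount and ComfyUI install dir instead):

```shell
#!/usr/bin/env sh
# Stand-ins for the mounted network volume and the ComfyUI install dir.
# In production these would be the real mount point and install path.
VOLUME="$(mktemp -d)"
COMFY="$(mktemp -d)"

# Pretend the volume already holds the shared models and custom_nodes.
mkdir -p "$VOLUME/models/checkpoints" "$VOLUME/custom_nodes/example_node"

# Replace the container's local dirs with symlinks into the volume,
# so every worker sees the same files without re-downloading anything.
rm -rf "$COMFY/models" "$COMFY/custom_nodes"
ln -s "$VOLUME/models" "$COMFY/models"
ln -s "$VOLUME/custom_nodes" "$COMFY/custom_nodes"

# Install each custom node's Python deps once per cold start.
for req in "$COMFY"/custom_nodes/*/requirements.txt; do
    [ -f "$req" ] && pip install -r "$req"
done

echo "models and custom_nodes linked"
```

The upside of this layout is that the image itself stays small and generic; the downside is that every cold start still pays for the pip installs, which is why baking the dependencies into the image (and only symlinking the heavy model files) might be worth trying.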
Does all of this make sense, or is there a better way to do it?
After spending the entire week building this, I saw there is a "ComfyUI-to-API" tool, but it generates a Dockerfile that makes each container on each GPU download everything every time, which doesn't sound like a good way to reduce cost or time spent...
Another big issue:
My workflows tend to push CPU AND GPU to 100%, and I'm pretty sure the CPU being pegged at 100% is what makes the images come out with very weird-looking artifacts and an overall composition that doesn't make sense. Running the same workflow locally on an even weaker GPU works fine...