1
I see mentions of keeping a model on a Network Volume so it can be shared between all endpoints. But if my model is already inside the container image, wouldn't it already be cached in that image? Which would be faster for cold boots?
2
My workload is not consistent, so I understand FlashBoot is unlikely to help much, but is there any reason not to enable it? When I hover over the option, it says to test output quality first. What does that mean, and why?
3
What is "container disk"? My models are already inside my image and they seem to load fine, so what is its purpose? Is it additional space to be used at runtime, for example if I were downloading a model when the container starts?