Serverless Pod Disk Space Issue with Large Model (FLUX.1-schnell)

I'm trying to deploy a Hugging Face diffusion model (black-forest-labs/FLUX.1-schnell) on a serverless GPU endpoint, but I'm running into a "No space left on device" error during model download.

Even though I selected a 100GB storage volume, the logs show that only ~5GB of disk space is available to the container, and the download fails once that space runs out.
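To narrow down where the ~5GB limit comes from, a small check like the one below could be run inside the worker. It is only a sketch: `/mnt/volume` is a placeholder I made up for the volume mount point, so it needs to be replaced with wherever the platform actually mounts the 100GB volume.

```python
import shutil


def free_gib(path: str) -> float:
    """Return the free space at `path` in GiB."""
    usage = shutil.disk_usage(path)
    return usage.free / (1024 ** 3)


def report(paths):
    """Build one human-readable line per path, tolerating missing mounts."""
    lines = []
    for p in paths:
        try:
            lines.append(f"{p}: {free_gib(p):.1f} GiB free")
        except OSError:
            # The path does not exist or cannot be inspected in this container.
            lines.append(f"{p}: not available")
    return lines


if __name__ == "__main__":
    # "/mnt/volume" is a hypothetical mount point, not a documented path.
    for line in report(["/", "/tmp", "/mnt/volume"]):
        print(line)
```

If `/` reports only a few GiB free while the volume path reports close to 100, the download is probably landing on the container disk rather than on the volume.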

1. Why isn't the full 100GB volume being used?
2. How can I redirect Hugging Face cache to the mounted volume to avoid this issue?
3. Is serverless simply unsuitable for large diffusion models like FLUX.1-schnell?
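For question 2, this is roughly what I have in mind, as an untested sketch. `HF_HOME` is the environment variable Hugging Face libraries read for their cache root, and it has to be set before those libraries are imported. The helper names and the `/mnt/volume` path are my own placeholders, not anything from the platform docs.

```python
import os
from pathlib import Path


def configure_hf_cache(volume_root: str) -> str:
    """Point the Hugging Face cache at a directory on the mounted volume.

    Must be called before importing huggingface_hub, transformers, or
    diffusers, because they read HF_HOME at import time.
    """
    cache_dir = Path(volume_root) / "huggingface"
    cache_dir.mkdir(parents=True, exist_ok=True)
    os.environ["HF_HOME"] = str(cache_dir)
    return str(cache_dir)


def load_pipeline(hf_home: str):
    """Load FLUX.1-schnell, downloading into the cache on the volume."""
    # Imported here so that HF_HOME is already set when the libraries load.
    import torch
    from diffusers import FluxPipeline

    return FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-schnell",
        torch_dtype=torch.bfloat16,
        # Passed explicitly as well, in case the env var is picked up too late.
        cache_dir=str(Path(hf_home) / "hub"),
    )


# Intended usage ("/mnt/volume" is a hypothetical mount point):
#   hf_home = configure_hf_cache("/mnt/volume")
#   pipe = load_pipeline(hf_home)
```

Setting `HF_HOME` in the endpoint's environment variables instead of in code should have the same effect, assuming the platform lets me configure them.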

Any advice or workaround would be appreciated. Thanks!