Geo-Redundant network storage
Hey, I have a Flux and Wan 2.1 serverless generation setup working. Right now every base model, text encoder, and VAE gets baked into the Docker image. Rebuilds and deploys take a long time, but that's to be expected with 100 GB+ images.
I'm currently adding new functionality that lets our user base train their own LoRAs and use them right away after training. Finished LoRAs go to Azure Blob Storage and are loaded through a ComfyUI node that pulls LoRAs from a remote URL. The issue is that download times range from 10 s to 60 s depending on the worker's region. It works, but getting billed to download LoRAs on every run isn't ideal.
I have a very specific configuration I've optimized my builds for: CUDA 12.8 + H100s. Most locations where my workers run don't even offer the option to create a network volume there, which is why I'm kind of stuck now. The current implementation of Runpod network volumes limits where workers can spin up by narrowing them to one region, even though the read speeds are way better.
Any chance you could add geo-replicated network storage? I'd gladly pay a premium for fast, low-latency pulls of the LoRA models without losing location and CUDA configuration flexibility.
Nvm... I ended up writing my own custom Comfy node using aria2 or azcopy.
Both can pull the 300 MB LoRAs within 6 seconds on a cold fetch, and I added a CDN with caching on top.
Fetching a file from the CDN cache takes ~2 s:
[LoadLoraFromURL] Downloader: aria2, Time: 1.68s
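For anyone wanting to do the same, here's a minimal sketch of that kind of node. This is not the author's actual code; the function and node names (`download_lora`, `LoadLoraFromURL`) and the connection count are illustrative assumptions. It shells out to `aria2c` for a multi-connection download and falls back to a plain single-stream fetch if aria2 isn't installed, skipping the download entirely if the file is already on local disk:

```python
import os
import subprocess
import time
import urllib.request


def download_lora(url: str, dest_dir: str, filename: str, connections: int = 8) -> str:
    """Fetch a LoRA from a remote URL, preferring aria2c's segmented
    download and falling back to urllib if aria2c is unavailable."""
    os.makedirs(dest_dir, exist_ok=True)
    dest = os.path.join(dest_dir, filename)
    if os.path.exists(dest):
        # Simple local cache: a previous run on this worker already fetched it.
        return dest
    cmd = [
        "aria2c",
        f"--max-connection-per-server={connections}",
        f"--split={connections}",
        "--dir", dest_dir,
        "--out", filename,
        url,
    ]
    try:
        subprocess.run(cmd, check=True, capture_output=True)
    except (FileNotFoundError, subprocess.CalledProcessError):
        # aria2c missing or failed: fall back to a single-stream download.
        urllib.request.urlretrieve(url, dest)
    return dest


class LoadLoraFromURL:
    """Minimal ComfyUI-style node wrapper around download_lora
    (node interface follows ComfyUI's custom-node conventions)."""

    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {"url": ("STRING", {"default": ""})}}

    RETURN_TYPES = ("STRING",)
    FUNCTION = "load"
    CATEGORY = "loaders"

    def load(self, url):
        t0 = time.time()
        name = os.path.basename(url.split("?")[0])  # strip SAS/query params
        path = download_lora(url, os.path.join("models", "loras"), name)
        print(f"[LoadLoraFromURL] Downloader: aria2, Time: {time.time() - t0:.2f}s")
        return (path,)
```

Pointing the URL at a CDN endpoint in front of Blob Storage (rather than the storage account directly) is what gets the warm-fetch times down, since repeat pulls in the same region hit the edge cache.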
