H100: Replicate vs. RunPod
Hi,
When I use a Flux model on Replicate, generating 4 images takes about 30 seconds at $0.001525 per second on an H100.
On the other hand, with RunPod, generating the same 4 images takes 60 seconds and costs a bit more.
How can I achieve the same processing time and cost on RunPod as I do on Replicate?
I prefer using RunPod because I have more control over the workflow.
Thanks!
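The comparison in the question can be put in numbers. A minimal sketch, assuming the quoted $0.001525 is a per-second H100 rate on Replicate; the RunPod per-second rate below is a hypothetical placeholder, since the thread only says it is "a bit more":

```python
# Cost-per-batch comparison for one 4-image Flux batch under per-second billing.
# The Replicate rate and times come from the thread; the RunPod rate is a
# placeholder for whatever your endpoint actually bills.

def batch_cost(seconds: float, rate_per_second: float) -> float:
    """Total cost of one batch at a per-second GPU rate."""
    return seconds * rate_per_second

replicate_cost = batch_cost(30, 0.001525)   # ~$0.046 per 4-image batch
runpod_cost = batch_cost(60, 0.001900)      # hypothetical RunPod H100 rate

print(f"Replicate: ${replicate_cost:.4f} per batch")
print(f"RunPod:    ${runpod_cost:.4f} per batch")
print(f"RunPod is {runpod_cost / replicate_cost:.1f}x the cost per batch")
```

With per-second billing, closing the gap means closing both the wall-clock time and the per-second rate, which is what the rest of the thread digs into.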
45 Replies
Unknown User•4mo ago
Message Not Public
Sign In & Join Server To View
Actually, the workflow I created on RunPod is the same as on Replicate, but on RunPod it's very inconsistent: it can take anywhere from 45 to 60 seconds, with a per-second cost that's higher than Replicate's. Nothing is different; I'm just using a custom LoRA + Flux.dev. I'm going to try the L40S to see if that helps, thanks.
I'm using https://github.com/runpod-workers/worker-comfyui. The model is already in the container; only the LoRAs are in the network storage.
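For context on the setup above, a sketch of what a request to a worker-comfyui serverless endpoint looks like. The `{"input": {"workflow": ...}}` shape follows the worker-comfyui README; the endpoint ID, workflow node, and values below are illustrative placeholders, not a real deployment:

```python
import json

# Sketch of a worker-comfyui serverless request. The payload shape
# ({"input": {"workflow": ...}}) is what the worker expects; everything
# else here (endpoint ID, node IDs, prompt) is a placeholder.

ENDPOINT_ID = "your-endpoint-id"          # hypothetical
API_URL = f"https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync"

# Minimal fragment of a ComfyUI API-format workflow (normally exported
# from the ComfyUI UI via "Save (API Format)"); values are illustrative.
workflow = {
    "6": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "a photo of a red fox", "clip": ["30", 1]}},
}

payload = {"input": {"workflow": workflow}}
body = json.dumps(payload)

print(API_URL)
print(body[:60], "...")
# The actual POST (requires an API key) would be along the lines of:
#   requests.post(API_URL, json=payload,
#                 headers={"Authorization": "Bearer <RUNPOD_API_KEY>"})
```

Keeping the base model baked into the container and only the small LoRA files on the network volume, as described above, keeps the workflow JSON identical across workers.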
we plan to have public endpoints for Flux very soon, will support Flux Dev and Schnell
current times we get on 4090s are about 9 seconds per image
right now it's planned for end of this month, cost is about 50% of Replicate's
but for Flux.dev with a custom LoRA?
we don't plan to support custom LoRAs yet, that's a maybe for the future
the fastest is using H100s, which is about 4 seconds, but then cost goes up
yes, we plan to eventually have Pro, but it won't be end of this month. For now it's Flux Dev and Schnell; others will come in July most likely. July is also when we plan to do video models
if you had the choice of faster image gen for higher cost or 2x slower image gen with half the cost, which would you pick?
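One way to reason about that poll question: under per-second billing, cost per image is time multiplied by rate, so "2x slower at half the per-second rate" works out to the same per-image cost. A small check, with illustrative placeholder numbers:

```python
# Cost-per-image under the two hypothetical options in the poll above.
# Times and rates are illustrative placeholders, not quoted prices.

fast_time_s, fast_rate = 4.0, 0.001525        # faster GPU, higher rate
slow_time_s, slow_rate = 8.0, 0.001525 / 2    # 2x slower, half the rate

fast_cost = fast_time_s * fast_rate
slow_cost = slow_time_s * slow_rate

print(f"fast: ${fast_cost:.5f}/image, slow: ${slow_cost:.5f}/image")
# Per-second billing makes these equal; "half the cost" only saves money
# if it means half the *per-image* price rather than half the rate.
```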
sure thanks
@Jason which Flux models do you plan to use?
how fast is 1 image on replicate with flux dev?
from what @Simon mentioned, should be about 7-8 seconds
that's likely using go_fast, which uses Flux Schnell
if they say it, otherwise no clue. So far the best times for Flux Dev on an H100 are 3-4s; if someone can do it faster on an A100, I'll be skeptical
Schnell on an H100 can do 1s for sure, Dev is slower
just tried fal, it's 2.89 seconds at 1024x1024, 28 steps
On Replicate I train a model with ostris/flux-dev-trainer; then 1 image takes about 9 seconds, and 4 take about 30 seconds.
30 s × $0.001525/s ≈ $0.046
what I see in Replicate is
Downloaded weights in 0.64s (very fast)
Loaded LoRAs in 2.02s (also fast; I think they have a cache)
and then 100%|██████████| 28/28 [00:08<00:00, 3.40it/s]
with an H100
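The log lines above imply a simple per-request timing budget: 28 steps at 3.40 it/s is roughly 8.2 s of sampling, on top of the weight download and LoRA load. A quick check:

```python
# Per-request timing budget reconstructed from the Replicate log above.
download_s = 0.64           # "Downloaded weights in 0.64s"
lora_load_s = 2.02          # "Loaded LoRAs in 2.02s"
steps, its_per_s = 28, 3.40  # "28/28 [00:08<00:00, 3.40it/s]"

sampling_s = steps / its_per_s           # ~8.2 s, matching the [00:08] bar
total_s = download_s + lora_load_s + sampling_s

print(f"sampling: {sampling_s:.1f}s, total: {total_s:.1f}s per image")
```

Most of the time is sampling, so the download and LoRA-load caching mentioned above matters less than the sampler throughput of the GPU itself.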
yes it's very close
how big is the LoRA?
680 MB
will have to explore LoRAs in the future, is that for Flux Dev or Schnell?
Flux Dev. I'm spending $2,000 a month on Replicate; if I can get close to the same response time and price per second, I'm switching to RunPod.
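For a sense of scale, the numbers quoted earlier in the thread imply a rough monthly volume; a sketch assuming the whole spend goes to generation at the quoted H100 rate (training and other costs ignored):

```python
# Rough monthly volume implied by the numbers in this thread.
monthly_spend = 2000.0                 # dollars per month
cost_per_batch = 30 * 0.001525        # 30 s on an H100 at $0.001525/s

batches = monthly_spend / cost_per_batch
print(f"~{batches:,.0f} batches (~{batches * 4:,.0f} images) per month")
```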
are you passing LoRAs using an S3 bucket URL or some other way?
yes, Replicate stores the LoRA directly on their servers. I'm using https://replicate.com/ostris/flux-dev-lora-trainer/train and what I read is that they use "fast-booting"
Is it not slower putting models in the Docker image? I'm using network storage since I was told it's faster that way.
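One way to settle the baked-into-image vs. network-volume question empirically is to time a cold end-to-end read of the checkpoint from each location on a running worker. A minimal sketch; the paths are placeholders for wherever the checkpoint actually lives in your container:

```python
import time
from pathlib import Path

def time_full_read(path: str, chunk_mb: int = 64) -> float:
    """Read a file end-to-end and return elapsed seconds (proxy for load time)."""
    start = time.perf_counter()
    with open(path, "rb") as f:
        while f.read(chunk_mb * 1024 * 1024):
            pass
    return time.perf_counter() - start

# Hypothetical locations inside a worker-comfyui container:
candidates = [
    "/comfyui/models/checkpoints/flux1-dev.safetensors",  # baked into the image
    "/runpod-volume/models/flux1-dev.safetensors",        # network volume
]
for p in candidates:
    if Path(p).exists():
        print(p, f"{time_full_read(p):.1f}s")
```

Container-local disk is generally faster than a network volume for large sequential reads, which is why baking the base model into the image and keeping only the small LoRA files on the volume is a common split.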