Ensuring Task Routing to Warm Workers for FlashBoot VRAM Persistence

Hi team,

I’m using FlashBoot, and my understanding is that the container stays alive after a job finishes so that the model remains loaded in VRAM, which reduces cold start time.

However, after I activate worker A and it has the model loaded, the next task is often scheduled on worker B instead. This defeats the purpose of FlashBoot, because the model preloaded on worker A is never reused.

Question:
Is there any way to prioritize scheduling new tasks onto an already-active FlashBoot worker, or to force tasks onto the worker that already has the model loaded?

This is crucial for minimizing cold starts.

Thanks!

Runpod•4mo ago•

전상윤
