(Flux) Serverless inference crashes without logs.

Hi All!
I've built a FLUX inference container on Runpod Serverless.
It works (sometimes), but I get a lot of random failures and Runpod does not return the error logs.

E.g. this is the response:
'''
{
  "delayTime": 151019,
  "error": "job timed out after 1 retries",
  "executionTime": 102002,
  "id": "64de56ee-4af2-4c64-ab84-02d4a7e81593-u1",
  "retries": 1,
  "status": "FAILED",
  "workerId": "1qjtmj861f1278"
}
'''
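
To put numbers on that response, here is a small sketch that converts the timing fields to seconds. It assumes `delayTime` and `executionTime` are reported in milliseconds (my reading, not something confirmed in this thread); `summarize` and `expected_timeout_s` are hypothetical names used only for illustration.

```python
import json

# The response from above, pasted verbatim.
response = json.loads("""
{
  "delayTime": 151019,
  "error": "job timed out after 1 retries",
  "executionTime": 102002,
  "id": "64de56ee-4af2-4c64-ab84-02d4a7e81593-u1",
  "retries": 1,
  "status": "FAILED",
  "workerId": "1qjtmj861f1278"
}
""")


def summarize(resp, expected_timeout_s=3600):
    """Convert timing fields to seconds and compare against the configured timeout."""
    # Assumption: both fields are in milliseconds.
    delay_s = resp["delayTime"] / 1000
    execution_s = resp["executionTime"] / 1000
    return {
        "delay_s": delay_s,
        "execution_s": execution_s,
        "total_s": delay_s + execution_s,
        # True means the job ran for less time than the timeout I configured.
        "within_timeout": execution_s < expected_timeout_s,
    }


summary = summarize(response)
print(summary)
```

Under that assumption the job waited about 151 s and ran for about 102 s, roughly four minutes in total, which is far below the one-hour timeout.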

But no error log is reported, either in the console or in the response, explaining what made the job retry the first time.

Also, the timeout should be one hour, but I get this message after a few minutes.
I have also added a Telegram bot for logging, but no exception is captured there either. Did the machine just crash?
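
For reference, this is roughly how the exception capture could be set up, as a minimal sketch rather than my exact code. `report_failures` and the stand-in `handler` are hypothetical names, and returning a dict with an `"error"` key as the way to mark the job failed is an assumption about the platform's convention. In the real worker the wrapped function would be the one passed to `runpod.serverless.start({"handler": handler})`.

```python
import functools
import logging
import sys
import traceback

# Log to stdout so the messages end up in the container's output.
logging.basicConfig(stream=sys.stdout, level=logging.INFO, force=True)
log = logging.getLogger("flux-worker")


def report_failures(handler):
    """Wrap a job handler so every Python exception is logged with its traceback."""

    @functools.wraps(handler)
    def wrapper(job):
        job_id = job.get("id", "unknown")
        log.info("job %s started", job_id)
        try:
            result = handler(job)
        except Exception as exc:
            log.error("job %s failed:\n%s", job_id, traceback.format_exc())
            sys.stdout.flush()
            # Assumption: a returned dict with an "error" key marks the job as failed.
            return {"error": f"{type(exc).__name__}: {exc}"}
        log.info("job %s finished", job_id)
        return result

    return wrapper


@report_failures
def handler(job):
    # Hypothetical stand-in for the actual FLUX pipeline call.
    prompt = job["input"]["prompt"]
    return {"prompt": prompt}
```

A wrapper like this only sees exceptions raised inside the Python process. If the process is killed from outside (for example by the out-of-memory killer), no `except` block runs, which would be consistent with nothing reaching the logs or the Telegram bot.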

Have you experienced the same?

Runpod•16mo ago•

9 replies

deepblhe
