Huge P98 execution time in EU-RO region endpoint
We have been seeing a very high P98 execution time on one of our EU-RO region endpoints for the past few days.
It used to be below 60 seconds in general, but it has now soared above 40 minutes.
We also see no correlation between input text length and inference time, so we wanted to check whether there are any hardware or driver related issues in this region.
Endpoint id: 1wfnup871iklus
I suspect this has also drastically increased our number of running workers.

