Faster Whisper Latency is High
I tested a 10-second audio clip and got a latency of about 1 second on an RTX4090 after a cold start. The default is the base model, and on my own RTX3090 the latency is about 0.2s.
11 Replies
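"import time
import requests

# url, payload, and headers are defined earlier for the endpoint being tested
start = time.perf_counter()
response = requests.post(url, json=payload, headers=headers)
print("Time taken:", time.perf_counter() - start)"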
It's a very simple script, and there is an "executionTime" field in the response. "executionTime" is about 800ms.
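One way to see where the roughly 1 second goes is to split the measured round trip into the server-reported "executionTime" and everything else (network, queueing, encoding). This is only a minimal sketch: it assumes the response body is JSON with a top-level "executionTime" field in milliseconds, which the ~800ms figure suggests but the thread does not confirm, and `split_latency` is a hypothetical helper name, not part of any API.

```python
def split_latency(round_trip_s, body):
    """Split a measured round trip into server execution time and the rest.

    Assumes body["executionTime"] is in milliseconds (an assumption based on
    the ~800ms value reported in the response).
    """
    execution_s = body["executionTime"] / 1000.0
    return {
        "round_trip_s": round_trip_s,
        "execution_s": execution_s,
        # Whatever is left over: network, queueing, request/response encoding.
        "overhead_s": round_trip_s - execution_s,
    }


# Example with the numbers from this thread: ~1 s round trip, ~800 ms execution.
parts = split_latency(1.0, {"executionTime": 800})
print(parts)
```

With a real request, the two arguments would be the elapsed time measured around `requests.post` and `response.json()`.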
It's from my PC.
I also tested it on GCP.
The executionTime is in the response, about 800ms. I think this is also high.
800ms is pretty quick actually
I am using the default config. I think it should run as fast as the local machine.
Although it is called serverless, I am the only one using the server after the cold start, so this should be really fast.
I am using RTX3090 and RTX4090.
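To compare the endpoint with a local GPU on equal footing, it may help to time several calls and discard the first one, so a cold start or model load does not skew the number. The following is a rough sketch, not a definitive benchmark: `benchmark` and `fake_request` are hypothetical names, and `fake_request` is only a stand-in for the real `requests.post` call or a local transcription.

```python
import statistics
import time


def benchmark(fn, runs=5, warmup=1):
    """Call fn repeatedly and return the median wall time in seconds,
    ignoring the first `warmup` calls."""
    timings = []
    for i in range(warmup + runs):
        start = time.perf_counter()
        fn()
        elapsed = time.perf_counter() - start
        if i >= warmup:
            timings.append(elapsed)
    return statistics.median(timings)


def fake_request():
    # Stand-in for the real call; replace with the actual request.
    time.sleep(0.01)


median_s = benchmark(fake_request, runs=3)
print(f"median: {median_s:.3f}s")
```

The median is used rather than the mean so that a single slow request does not dominate the result.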