Faster Whisper Latency is High
Runpod • 2y ago • esho
I tested a 10-second audio clip and got a latency of about 1 second on an RTX 4090 after cold start. That is with the default base model; on my own RTX 3090, the latency is about 0.2 s.
Jason • 4/12/24, 5:37 PM
Hi there, just wondering, how did you benchmark those?
esho (OP) • 4/12/24, 6:50 PM

import time
import requests

start = time.time()
response = requests.post(url, json=payload, headers=headers)
print("Time taken:", time.time() - start)

A very simple script, and there is an "executionTime" field in the response. "executionTime" is about 800 ms.
Jason • 4/12/24, 11:23 PM
Oh, is this from your PC?

Jason • 4/12/24, 11:23 PM
Or is that in your handler code?
esho (OP) • 4/14/24, 2:58 AM
It's from my PC.
esho (OP) • 4/14/24, 2:58 AM
I also tested it on GCP.
Jason • 4/14/24, 3:13 AM
It may be the network latency plus the execution time.
esho (OP) • 4/14/24, 4:11 PM
The executionTime is in the response, about 800 ms. I think this is also high.
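The two numbers in the thread can be separated to see how much of the round trip is network/queue overhead versus GPU work, using the executionTime field the endpoint returns. A minimal sketch; split_latency is a hypothetical helper for illustration, not part of any Runpod API:

```python
def split_latency(total_ms: float, execution_ms: float) -> dict:
    """Split a measured round-trip time into the reported execution time
    and everything else (network transfer, queueing, serialization)."""
    return {
        "execution_ms": execution_ms,
        "overhead_ms": total_ms - execution_ms,
    }

# Numbers from the thread: ~1000 ms round trip, ~800 ms executionTime.
breakdown = split_latency(1000.0, 800.0)
print(breakdown)  # {'execution_ms': 800.0, 'overhead_ms': 200.0}
```

With these figures, only about 200 ms of the observed second is transport and queueing; the rest is the worker itself.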
Jason • 4/14/24, 4:15 PM
Oh.

Jason • 4/14/24, 4:15 PM
What config (inputs) do you use?
Jason • 4/14/24, 4:21 PM
It's pretty average, I think, yeah.

Jason • 4/14/24, 4:21 PM
And what GPU are you using, too?
digigoblin • 4/14/24, 4:32 PM
800 ms is pretty quick, actually.
Jason • 4/14/24, 4:34 PM
Yeah, pretty average, right?

Jason • 4/14/24, 4:35 PM
Depends on what config he's using, too.
esho (OP) • 4/14/24, 6:00 PM
I am using the default config. I think it should run as fast as my local machine.
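The local-machine comparison can be reproduced with a small benchmark against the faster-whisper library directly, which gives a baseline to hold against the serverless executionTime. A sketch, assuming a CUDA GPU is available and that "audio.wav" is a placeholder for your own 10-second clip:

```python
import time

from faster_whisper import WhisperModel

# The default base model, on GPU with fp16, as in the thread.
model = WhisperModel("base", device="cuda", compute_type="float16")

start = time.time()
segments, info = model.transcribe("audio.wav", beam_size=5)
# transcribe() returns a generator; decoding actually runs while iterating.
text = " ".join(segment.text for segment in segments)
print(f"local latency: {time.time() - start:.2f} s")
```

Note that the first call also pays model-load time; timing a second call in the same process is closer to what a warm serverless worker measures.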
esho (OP) • 4/14/24, 6:02 PM
Although it is called serverless, only I am using the server after cold start. This should be really fast.
esho (OP) • 4/14/24, 6:03 PM
I am using an RTX 3090 and an RTX 4090.
Jason • 4/15/24, 3:30 AM
Hmm, yeah, makes sense.
Jason • 4/15/24, 3:30 AM
It will be, if your requests keep coming, I think.
Jason • 4/15/24, 3:39 AM
I don't know yet, but maybe try another, longer audio; maybe it will be faster.
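The suggestion amounts to amortizing the fixed per-request overhead over more audio, which shows up as a lower real-time factor (processing time divided by audio duration). A tiny illustration; real_time_factor is a hypothetical helper, not a library function:

```python
def real_time_factor(processing_s: float, audio_s: float) -> float:
    """Processing time divided by audio duration; lower is faster."""
    return processing_s / audio_s

# ~1 s of total latency for the 10 s clip from the thread:
print(real_time_factor(1.0, 10.0))  # 0.1
# If ~0.2 s of that is fixed overhead, a 60 s clip spreads it much thinner.
```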