I've been facing an issue for the last 2 days: sometimes the RTX 4090 generates 60 tokens/second, and sometimes it drops to 20-30 tokens/second when generating the same response. I don't know what is behind this.