How to Run Text Generation Inference on Serverless?
Hello, newbie here. I want to run Text Generation Inference (TGI) by Hugging Face on serverless. I used this repo https://github.com/runpod-workers/worker-tgi, built my own Docker image according to the README, and deployed it on RunPod serverless. But when I hit my API I get this error:
{
  "delayTime": 100308,
  "error": "handler: module 'runpod.serverless.modules' has no attribute 'rp_metrics' \ntraceback: Traceback (most recent call last):\n  File \"/opt/conda/lib/python3.10/site-packages/runpod/serverless/modules/rp_job.py\", line 194, in run_job_generator\n    async for output_partial in job_output:\n  File \"/handler.py\", line 75, in handler_streaming\n    runpod.serverless.modules.rp_metrics.metrics_collector.update_stream_aggregate(\nAttributeError: module 'runpod.serverless.modules' has no attribute 'rp_metrics'\n",
  "executionTime": 376,
  "id": "d5ff5d8d-acf5-40a3-8ffb-1ee5ce48f8d3-e1",
  "status": "FAILED"
}
Can anyone help me?
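For context on the traceback: an `AttributeError` of the form `module 'x' has no attribute 'y'` usually means the handler is accessing a submodule that the installed version of the package never imports (often an SDK version mismatch between the worker image and the `runpod` package). A minimal sketch of the underlying Python behavior, using a hypothetical throwaway package (the names `pkg`/`sub` are illustrative, not part of runpod):

```python
# A package only exposes a submodule as an attribute after that
# submodule has actually been imported somewhere -- the same reason
# runpod.serverless.modules.rp_metrics can fail with AttributeError
# if the installed SDK version no longer imports (or ships) it.
import importlib
import os
import sys
import tempfile

# Build a throwaway package "pkg" with a submodule "sub" (hypothetical names).
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "pkg"))
open(os.path.join(root, "pkg", "__init__.py"), "w").close()
with open(os.path.join(root, "pkg", "sub.py"), "w") as f:
    f.write("VALUE = 42\n")
sys.path.insert(0, root)

import pkg

# Accessing pkg.sub before any import of it raises AttributeError.
err = None
try:
    pkg.sub
except AttributeError as exc:
    err = exc
print("before import:", err)

# After an explicit import, the attribute exists on the parent package.
importlib.import_module("pkg.sub")
print("after import:", pkg.sub.VALUE)
```

So if the handler in the image was built against a newer or older `runpod` SDK than the one installed in the container, the `rp_metrics` submodule may simply not exist at that path; pinning the `runpod` version the worker repo expects is a common first thing to check.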