I'm running bart-large-mnli on serverless, but the worker stats show that it's not using the GPU. Do you know what I'm doing wrong?
The image shows my current handler.py.
As the Docker base image I'm using "FROM runpod/base:0.6.2-cuda12.2.0". I also tried "runpod/pytorch:2.2.1-py3.10-cuda12.1.1-devel-ubuntu22.04", but GPU usage is still 0%.
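
In case it helps narrow this down, here is a minimal sketch of a check that could be run inside the worker to see whether PyTorch can see the GPU at all. It assumes PyTorch is installed in the image. `gpu_report` and `pipeline_device_index` are hypothetical helper names for illustration, not part of runpod or transformers. The handler itself is only in the image above, so whether it passes a `device` to the pipeline is not shown here.

```python
import importlib.util


def gpu_report():
    """Return a small dict describing whether PyTorch can see a CUDA device."""
    report = {"torch_installed": False, "cuda_available": False, "device_name": None}
    # Skip the import entirely if torch is not present in this environment.
    if importlib.util.find_spec("torch") is None:
        return report
    import torch

    report["torch_installed"] = True
    report["cuda_available"] = bool(torch.cuda.is_available())
    if report["cuda_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


def pipeline_device_index():
    """Device index in the form a transformers pipeline accepts:
    0 for the first GPU, -1 for the CPU."""
    return 0 if gpu_report()["cuda_available"] else -1


if __name__ == "__main__":
    print(gpu_report())
    print("device index:", pipeline_device_index())
```

If `cuda_available` comes back `True` but usage stays at 0%, one thing that might be worth checking is whether the model was actually placed on the GPU, for example by passing `device=pipeline_device_index()` when building the `pipeline(...)` in handler.py.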