Can we add the minimum GPU configs required for running popular models like Mistral and Mixtral?
I'm trying to find out which serverless GPU configs are required to run Mixtral 8x7B-Instruct, either the GPTQ-quantized version (https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GPTQ) or the main model from Mistral. It would be good to have this info in the README of the vLLM Worker repo.
I run into OutOfMemory errors when trying it on a 48 GB GPU.
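For context, here is a back-of-envelope sketch of why 48 GB runs out of memory. It assumes the ~46.7B total-parameter count from Mistral's Mixtral announcement (all experts stay resident in VRAM, not just the two active per token) and ignores KV cache and activation overhead, so treat it as a rough lower bound on weight memory only:

```python
# Rough weight-memory estimate for Mixtral 8x7B-Instruct.
# Assumption: ~46.7B total params (Mistral's published figure);
# 2 bytes/weight for fp16, ~0.5 bytes/weight for 4-bit GPTQ.
GIB = 2**30
total_params = 46.7e9  # all 8 experts are resident, not only the 2 routed to

fp16_weights_gib = total_params * 2 / GIB    # well over 48 GiB -> OOM on a 48 GB card
gptq4_weights_gib = total_params * 0.5 / GIB  # weights fit, leaving headroom for KV cache

print(f"fp16 weights:      {fp16_weights_gib:.1f} GiB")
print(f"GPTQ 4-bit weights: {gptq4_weights_gib:.1f} GiB")
```

So the fp16 weights alone (~87 GiB) cannot fit on 48 GB, which matches the OOM I'm seeing; the GPTQ build (~22 GiB of weights) should fit, but the remaining headroom for the KV cache would depend on `max_model_len` and `gpu_memory_utilization` settings.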