RunPod · 5mo ago
Mike

Help deploying LLaVA Flask API

I'm trying to create a LLaVA endpoint I can use in my project so I can assess 5 million photos with a Node script, similar to what I'm doing locally at the moment with Ollama. I'm looking to deploy the 7B model on an RTX 4000 in GPU Cloud rather than Serverless to keep costs down. Speed matters as well as cost, so ideally I'd like to process multiple images at once; any advice is welcome. After speaking to the author of the LLaVA RunPod template, he recommended the Flask API method below, but I'm not sure how I'd go about getting it deployed since I'm new to backend work. Is anybody able to help with some initial steps? https://github.com/ashleykleynhans/LLaVA/tree/main?tab=readme-ov-file#flask-api-inference #LLaVA: Large Language and Vision Assistant #⛅|gpu-cloud
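For context, this is roughly what I'm picturing on the Node side once the Flask API is running on the pod. The URL, port, endpoint path, and the request/response field names below are my guesses rather than anything confirmed against the repo's README, so they'd need adjusting:

```ts
// Hypothetical client for a LLaVA Flask API running on a RunPod GPU Cloud pod.
// Everything about the API surface here is assumed, not taken from the repo:
// the proxied URL, port 5000, the /api/predict route, and the field names
// in the request and response bodies. Adjust to match the actual README.
import { readFile } from "node:fs/promises";

const API_URL = "https://<pod-id>-5000.proxy.runpod.net/api/predict"; // placeholder
const PROMPT = "Describe this photo in one sentence.";
const CONCURRENCY = 4; // keep a few requests in flight; tune against the RTX 4000

// Send one image to the API and return the model's text response.
async function describeImage(path: string): Promise<string> {
  const image = (await readFile(path)).toString("base64");
  const res = await fetch(API_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt: PROMPT, image }),
  });
  if (!res.ok) throw new Error(`${path}: HTTP ${res.status}`);
  const data = (await res.json()) as { response?: string }; // field name is a guess
  return data.response ?? JSON.stringify(data);
}

// Walk a list of photo paths with a fixed number of concurrent requests,
// so the pod isn't flooded with 5 million requests at once.
async function run(paths: string[]): Promise<void> {
  let next = 0;
  const workers = Array.from({ length: CONCURRENCY }, async () => {
    while (next < paths.length) {
      const path = paths[next++];
      try {
        console.log(path, "->", await describeImage(path));
      } catch (err) {
        console.error(path, err);
      }
    }
  });
  await Promise.all(workers);
}

run(process.argv.slice(2)); // e.g. node client.js photo1.jpg photo2.jpg ...
```

The small worker pool is just so a handful of requests stay in flight without overloading a single card; the right concurrency is something I'd have to test.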
0 Replies