Runpod • 16mo ago • 6 replies
Yasmin
Llama
Hello! For those who have tried it: how much GPU memory does Llama 70B need for inference only, and for fine-tuning? How about inference of the 400B version (for knowledge distillation)? Is the quality difference worth it? Thanks!
Solution
Only for inference
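For a rough sense of scale, here is a back-of-the-envelope sketch (not from the thread; the byte-per-parameter figures are common rules of thumb, and the helper functions are illustrative). Weights-only FP16 inference needs about 2 bytes per parameter, 4-bit quantized inference about 0.5 bytes, and full fine-tuning with mixed-precision Adam roughly 16 bytes per parameter; real usage is higher once KV cache, activations, batch size, and framework overhead are counted.

```python
# Rough VRAM estimates for Llama-class models (weights only; KV cache,
# activations, and framework overhead come on top of these numbers).

def inference_vram_gb(n_params_b: float, bytes_per_param: float) -> float:
    """Weights-only memory for inference, in GB (params in billions)."""
    return n_params_b * bytes_per_param

def full_finetune_vram_gb(n_params_b: float) -> float:
    """Mixed-precision Adam fine-tuning: fp16 weights + grads, fp32
    master weights + two optimizer moments ~= 16 bytes/param."""
    return n_params_b * 16

for name, n in [("Llama 70B", 70), ("Llama 400B", 400)]:
    print(f"{name}: fp16 inference ~{inference_vram_gb(n, 2):.0f} GB, "
          f"4-bit inference ~{inference_vram_gb(n, 0.5):.0f} GB, "
          f"full fine-tune ~{full_finetune_vram_gb(n):.0f} GB")
```

Under these assumptions, 70B FP16 inference (~140 GB) already needs two 80 GB GPUs, while full fine-tuning (~1.1 TB) needs a multi-node setup; parameter-efficient methods like LoRA on a quantized base reduce the fine-tuning footprint substantially.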