I'm trying to deploy llama3-70b-8192 as a serverless endpoint, but the deployment fails with an out-of-memory error:

[rank0]: torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 896.00 MiB. GPU
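For context on why this happens: a 70B-parameter model at fp16/bf16 precision needs roughly 130 GiB just for the weights, before counting the KV cache and activations, which is far more than any single common GPU provides. A rough estimate (the function name and precision choices below are illustrative, not from any specific framework):

```python
# Back-of-the-envelope VRAM estimate for holding model weights only.
# Real serving needs extra headroom for the KV cache and activations.

def estimate_weight_vram_gib(num_params_billion: float, bytes_per_param: float) -> float:
    """Approximate GiB required just to store the model weights."""
    return num_params_billion * 1e9 * bytes_per_param / (1024 ** 3)

for label, bpp in [("fp16/bf16", 2), ("int8", 1), ("int4", 0.5)]:
    gib = estimate_weight_vram_gib(70, bpp)
    print(f"70B weights at {label}: ~{gib:.0f} GiB")
```

So to fit the model you generally need either multiple GPUs (tensor parallelism) or a quantized variant (int8/int4), or both; a single 80 GiB GPU cannot hold the fp16 weights alone.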