Does anyone know why Cloudflare Workers AI Llama 3.1 is 3x slower than local Llama 3.1 running on an RTX 3080? Is there any way to speed this up? 30-40 seconds for text generation is insane. I get that it's free credits, but damn, that's kinda slow.

