Interesting - it doesn't seem to be listed on the model page, but the limit for LLaMA 2 is 2048 tokens, so perhaps they share similar limits? https://developers.cloudflare.com/workers-ai/models/llama-2-7b-chat-int8/#properties
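
If that 2048 figure is right, one way to avoid silently hitting the limit would be to estimate the token count before sending the prompt. A minimal TypeScript sketch, assuming the rough ~4-characters-per-token heuristic; `estimateTokens`, `truncateToTokenBudget`, and `MAX_INPUT_TOKENS` are illustrative names, not part of the Workers AI API:

```ts
// Rough guard against a presumed 2048-token context limit.
// The ~4 characters-per-token ratio is only a heuristic; an exact count
// would need the model's actual tokenizer.
const MAX_INPUT_TOKENS = 2048; // assumption based on the comment above, not a documented value
const CHARS_PER_TOKEN = 4;     // rough average for English text with LLaMA-style tokenizers

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

function truncateToTokenBudget(text: string, maxTokens: number = MAX_INPUT_TOKENS): string {
  const maxChars = maxTokens * CHARS_PER_TOKEN;
  return text.length <= maxChars ? text : text.slice(0, maxChars);
}

// Example usage before calling the model:
const prompt = "Summarise the following document: ...";
if (estimateTokens(prompt) > MAX_INPUT_TOKENS) {
  console.warn("Prompt likely exceeds the model's context window; truncating.");
}
const safePrompt = truncateToTokenBudget(prompt);
```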




