AI Gateway with stream: true does not deliver the response in real time

When I test AI Gateway with "stream": true, the response is not streamed in real time. Even with the Accept: text/event-stream header set, the gateway appears to buffer the response and return it all at once, only after the upstream service has finished processing, instead of delivering it incrementally.

Here’s the sample code I used for testing:
curl -N -X POST https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/compat/chat/completions \
  --header 'Authorization: Bearer {GOOGLE_GENERATIVE_AI_API_KEY}' \
  --header 'Content-Type: application/json' \
  --header 'Accept: text/event-stream' \
  --header 'Cache-Control: no-cache' \
  --data '{
    "model": "google-ai-studio/gemini-2.0-flash",
    "messages": [
      {
        "role": "user",
        "content": "What is Cloudflare?"
      }
    ],
    "stream": true,
    "temperature": 0.7,
    "max_tokens": 1000
  }' | while IFS= read -r line; do
    echo "[$(date '+%H:%M:%S.%3N')] $line"
  done
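
To put a number on the buffering rather than eyeballing timestamps, a small helper along these lines could be used. This is only a sketch: `stream_spread` is a hypothetical name of my own, not part of curl or AI Gateway. It reads lines from stdin and prints how many whole seconds passed between the first and the last line arriving. It relies on `date +%s`, so gaps shorter than a second are not visible.

```shell
#!/bin/sh
# Hypothetical helper (not part of curl or AI Gateway): prints the number of
# whole seconds between the arrival of the first and the last line on stdin.
stream_spread() {
  first=""
  last=""
  while IFS= read -r _line; do
    now=$(date +%s)          # arrival time of this line, in epoch seconds
    if [ -z "$first" ]; then
      first=$now             # remember when the first line arrived
    fi
    last=$now                # keep updating until the stream ends
  done
  if [ -z "$first" ]; then
    echo "no data"           # nothing was read from stdin
    return 1
  fi
  echo $(( last - first ))
}
```

Piping the curl command above into `stream_spread` in place of the `while` loop should give a rough signal: if a response that takes several seconds to generate reports 0, all chunks arrived together, which is consistent with buffering somewhere between the upstream and the client. A value close to the total generation time would suggest the chunks were delivered incrementally.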


Could you confirm whether AI Gateway currently supports real-time streaming passthrough, or whether buffering the response until the upstream completes is the expected behavior?