That is because I believe the blog was pulled, so you were seeing a cached version that is no longer

Replying to Isaac McFadyen: That is because I believe the blog was pulled, so you were seeing a cached versi...

Comrade Nalaxone•9/9/25, 11:57 PM

Makes sense. Thanks for updating me. What's the status on FedRAMP + Cloudflare AI?

Comrade Nalaxone•9/9/25, 11:57 PM

Is it still something that's happening?

dave•9/11/25, 2:45 AM

FYI there is a bug in cost calculation: the cost is set to zero if input_tokens is zero, even when the cache and output token counts are nonzero

dave•9/11/25, 2:45 AM

Legit example from Anthropic:

"usage": {
    "input_tokens": 0,
    "cache_creation_input_tokens": 575,
    "cache_read_input_tokens": 106910,
    "cache_creation": {
      "ephemeral_5m_input_tokens": 575,
      "ephemeral_1h_input_tokens": 0
    },
    "output_tokens": 957,
    "service_tier": "standard"
  }
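
For reference, a minimal sketch of a cost calculation that does not zero out on the usage payload above. The function name and the per-million-token rates are hypothetical placeholders, not real pricing and not AI Gateway's actual code. The only point is that each usage field is priced on its own, so input_tokens being 0 does not short-circuit the total.

```python
from decimal import Decimal

# Hypothetical per-million-token rates in USD (placeholders, not real pricing).
RATES_PER_MILLION = {
    "input_tokens": Decimal("1.00"),
    "cache_creation_input_tokens": Decimal("1.25"),
    "cache_read_input_tokens": Decimal("0.10"),
    "output_tokens": Decimal("5.00"),
}


def estimate_cost(usage: dict) -> Decimal:
    """Sum the cost of every billable token field; missing fields count as 0."""
    total = Decimal("0")
    for field, rate in RATES_PER_MILLION.items():
        tokens = usage.get(field) or 0
        total += Decimal(tokens) * rate / Decimal(1_000_000)
    return total


# Token counts copied from the usage example above.
usage = {
    "input_tokens": 0,
    "cache_creation_input_tokens": 575,
    "cache_read_input_tokens": 106910,
    "output_tokens": 957,
}
cost = estimate_cost(usage)
print(cost)
```

With any nonzero rates, the result here is greater than zero even though input_tokens is 0.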

dave•9/11/25, 12:25 PM

It would be nice if the binding allowed us to use provider endpoints instead of just the universal one, since otherwise we have to manually provide an auth token for AI Gateway.

https://developers.cloudflare.com/ai-gateway/configuration/authentication/#expected-behavior

Replying to joe: am I misunderstanding how metadata filtering is supposed to work? if I attach sa...

joe•9/11/25, 5:03 PM

just echoing this? it completely negates the point of being able to filter by metadata keys

dave•9/11/25, 6:39 PM

How do I force https://blog.cloudflare.com/ai-side-channel-attack-mitigated/ to be off?

sakty•9/11/25, 7:46 PM

what's the requests-per-day limit for this model: https://developers.cloudflare.com/workers-ai/models/llama-3.2-11b-vision-instruct/

Cloudflare Docs

llama-3.2-11b-vision-instruct

The Llama 3.2-Vision instruction-tuned models are optimized for visual recognition, image reasoning, captioning, and answering general questions about an image.

sakty•9/11/25, 7:47 PM

for free user

Replying to SmallShen: Is this intended? streaming requests input output does not get counted? Even th...

SmallShen•9/11/25, 11:20 PM

anyone have an idea on this one?

SmallShen•9/12/25, 2:41 AM

Does the AI Gateway compat endpoint support the Responses API?

Timo•9/12/25, 12:17 PM

Hi, has anyone experienced degraded accuracy when using gemini-2.5-pro through the AI gateway?

Timo•9/12/25, 12:18 PM

We've investigated and for some reason tokens are lost when going through the gateway vs calling the Google API directly.

dave•9/16/25, 12:24 PM

AI Gateway is incorrectly considering a streaming refusal on the Anthropic API to be successful. https://docs.anthropic.com/en/docs/test-and-evaluate/strengthen-guardrails/handle-streaming-refusals#implementation-guide

Here's a sample response:

{
  "id": "msg_01HCDu5LRGeP2o7s2xGmxyx8",
  "type": "message",
  "role": "assistant",
  "model": "claude-opus-4-1-20250805",
  "content": "",
  "stop_reason": null,
  "stop_sequence": null,
  "usage": {
    "input_tokens": 9,
    "cache_creation_input_tokens": 16203,
    "cache_read_input_tokens": 0,
    "cache_creation": {
      "ephemeral_5m_input_tokens": 16203,
      "ephemeral_1h_input_tokens": 0
    },
    "output_tokens": 0
  },
  "streamed_data": [
    {
      "nonce": "e8071956",
      "type": "message_start",
      "message": {
        "id": "msg_01HCDu5LRGeP2o7s2xGmxyx8",
        "type": "message",
        "role": "assistant",
        "model": "claude-opus-4-1-20250805",
        "content": [],
        "stop_reason": null,
        "stop_sequence": null,
        "usage": {
          "input_tokens": 9,
          "cache_creation_input_tokens": 16203,
          "cache_read_input_tokens": 0,
          "cache_creation": {
            "ephemeral_5m_input_tokens": 16203,
            "ephemeral_1h_input_tokens": 0
          },
          "output_tokens": 0
        }
      }
    },
    {
      "type": "message_delta",
      "delta": {
        "stop_reason": "refusal",
        "stop_sequence": null
      },
      "usage": {
        "input_tokens": 9,
        "cache_creation_input_tokens": 16203,
        "cache_read_input_tokens": 0,
        "output_tokens": 0
      }
    },
    {
      "nonce": "8aeda459",
      "type": "message_stop"
    }
  ]
}

Anthropic

Streaming refusals - Anthropic
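
A minimal sketch of how a log consumer could flag this case, assuming events shaped like the streamed_data array in the sample response above. The function name is hypothetical, and this is not how AI Gateway classifies requests; it only shows that the refusal is visible in the message_delta event rather than in the top-level stop_reason, which stays null.

```python
def stream_was_refused(streamed_data: list) -> bool:
    """Return True if any message_delta event reports stop_reason == "refusal"."""
    for event in streamed_data:
        if event.get("type") != "message_delta":
            continue
        delta = event.get("delta") or {}
        if delta.get("stop_reason") == "refusal":
            return True
    return False


# Events trimmed down from the sample response above.
events = [
    {"type": "message_start", "message": {"stop_reason": None, "content": []}},
    {"type": "message_delta", "delta": {"stop_reason": "refusal", "stop_sequence": None}},
    {"type": "message_stop"},
]
print(stream_was_refused(events))
```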

dave•9/19/25, 5:18 AM

Is it not possible to delete dynamic routes?

.r20•9/19/25, 7:03 PM

I've been wondering the same thing too, I can't find a way to do it

Beefp•9/19/25, 7:07 PM

Any plans to support OpenAI's Responses API? (and offer compatibility with other models) - would be an absolute game changer compared to other AI gateways

.r20•9/19/25, 9:44 PM

it already supports responses api

.r20•9/19/25, 9:44 PM

just use the compat endpoint they give you as the base URL and you can just call client.responses like normal

.r20•9/19/25, 9:44 PM

no changes necessary

Replying to .r20: no changes necessary

SmallShen•9/20/25, 12:14 AM

I think the compat endpoint only supports the completions API?

.r20•9/20/25, 12:56 AM

I've used the Responses API with AI Gateway before, I just don't remember which endpoint I used tbh

karimulm•9/22/25, 5:18 AM

Dynamic routes menu missing pagination. No pagination controls available, making page 2+ data inaccessible.

Replying to dave: Is it not possible to delete dynamic routes?

Puliczek•9/22/25, 10:24 AM

same problem

Puliczek•9/22/25, 10:25 AM

models/gemini-2.5-flash-lite is not found for API version v1, or is not supported for generateContent. Call ListModels to see the list of available models and their supported methods

How to call gemini-2.5-flash-lite?

Puliczek•9/22/25, 11:05 AM

btw gemini-2.5-flash is working

Puliczek•9/22/25, 11:07 AM

but i need gemini-2.5-flash-lite

Puliczek•9/22/25, 11:17 AM

The gemini-2.5-flash-lite model works when I use @google/generative-ai, so problem solved. Though I think now I can't use my own cf-metadata.

dave•9/22/25, 8:56 PM

HTTP 525 issue with Anthropic models with AI Gateway it seems?

SmallShen•9/23/25, 5:24 AM

I think there is a performance issue in the AI Gateway dashboard UI: if the streaming response is large, clicking (expanding) the log lags for a sec.

SmallShen•9/23/25, 5:26 AM

There is also a UI bug in the dynamic routes dashboard: the background color is incorrect in dark theme.

.r20•9/25/25, 11:40 PM

what's the Prompts tab for in AI Gateway?

.r20•9/25/25, 11:40 PM

it's not showing anything

noway5566•9/27/25, 2:49 PM

When I test the AI Gateway with stream: true, the response is not delivered in a real-time streaming fashion. Instead, the output only appears all at once after the upstream service finishes processing. In other words, even though I set "stream": true and use Accept: text/event-stream, the gateway buffers the response and returns it all at once instead of streaming it incrementally.

Here’s the sample code I used for testing:

curl -N -X POST https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/compat/chat/completions \
  --header 'Authorization: Bearer {GOOGLE_GENERATIVE_AI_API_KEY}' \
  --header 'Content-Type: application/json' \
  --header 'Accept: text/event-stream' \
  --header 'Cache-Control: no-cache' \
  --data '{
    "model": "google-ai-studio/gemini-2.0-flash",
    "messages": [
      {
        "role": "user",
        "content": "What is Cloudflare?"
      }
    ],
    "stream": true,
    "temperature": 0.7,
    "max_tokens": 1000
  }' | while IFS= read -r line; do
    echo "[$(date '+%H:%M:%S.%3N')] $line"
  done

Could you confirm if AI Gateway currently supports real-time streaming passthrough, or if it’s expected behavior that the response is buffered and only returned once the upstream completes?

Acterion•9/27/25, 5:32 PM

New to AI Gateway
Trying to set it up with a direct connection from Open WebUI hosted on my subdomain (using a CF tunnel to my own infra) and constantly getting CORS errors when sending an OPTIONS request to the /completions endpoint. Seems like CF is either dropping the Access-Control-Allow-Origin headers and/or not working correctly with OPTIONS requests.
Has anyone had success configuring Open WebUI to work with Gateway before?

mongj•9/28/25, 7:16 AM

hey does anyone know if the AI gateway supports sending audio files in a multi-part form request for STT models like whisper?

I was testing out the groq endpoint but consistently got 400s for valid audio files. URL input seems to work correctly.

mongj•9/28/25, 7:16 AM

here's a basic request I tested with

curl -X POST https://gateway.ai.cloudflare.com/v1/<cloudflare-account-id>/<gateway-name>/groq/audio/transcriptions \
  -H "Content-Type: multipart/form-data" \
  -H "Authorization: Bearer <token>" \
  -F "model=whisper-large-v3-turbo" \
  -F "file=@/path/to/audio.mp3"

mongj•9/28/25, 7:16 AM

and the response

{"error":{"message":"could not process file - is it a valid media file?","type":"invalid_request_error"}}

mongj•9/28/25, 7:16 AM

Using a url pointing to an uploaded version of the same audio file works as intended.

CodingCoop | Co-Founder Nullshot•9/28/25, 1:04 PM

How can I get access to this - applied a few weeks ago and no response.

About to lean into OpenRouter but would love to partner:

https://blog.cloudflare.com/ai-gateway-aug-2025-refresh/

wondering if joining the startup program would help push this through?

The Cloudflare Blog

AI Gateway now gives you access to your favorite AI models, dynamic...

AI Gateway simplifies AI app development with unified billing, secure key storage, and dynamic routing. Gain observability and control over costs, API keys, and traffic, connecting to major AI providers through a single endpoint.

Leszek•9/29/25, 7:14 AM

Hey, I've created my first AI Search instance, with AI Gateway where I added the OpenAI key.
I've added my first document and I get an error while building embeddings during the sync.

I've chosen an OpenAI embedding model. The error I am getting is about a missing token... It appears AI Search is not using the token I provided in the AI Gateway.

{
  "error": {
    "message": "You didn't provide an API key. You need to provide your API key in an Authorization header using Bearer auth (i.e. Authorization: Bearer YOUR_KEY), or as the password field (with blank username) if you're accessing the API from your browser and are prompted for a username and password. You can obtain an API key from https://platform.openai.com/account/api-keys.",
    "type": "invalid_request_error",
    "param": null,
    "code": null
  }
}

Replying to Leszek: Hey, I've created my first AI Search instance, with AI Gateway where I added the...

misterPaul•9/29/25, 11:26 AM

have you set up "Authenticated Gateway" on the settings page and passed the token in the cf-aig-authorization header as a bearer token?

misterPaul•9/29/25, 11:38 AM

Go to your AI Search instance > Settings > Models. Is the Gateway field pointing to the correct AI Gateway where you stored the key?
In that specific AI Gateway, under "Provider Keys", is the OpenAI key definitely there and valid?

Those are the two obvious things to check, after that check the "Authenticated Gateway" setting I mentioned
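
A rough sketch of what passing the gateway token looks like when building request headers by hand. The function name and the token values are hypothetical placeholders. The cf-aig-authorization header is the one named above, and the provider Authorization header is only added when the provider key is not already stored in the gateway.

```python
from typing import Optional


def build_gateway_headers(gateway_token: str, provider_key: Optional[str] = None) -> dict:
    """Build headers for a request sent to an authenticated AI Gateway."""
    headers = {
        "Content-Type": "application/json",
        # Gateway token goes in its own header, as a bearer token.
        "cf-aig-authorization": f"Bearer {gateway_token}",
    }
    if provider_key is not None:
        # Only needed when the provider key is not stored in the gateway.
        headers["Authorization"] = f"Bearer {provider_key}"
    return headers


# Placeholder token, for illustration only.
headers = build_gateway_headers("CF_AIG_TOKEN_PLACEHOLDER")
print(headers)
```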

Leszek•9/29/25, 11:46 AM

> have you set up "Authenticated Gateway" on the settings page

yes

> and passed the token in the cf-aig-authorization header as a bearer token

no. I found no place for that.

I checked again right now and everything works.... So I guess it was an issue on the Cloudflare side.

Thank you for your help!

.r20•9/29/25, 5:39 PM

not sure if this is a bug, but datasets created by AI Gateway don't support metadata filters

dave•9/30/25, 3:57 PM

If I update a Secret Store value that's managed by AI Gateway, should that work?

Replying to noway5566: When I test the AI Gateway with stream: true, the response is not delivered in a...

Brandon Studio•10/1/25, 11:44 AM

Same issue here. Could anyone help?

Brandon Studio•10/1/25, 11:44 AM

I have a video recording

Replying to noway5566: When I test the AI Gateway with stream: true, the response is not delivered in a...

dave•10/1/25, 10:14 PM

This is a regression afaik, iirc @Kathy might remember we talked about it and I think there was an internal ticket?
