are you using google vertex or google ai studio? i just tested on google ai studio and seems like caching is working properly. Have you set any different cache configuration for gemini? https://developers.cloudflare.com/ai-gateway/configuration/caching/
Cloudflare Docs
Caching · Cloudflare AI Gateway docs
Override caching settings on a per-request basis.
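The caching doc linked above covers per-request overrides. As a minimal sketch (placeholder key and gateway URL; the `cf-aig-cache-ttl` / `cf-aig-skip-cache` header names come from that page), building the headers might look like:

```typescript
// Sketch: per-request cache overrides for Cloudflare AI Gateway.
// Header names follow the AI Gateway caching docs; the API key,
// account ID, and gateway name below are placeholders.
interface CacheOverrides {
  ttlSeconds?: number; // cf-aig-cache-ttl: cache this response for N seconds
  skipCache?: boolean; // cf-aig-skip-cache: bypass the cache for this request
}

function buildCacheHeaders(
  apiKey: string,
  overrides: CacheOverrides = {},
): Record<string, string> {
  const headers: Record<string, string> = {
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": "application/json",
  };
  if (overrides.ttlSeconds !== undefined) {
    headers["cf-aig-cache-ttl"] = String(overrides.ttlSeconds);
  }
  if (overrides.skipCache) {
    headers["cf-aig-skip-cache"] = "true";
  }
  return headers;
}

// Usage sketch (placeholder IDs):
// fetch("https://gateway.ai.cloudflare.com/v1/ACCOUNT_ID/my-gateway/...",
//   { method: "POST", headers: buildCacheHeaders("KEY", { ttlSeconds: 3600 }), body: "..." });
```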
Kathy
KathyOP4mo ago
thanks for pointing out. ill get that fixed
laurynas
laurynas4mo ago
thank you! Experimenting still but mostly due to speed. LLaMa on Cerebras architecture outputs at 2k tokens/sec which makes a world of difference in terms of latency in the UX
BumblebeeSquare
BumblebeeSquare4mo ago
Hi guys, is there a way to do analytics and collect token costs without enabling logs? I feel my users probably don't want me to see their inappropriate prompts in logs
tornado
tornado4mo ago
I'm using Google AI Studio with no changes to the default caching. I did more investigation and it seems that when I send requests locally, things show up on the dashboard, but inside my deployed Cloudflare Workflow only the Claude API calls show up in the dashboard.
rob
rob4mo ago
rita kozlov 🐀 (@ritakozlov_) on X
we shipped a binding you can use for calling @cloudflaredev ai gateway directly from a worker! if you're already using workers + workers ai, just add this to your existing code: gateway: { id: "my-gateway" } or send granular feedback w your logs → https://t.co/ZZIvA04kPb
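The tweet above shows the new binding option: adding `gateway: { id: "my-gateway" }` to an `env.AI.run()` call routes it through AI Gateway. A small sketch of how you might merge that option into existing run options (the helper and the model name are illustrative, not part of the API):

```typescript
// Sketch of the Workers AI gateway option from the tweet above.
// `withGateway` is a hypothetical helper; only the shape
// `gateway: { id: "my-gateway" }` comes from the announcement.
type RunOptions = Record<string, unknown> & { gateway?: { id: string } };

function withGateway(options: RunOptions, gatewayId: string): RunOptions {
  // Merge the gateway id into whatever options the call already uses.
  return { ...options, gateway: { id: gatewayId } };
}

// Inside a Worker (sketch; model name is a placeholder):
// const result = await env.AI.run(
//   "@cf/meta/llama-3.1-8b-instruct",
//   { prompt: "Hello" },
//   withGateway({}, "my-gateway"),
// );
```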
Kathy
KathyOP4mo ago
hope yall like! 🤗
Zig
Zig4mo ago
Does anyone know how to use the WebSockets API with streaming? I'm trying to use the universal endpoint with OpenAI. If I send a Content-Type: application/json header with no streaming it works fine. If I try to do it with streaming, I see the request going through on the dashboard but my worker never receives any WebSocket messages. If I send it without a Content-Type header, or with event-stream, I always get an error about a missing model.
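For reference, a sketch of a Universal Endpoint request step for streaming through OpenAI. The `{ provider, endpoint, headers, query }` shape follows the AI Gateway universal endpoint docs; the key and model name are placeholders, and note the model lives inside `query`, which may be why requests with an odd Content-Type report it missing:

```typescript
// Sketch: one step of an AI Gateway Universal Endpoint request body.
// Shape per the universal endpoint docs; API key and model are placeholders.
interface UniversalStep {
  provider: string;
  endpoint: string;
  headers: Record<string, string>;
  query: Record<string, unknown>;
}

function openAiStreamingStep(
  apiKey: string,
  model: string,
  prompt: string,
): UniversalStep {
  return {
    provider: "openai",
    endpoint: "chat/completions",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    query: {
      model, // the model goes inside `query`, not the URL
      stream: true,
      messages: [{ role: "user", content: prompt }],
    },
  };
}

// POST JSON.stringify([openAiStreamingStep("KEY", "gpt-4o-mini", "hi")])
// to https://gateway.ai.cloudflare.com/v1/ACCOUNT_ID/my-gateway (placeholder IDs)
```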
dom
dom4mo ago
Using claude-3-5-sonnet-latest, which currently defaults to claude-3-5-sonnet-20241022. I see o3-mini also has this issue.
usualdev
usualdev4mo ago
Feature request: add the ability to adjust column widths. I don't want feedback, latency, or status to be that wide; I'd rather see the model name.
Kathy
KathyOP3mo ago
Cloudflare Docs
Request handling · Cloudflare AI Gateway docs
Your AI gateway supports different strategies for handling requests to providers, which allows you to manage AI interactions effectively and ensure your applications remain responsive and reliable.
Kathy
KathyOP3mo ago
nevermind, i found it in the changelog. THANKS
rob
rob3mo ago
Cloudflare Developers (@CloudflareDev) on X
🙂We've got 2 new exciting updates for AI Gateway! •Gain observability and control over your Cerebras, ElevenLabs and Cartesia usage via AI Gateway. •Add more control to your requests with new timeout, retry, and fallback options. Learn more about these two updates below👇
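The second update above adds per-request timeout and retry controls. A minimal sketch of setting them via request headers, assuming the `cf-aig-request-timeout`, `cf-aig-max-attempts`, and `cf-aig-retry-delay` header names from the request-handling docs linked earlier (values here are illustrative):

```typescript
// Sketch: per-request timeout/retry headers for AI Gateway.
// Header names assumed from the request-handling docs; check the docs
// page before relying on them. All values are illustrative.
function retryHeaders(
  timeoutMs: number,
  maxAttempts: number,
  retryDelayMs: number,
): Record<string, string> {
  return {
    "cf-aig-request-timeout": String(timeoutMs), // fail the request after this long
    "cf-aig-max-attempts": String(maxAttempts), // total tries before giving up
    "cf-aig-retry-delay": String(retryDelayMs), // wait between retries
  };
}

// Usage sketch: spread into your normal request headers.
// fetch(gatewayUrl, { headers: { ...authHeaders, ...retryHeaders(5000, 3, 1000) }, ... });
```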
luiseok
luiseok3mo ago
Does anyone get errors when trying to call Google Gemini 1.5 Flash? I know it's not an AI Gateway problem, but I just want to be sure I'm not the only one having trouble.
rob
rob3mo ago
📈
Kavatch
Kavatch3mo ago
There's no universal way to use AI Gateway with any OpenAI-compatible endpoint, correct? I'm using DeepInfra and would like to use AI Gateway, but I don't see any option to set my own base URL.
morpig
morpig3mo ago
hi! wondering if there are any plans to return the request ID (what we see in AI Gateway logs) in Workers. or is that possible already?