Could maybe use some other proxy server (like nginx on a VPS) to proxy the requests to Workers, but that'd be another step.

noble-hashes. That library provides an async implementation which just uses promise callbacks (the microtask queue) to break up the calculation: https://github.com/paulmillr/noble-hashes/blob/ae060daa6252f3ff2aa2f84e887de0aab491281d/src/utils.ts#L103-L119

…noble-hashes versions then, just the significant overhead of enqueueing/dequeueing the stack onto the microtask queue.

Does anyone know why Cloudflare Workers AI Llama 3.1 is 3x slower than local Llama 3.1 running on an RTX 3080? Is there no way to speed this up? 30-40 seconds for text generation is insane. I get that it's free credits, but damn, that is kinda slow.

The AI team can explain more in their channel. I'm no AI guy and don't know their setup 100%, but comparing local vs. remote seems a bit silly. Workers AI is powered by a ton of shared GPUs, vs. your one unshared GPU, and they've got a lot of magic in front of it with request routing etc. to try to scale/shard requests. There are lots of different ways to run models too, from my understanding, each with different quirks.
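One thing that does help the perceived latency for the Workers AI question above is streaming: you start receiving tokens immediately instead of waiting for the full completion. A minimal sketch, assuming an AI binding named AI in wrangler.toml and the @cf/meta/llama-3.1-8b-instruct model ID:

// Minimal sketch: stream tokens back instead of buffering the whole completion.
// Assumes an AI binding named "AI" and the @cf/meta/llama-3.1-8b-instruct
// model ID; adjust to whatever you actually use.
interface Env {
  AI: Ai;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const stream = await env.AI.run("@cf/meta/llama-3.1-8b-instruct", {
      messages: [{ role: "user", content: "Write a haiku about GPUs." }],
      stream: true, // tokens arrive as they are generated
    });

    // With stream: true the binding returns a ReadableStream of SSE data.
    return new Response(stream as ReadableStream, {
      headers: { "content-type": "text/event-stream" },
    });
  },
};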
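Going back to the noble-hashes point above, a minimal sketch of the sync vs. async variants, assuming the scrypt/scryptAsync exports from @noble/hashes; the async one is the implementation that chunks the work via promise callbacks, which is also where the enqueue/dequeue overhead shows up:

import { scrypt, scryptAsync } from "@noble/hashes/scrypt";
import { utf8ToBytes } from "@noble/hashes/utils";

const password = utf8ToBytes("correct horse battery staple");
const salt = utf8ToBytes("some-salt");
const opts = { N: 2 ** 16, r: 8, p: 1, dkLen: 32 };

// Synchronous variant: blocks until the whole derivation is done.
const keySync = scrypt(password, salt, opts);

// Async variant: same output, but the work is chunked via promise callbacks
// so other microtasks can interleave; asyncTick controls how often it yields.
const keyAsync = await scryptAsync(password, salt, { ...opts, asyncTick: 10 });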
The load-context passed to Vite will just contain env stuff. The worker/server.ts is never actually hit, so it can't consume the queue.

You can use request.cf.colo to get the IATA code.

Can /cdn-cgi/trace be against any Cloudflare "orange clouded" hostname?

https://cloudflare.com/cdn-cgi/trace
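Rough sketch of both approaches from the messages above inside a Worker, assuming @cloudflare/workers-types for the request.cf typing:

// Minimal sketch (types from @cloudflare/workers-types).
export default {
  async fetch(request: Request): Promise<Response> {
    // 1) request.cf.colo: IATA code of the Cloudflare data center that
    //    handled this request (undefined in some local dev setups).
    const colo = request.cf?.colo;

    // 2) /cdn-cgi/trace: plain-text key=value lines; grab the colo= line.
    //    Using cloudflare.com here, per the link above.
    const traceText = await (await fetch("https://cloudflare.com/cdn-cgi/trace")).text();
    const traceColo = traceText.match(/^colo=(.+)$/m)?.[1];

    return Response.json({ colo, traceColo });
  },
};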
@cloudflare/workers-types / workerd runtime:

export interface TraceMetrics {
  readonly cpuTime: number;
  readonly wallTime: number;
}

export interface UnsafeTraceMetrics {
  fromTrace(item: TraceItem): TraceMetrics;
}
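For context, a sketch of how those metrics might be consumed from a tail handler. The UNSAFE_METRICS binding name (and the idea that such an unsafe binding is wired up at all) is an assumption for illustration; only the fromTrace() shape comes from the interface above:

// Hypothetical tail worker: UNSAFE_METRICS is an assumed binding name exposing
// the UnsafeTraceMetrics interface above.
interface Env {
  UNSAFE_METRICS: UnsafeTraceMetrics;
}

export default {
  async tail(events: TraceItem[], env: Env): Promise<void> {
    for (const item of events) {
      const { cpuTime, wallTime } = env.UNSAFE_METRICS.fromTrace(item);
      console.log(`${item.scriptName ?? "unknown"}: cpuTime=${cpuTime} wallTime=${wallTime}`);
    }
  },
};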