Hitting memory limits again?

Hey, so a couple months ago when we started using workers for delivering game content (using R2 as a cache layer), we ran into an issue where files that were larger than 128MB caused the request to fail at 127.99MB. We solved this by better utilising response.body.tee(), but the issue is back again as of this morning. Nothing has changed with our workers code, we essentially left it as-is once we got it working how we wanted it to, including serving files in the 300MB-800MB range. This is the code that we currently use for delivering the response, the same code that this is supposedly failing with:
// get two streams for the body (one for R2, one for responding)
const tee = response.body?.tee()

// if the response status isn't 200, respond with the status code given by the upstream
if (response.status !== 200) {
  console.log(`Got HTTP ${response.status} from upstream: ${url}`)
  return new Response(null, { status: response.status })
}

// attempt to put the upstream data into R2
try {
  await R2_BUCKET.put(key, tee[0])
  console.log(`Successfully put object ${key} into R2`)
} catch (err) {
  console.log(`Error while putting ${key} into R2:`, err)
}

// return the response from upstream
return new Response(tee[1])
It's worth noting that I cannot tail the worker as it's getting too much traffic (250+ invokes per second) so getting logs is out of the question. If there's anything here that can be improved, I'd much appreciate it if you could point it out. I'm not very familiar with streams so it's possible that I've made a mistake here. Thanks 🙂
14 Replies
kian
kian17mo ago
It's worth noting that I cannot tail the worker as it's getting too much traffic (250+ invokes per second) so getting logs is out of the question.
You can filter tail to only be from your IP, or another one.
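If I remember the flag right, it's something like wrangler tail <worker-name> --ip self (or a specific address instead of self), so the tail only shows requests coming from that IP.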
nex
nex17mo ago
I didn't know about that, but that's an incredibly useful tip
kian
kian17mo ago
I'm not very familiar with streams so it's possible that I've made a mistake here.
You could run into issues if one of the streams is being read slowly - i.e a slow internet connection from the upstream
nex
nex17mo ago
I'll look into that now. We have to use a little hack where we reverse proxy the source of the files using nginx so that workers can pull from it (port restrictions)
Error while putting <key> into R2: TypeError: ReadableStream.tee() buffer limit exceeded. This error usually occurs when a Request or Response with a large body is cloned, then only one of the clones is read, forcing the Workers runtime to buffer the entire body in memory. To fix this issue, remove unnecessary calls to Request/Response.clone() and ReadableStream.tee(), and always read clones/tees in parallel.
kian
kian17mo ago
Yeah
kian
kian17mo ago
The exact error I have open right now, haha
nex
nex17mo ago
I guess this is a new change with Workers? This never used to happen until today-ish
kian
kian17mo ago
No idea on the exact date but it's been there since the release of the OSS runtime, so earlier than September
nex
nex17mo ago
that's before the time when we started to use workers for this. Do you think the await could be the issue in the above snippet, on line 12? Thinking of this:
To fix this issue, remove unnecessary calls to Request/Response.clone() and ReadableStream.tee(), and always read clones/tees in parallel.
there are no clone calls, and only one call to tee()
kian
kian17mo ago
I'd say so, yes. You can do your put in a waitUntil but that means the upstream must download the file within 30 seconds - currently.
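Rough, untested sketch of what I mean, reusing your R2_BUCKET/key/response names from the snippet; I've written ctx.waitUntil as in module syntax, with the service-worker syntax it'd be event.waitUntil instead:
// start the R2 put in the background so both tee branches are read in parallel
const tee = response.body.tee()

ctx.waitUntil(
  R2_BUCKET.put(key, tee[0])
    .then(() => console.log(`Successfully put object ${key} into R2`))
    .catch((err) => console.log(`Error while putting ${key} into R2:`, err))
)

// respond with the other branch straight away instead of awaiting the put first
return new Response(tee[1])
The point is that nothing awaits the put before the response goes out, so both halves of the tee get consumed in parallel and the runtime never has to buffer one of them in memory.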
kian
kian17mo ago
Stack Overflow
In a Cloudflare worker why the faster stream waits the slower one w...
I want to fetch an asset into R2 and at the same time return the response to the client. So simultaneously streaming into R2 and to the client too. Related code fragment: const originResponse = await
nex
nex17mo ago
that seems to have worked, thanks @kiannh ❤️
kian
kian17mo ago
No problem