Some feedback on this.
Although Whisper might be limited, I believe "Request is too large" is a Workers AI problem rather than a problem with the model. I tested and reproduced the exact same error with @cf/microsoft/resnet-50 and @cf/runwayml/stable-diffusion-v1-5-img2img, and I expect the same to hold for any model that accepts a large input. I used the following to test and measure:
const srcURL  = "https://cdn.openai.com/whisper/draft-20220913a/micro-machines.wav";
const res     = await fetch(srcURL);
const buffer  = await res.arrayBuffer();
const jsArray = [...new Uint8Array(buffer)];  // plain array, one element per byte
const input   = { audio: jsArray };

const MB = 1024 * 1024;

// Size of the downloaded file (one byte per element):
console.log("Blob size: " + (jsArray.length / MB).toFixed(1) + " MB");
// Rough in-memory size of the plain array, assuming 4 bytes per element and no overhead:
console.log("Input array size: " + (jsArray.length * 4 / MB).toFixed(1) + " MB");

// ai.run() stringifies input array before calling internal fetch:
//   const inpBody = JSON.stringify({ inputs: input });
//   console.log("JSON size: " + (inpBody.length / MB).toFixed(1) + " MB");

const response = await ai.run("@cf/openai/whisper", input);

This is with 1.1.0 so that I can insert a logging statement, but changing to env.AI.run doesn't affect the outcome. The issue seems to be the size of the payload rather than the dimensions of the media (e.g. length, width, height). Changing { audio: jsArray } to { image: jsArray } and calling resnet-50 throws the same error.

After fetching a 5 MB file, the worker has to make a copy to turn it into a ~20 MB array, assuming no overhead. The array is then stringified into a ~17 MB string. The receiving end is then faced with parsing roughly 17 MB of JSON in the format [123,78,30,255,0,...]. Without a limit somewhere, something eventually has to give. In this case, the limit seems to be just below 10 million bytes.
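
For what it's worth, the JSON size can be worked out before sending anything, without building the string. Each byte becomes 1 to 3 decimal digits plus a comma, so the stringified array takes between 2 and 4 characters per byte of input. Below is a rough sketch of such a pre-flight check. The function names and ASSUMED_LIMIT_BYTES are my own, and the 10,000,000 figure is only the ceiling observed above, not a documented limit.

```javascript
// Assumption: ~10,000,000 bytes is the ceiling observed in testing, not a documented limit.
const ASSUMED_LIMIT_BYTES = 10_000_000;

// Exact length of JSON.stringify(Array.from(bytes)), computed without building the string.
// The output is pure ASCII, so characters and UTF-8 bytes are the same count.
function jsonArrayLength(bytes) {
  if (bytes.length === 0) return 2; // "[]"
  let digits = 0;
  for (const b of bytes) {
    digits += b < 10 ? 1 : b < 100 ? 2 : 3; // decimal digits of a value in 0..255
  }
  return digits + (bytes.length - 1) + 2; // digits + commas + brackets
}

// Length of JSON.stringify({ inputs: { [key]: [...bytes] } }), the body shape shown above.
function estimateRequestLength(key, bytes) {
  const wrapper = JSON.stringify({ inputs: { [key]: [] } }).length; // already counts "[]"
  return wrapper - 2 + jsonArrayLength(bytes);
}

// Hypothetical guard: true if the stringified request stays under the assumed limit.
function fitsAssumedLimit(key, bytes) {
  return estimateRequestLength(key, bytes) < ASSUMED_LIMIT_BYTES;
}
```

Calling fitsAssumedLimit("audio", new Uint8Array(buffer)) before ai.run() would at least let a worker fail early with a useful message. If the limit really is around 10 million bytes, anything under about 2.5 MB should always fit, anything over about 5 MB never will, and files in between depend on their content.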

The immediate and simple part of the problem is that developers typically don't have a good way to handle this. I mean, it's not like the above is common knowledge. 🙂