Error: socket hang up when sending 30+ parallel requests to a worker

Hi! I have a Cloudflare Worker that acts as a proxy for downloading files from an S3-like service. I'm running into a bug where my client code (a Node.js script) throws a socket hang up error when I download 30+ files in parallel on a slow connection (780 kbps bandwidth via macOS network link conditioning). Here are some observations and details about the issue:

- I can reproduce the bug ~90% of the time.
- The bug does not occur if I disable network link conditioning and use a 100 Mbps network.
- The bug only starts happening when I download ~30 files in parallel on the slow connection; downloading ~20 files works fine on that connection.
- When I check the real-time logs using wrangler tail, I do not see any logs for the request that throws the socket hang up error. Other requests are logged correctly.
- Using the S3-like service directly, without the Cloudflare Worker as a proxy, does not cause the error.
- Looking at the traffic in Wireshark, an RST packet is sent from Cloudflare at the moment the socket hang up error is thrown.
- I can reproduce the bug on my fast connection as well, but it's very rare.

I would appreciate any suggestions on what I should investigate or which Cloudflare logs I should examine. Thank you very much for your assistance!
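For context, the client script roughly does the following (a simplified sketch with a placeholder Worker URL and file keys, not my actual code):

```ts
// Simplified repro sketch (placeholder URL and keys, not the real script):
// download FILE_COUNT files in parallel through the Worker proxy and report
// any failures, which is where the "socket hang up" errors appear.
const WORKER_URL = "https://my-proxy.example.workers.dev"; // placeholder
const FILE_COUNT = 30;

async function downloadFile(key: string): Promise<void> {
  const res = await fetch(`${WORKER_URL}/${key}`);
  if (!res.ok) throw new Error(`HTTP ${res.status} for ${key}`);
  await res.arrayBuffer(); // drain the body (slowly, on the throttled link)
}

async function main(): Promise<void> {
  const keys = Array.from({ length: FILE_COUNT }, (_, i) => `file-${i}.bin`);
  const results = await Promise.allSettled(keys.map(downloadFile));
  for (const r of results) {
    if (r.status === "rejected") console.error(r.reason); // socket hang up shows up here
  }
}

main();
```

All downloads are started at once with no concurrency limit, which matches the 30-files-in-parallel scenario described above.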
2 Replies
zegevlier · 12mo ago
Have you tried several slow connections from different IPs? And have you tried from the same IP going to a real origin instead of a Worker? It seems to me that this could be some sort of DDoS protection, in that they don't want too many very slow open connections to the same IP.
e111g · 12mo ago
HTTP requests to a Worker are load-balanced at the TCP level, so if you are making ~20-30 simultaneous connections from the same Node.js script, many of those connections could get routed to the same Worker instance (you can check this by generating a unique ID in your Worker at runtime and appending it as a header to your responses, as in the sketch below). Each Worker instance is limited to 6 outbound connections, and any additional connections are queued up (transparently to you) until one of those 6 slots frees up. If your client is downloading files slowly, only 6 files will be (slowly) downloaded in parallel, and the other 15-25 connections sit idle. Perhaps those idle connections trip a wire somewhere in the network stack and get canceled because they show no observable sign of life.
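A minimal sketch of the instance ID check (module Worker syntax; the origin hostname and header name are illustrative, not from your setup):

```ts
// Sketch: generate one ID per Worker instance and attach it to every response,
// so the client can see how its connections are distributed across instances.
// (The origin URL and header name are placeholders.)
const INSTANCE_ID = crypto.randomUUID(); // evaluated once when the instance starts

export default {
  async fetch(request: Request): Promise<Response> {
    // Proxy the incoming request to the S3-like origin (placeholder hostname).
    const url = new URL(request.url);
    const upstream = await fetch(`https://files.example.com${url.pathname}`, request);

    // Copy into a new Response so the headers are mutable, then tag it.
    const response = new Response(upstream.body, upstream);
    response.headers.set("x-instance-id", INSTANCE_ID);
    return response;
  },
};
```

If most of your 30 responses come back with the same x-instance-id, that would confirm the connections are being funneled into a single instance.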