We’ve been using Cloudflare D1 quite successfully so far on one of our projects. In that setup, we mostly do single inserts and single reads from Workers. We do occasionally see transient “network connection lost” errors, but we’re doing hundreds of thousands of inserts per day and only see around ~100 errors daily, so overall the success rate is very high.
More recently, we started a different project where we’re doing bulk inserts using the SQL import functionality, and we’ve enabled read replication. (For what it’s worth, read replication is also enabled on the earlier project.)
In this new setup, however, the SQL imports fail almost every time. The success rate is roughly the inverse of the first project: maybe ~1% of imports get through, and the rest fail. We have a retry mechanism with linear backoff on our side, and we’re trying to import SQL files of around 400–500 MB, but we keep hitting a “network error / network connection lost” error. I understand this is generally a transient error, but in our case the failure rate is nearly 100%. Even with significantly smaller import sizes, the behavior is the same.
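For context, our retry logic is essentially a generic linear-backoff wrapper around the import call. This is a simplified sketch, not our exact code; `runImport` is a placeholder for whatever actually performs the D1 SQL import, not a real Cloudflare API:

```typescript
// Sketch of a linear-backoff retry wrapper (simplified from our setup).
// `fn` stands in for the import call; on each failure we wait
// baseDelayMs * attempt before trying again.
async function withLinearBackoff<T>(
  fn: () => Promise<T>,
  maxAttempts = 5,
  baseDelayMs = 1000,
): Promise<T> {
  let lastErr: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      if (attempt < maxAttempts) {
        // Linear backoff: 1s, 2s, 3s, ... with the defaults above.
        await new Promise((resolve) => setTimeout(resolve, baseDelayMs * attempt));
      }
    }
  }
  // All attempts exhausted; surface the last error we saw.
  throw lastErr;
}
```

Even with this in place, every attempt for a given import fails with the same “network connection lost” error, which is what makes us suspect something systematic rather than transient.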
To rule things out, we also tested the same imports without read replication enabled, and in that case everything seemed to work fine. That’s why it looks like the issue is specifically related to the read replication feature.
I realize that read replication is still in beta, but we were expecting it to be usable to some degree. In its current state, this combination (read replication + SQL import) is basically unusable for us.
Any help, insight, or guidance would be greatly appreciated, especially if there’s a known limitation here or a rough timeline for when this is expected to work more reliably. Thanks in advance!