Architecturally, would it be possible to implement Cap'n Proto as a data type for D1? Spanner recently released protocol buffer support (apparently it was available for years in their internal build) and it unlocks some really cool features like type safety and 10x faster performance vs storing nested data in JSON/JSONB. Compared to Spanner's approach, Cap'n Proto seems like it would be significantly better too since it's zero-copy. Is this something that's been considered, or technically feasible with D1's architecture?
D1 is a more managed database than using SQLite Durable Objects directly, so it's great when you can shard your application data model across several databases. The sweet spot is a few tens of them, so not exactly one per user, but also not constrained by the 10GB per-database limit. If you need more than that, in the thousands, you're better off using DOs directly.
Imagine apps like the trendy vibe coding platforms these days, where you need a database for millions of independent projects and each one is self-contained and small; that's an example where it fits great. Or internal projects that fit within 10GB (or N multiples of it, where sharding comes naturally) and where the traffic isn't crazy.
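To make the sharding idea concrete, here's a minimal sketch of what routing a tenant to one of a handful of D1 databases could look like from a Worker. The binding names, the hash, and the `projects` table are all made up for illustration, not something from this thread:

```ts
// Illustrative only: assumes four D1 bindings (SHARD_0..SHARD_3) declared in
// the Worker's wrangler config, plus a `projects` table in every shard.
export interface Env {
  SHARD_0: D1Database;
  SHARD_1: D1Database;
  SHARD_2: D1Database;
  SHARD_3: D1Database;
}

// Deterministically map a tenant/project ID to a shard so its data always
// lives in the same database.
function shardFor(env: Env, tenantId: string): D1Database {
  const shards = [env.SHARD_0, env.SHARD_1, env.SHARD_2, env.SHARD_3];
  let hash = 0;
  for (const ch of tenantId) hash = (hash * 31 + ch.charCodeAt(0)) >>> 0;
  return shards[hash % shards.length];
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const tenantId = request.headers.get("x-tenant-id") ?? "default";
    const db = shardFor(env, tenantId);
    const row = await db
      .prepare("SELECT count(*) AS projects FROM projects WHERE tenant_id = ?1")
      .bind(tenantId)
      .first();
    return Response.json(row);
  },
} satisfies ExportedHandler<Env>;
```

Past a few tens of shards, maintaining that many static D1 bindings gets unwieldy, which is where going to SQLite-backed Durable Objects directly makes more sense.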
Well, social media is broad. If you're building Facebook, no. If you're building your own small social media platform, you could certainly involve D1 in some aspects, e.g. for things that are private to a user. But it also depends on usage, so a 2-word description doesn't make it obvious. Again, my rule of thumb is that if you're fine with 0-100GB of total storage and requests aren't in the tens of thousands per second, D1 can work. Otherwise, if sharding comes naturally to your app, use SQLite DOs; if not, Hyperdrive plus any external traditional database.
Did something change in how Cloudflare calculates D1 database size? All of a sudden, all of our databases are showing half the size they were in the past. Any insights into what is happening? Thank you
Which API exactly? I checked your known accounts and don't see any drop in total size in our analytics. Please DM me more details instead of posting them publicly here.
Any details to help an investigation? Account/database IDs, time ranges? There isn't any ongoing D1 service incident. There was a transient network blip a few hours ago, but it wasn't consistent and only affected some locations.
where app1 and app2 are two separate Workers with separate wrangler files that share the same DB instance and generally share the same Worker logic implemented in packages/worker. Can I somehow point the local DB to the same database, or are they forced to be generated in their own local .wrangler folders?
Hey, yesterday, during our burst load, we received 153 "D1_ERROR: D1 DB is overloaded. Too many requests queued." errors on our D1 database while serving about 161k queries over ~30 minutes. That's a lot of queries, and it seems like we were standing right at the edge of a single D1 database's capacity. Do you folks at Cloudflare see more statistics about databases, hidden from us users, that might be interesting?
Hi! On an internal computer at our company, we would like to export our production D1 database every 10 minutes (from the CLI, wrangler d1 export). Can someone from the CF staff confirm that this is tolerated and will not be blocked or considered against CF policies, please?
There is no policy against doing that, but keep in mind that while exporting, your database is blocked from serving any queries. So doing it every 10 minutes, especially if it's more than a few hundred MBs, is not going to be a nice experience for your users.
Every 10 minutes also seems excessive, since you can just read the database directly or use Time Travel for restores between your backup points. Is there something in the use case that requires that frequency?
Tricky question. To get scale, what many people here do is switch from D1 to Durable Objects, which let you split the work and data storage across many instances. However, you said "highly relational data", and that's more complex when your relationships are on the other side of a network hop. There are strategies for handling that, like the Actor model (and Cloudflare now has an Actor base class for Durable Objects). Even so, I would only recommend it if there is a relatively clean way to split up the work/storage (per user, per tenant, etc.). So for consistently "highly relational data", your best choice might be an external scale-up style database. Cloudflare uses Postgres for some of its own internal needs, so there is good support for doing that. Check out Cloudflare's Hyperdrive offering if you think that's the way to go for you: https://developers.cloudflare.com/hyperdrive/
Hyperdrive is a service that accelerates queries you make to existing databases, making it faster to access your data from across the globe from Cloudflare Workers, irrespective of your users' location.
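Roughly what that looks like from a Worker, as a hedged sketch: Hyperdrive hands the Worker a connection string that it pools and accelerates, and you use it with a normal Postgres driver. The binding name, the postgres.js driver choice, and the `users` table here are assumptions for illustration, not something from the thread:

```ts
// Illustrative only: assumes a Hyperdrive binding named HYPERDRIVE in the
// wrangler config and the postgres.js driver installed; the table is made up.
import postgres from "postgres";

export interface Env {
  HYPERDRIVE: Hyperdrive;
}

export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    // Connect through Hyperdrive's connection string rather than directly to
    // the origin Postgres instance.
    const sql = postgres(env.HYPERDRIVE.connectionString, {
      max: 5,            // keep per-invocation connections small
      fetch_types: false,
    });
    const rows = await sql`SELECT id, name FROM users LIMIT 10`;
    // Close the connection once the response has been sent.
    ctx.waitUntil(sql.end());
    return Response.json(rows);
  },
} satisfies ExportedHandler<Env>;
```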
Thank you very much, this makes a lot of sense. I was actually looking at the exact same page you linked before seeing your message, and indeed it perfectly fits our needs, as we already have a Postgres database and it is almost impossible to break the data model up into an actor model since it is so tightly coupled. However, I have two questions:
1. Will Cloudflare support higher DO or D1 limits in a pay-per-usage model, similar to something like R2, in the future? 2. Does Hyperdrive cache queries inside transactions (select statements in a transaction), or does it only cache select statements outside transactions?