Supabase Intermittently Stops Working.
My project has been healthy and active for months. Starting 2 days ago it's been intermittently becoming unresponsive, and my frontend app's API calls time out. My project status shows that it's unhealthy. I've seen everything from maxed CPU and Disk I/O to timeouts. I've been optimizing queries and indexes to help, and I've even vacuumed some tables just in case (sketch below). I've opened two tickets with Supabase support but have yet to hear back. My front-end traffic has been steady, but now my customers are getting pissed. Anyone experienced this and maybe solved it?
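(For context, the vacuuming was just along these lines; the table name is a placeholder for the tables I actually ran it on:)

```sql
-- Reclaim dead tuples and refresh planner statistics.
-- public.my_big_table is a placeholder, not a real table name.
VACUUM (ANALYZE, VERBOSE) public.my_big_table;
```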

Probably going to take support looking at it.
Did you get the auto reply at least from your support request?
Do your logs give you any useful info?
CPU/Disk charts?
Query performance tables?
I got the automated email, but have yet to hear from them. I just sent them another email. I'm watching all the logs and have done some optimizations on some queries, but it's becoming clearer and clearer that it's likely something outside my project causing this.
And now I'm getting this, even though I don't see any burst in traffic

Did you look to see if a large number of queries are being run in the Query performance tab?
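If you want to check outside the dashboard, something like this against pg_stat_statements (which, as far as I know, is what the Query Performance view draws from) will surface the heaviest queries. A rough sketch; the column names assume Postgres 13+ (older versions use total_time/mean_time):

```sql
-- Top 10 statements by total time spent in them since stats were last reset.
SELECT
  calls,
  round(total_exec_time::numeric, 1) AS total_ms,
  round(mean_exec_time::numeric, 1)  AS mean_ms,
  left(query, 80)                    AS query_start
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```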
Not much screaming out here. The slowest query is 892 ms, and it's not even from my app code; it seems to be Supabase system queries.

Your most frequent queries don't look very high either, but check that.
The most frequent are again Supabase-controlled queries, and they take negligible time. All the more reason I feel it's something in the wider Supabase platform. No incidents have been reported that would affect this so far. Weird

Unless the AWS server you are on is having issues...
Those queries are not going to cause an issue, assuming that isn't just a few minutes' worth of trace.
Also, when you overload the database you would normally start getting timeout errors, not services shutting down.
Thanks. I guess I'll wait to hear from supabase support and keep looking/trying things in the meantime. Thanks for your responses
I noticed another user mention cron in a thread you commented on. Do you have any cron tasks?
Yes I do, and I've disabled all the non-critical ones to see if it helps. I also looked at the cron tables and noticed they don't have indexes. I wanted to add a couple of indexes, but got an error because the cron schema is managed by Supabase. Anyway, I'll keep looking around while I wait. Thanks
How fast are they running?
Do you prune the cron run details table?
It can grow very big and slow if not cleaned.
https://github.com/citusdata/pg_cron?tab=readme-ov-file#viewing-job-run-details
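Something like this shows how long recent runs took and sets up a daily prune, following the pattern in that README; the retention interval is just an example to tune:

```sql
-- How long did recent runs take? (cron.job_run_details is pg_cron's history table)
SELECT jobid, status, start_time, end_time - start_time AS duration
FROM cron.job_run_details
ORDER BY start_time DESC
LIMIT 20;

-- Prune old run details once a day at noon, per the pg_cron README;
-- the 7-day retention window is an example, tune it to your needs.
SELECT cron.schedule(
  'delete-job-run-details',
  '0 12 * * *',
  $$DELETE FROM cron.job_run_details WHERE end_time < now() - interval '7 days'$$
);
```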
Thanks. I already have a job to clean up cron.job_run_details a few times a day.
Update: It took a couple of days for Supabase support to get in touch, but when they did, they were helpful. I wish they could provide phone support. In my case there were a number of issues that contributed:
1. My cron job history table (cron.job_run_details) had grown so big that inserts were taking longer and longer. It also didn't help that the table doesn't have covering indexes; I've suggested to Supabase that they add at least two (a sketch of what I proposed is after this list) and will wait to hear back. Deleting older job records helped, but not before contributing to high disk IO / disk IO budget depletion.
2. We've since added a scheduled job to periodically clean up older cron job run details (the same cron.schedule pattern shown earlier in the thread).
3. We had one cron job that was not running efficiently, and Supabase support helped us identify it based on EBS IO Balance charts they shared. We've since optimized its queries and added covering indexes (hypothetical example after this list). So far so good, and we will keep observing.
4. The issue for us was that, due to the large cron job run details table and some inefficient queries, the disk IO budget was getting maxed out, CPU was getting maxed out, and swap memory usage was increasing.
5. Lastly, there is no easy way to accurately see how your actual disk IO usage compares with the EBS IO budget. The built-in Supabase reports/dashboards showed us to be well away from the limit even with some spikes, but Supabase support has access to the underlying EBS IO balance and was able to plot when the budget was running out. It would be great if this EBS IO budget could be made readily available in the built-in reports.
6. Consider adding Grafana to your monitoring (it helped us see the high swap usage)
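For anyone who finds this later: the indexes I suggested to Supabase would look roughly like the first two statements below. You can't create them yourself on a hosted project because the cron schema is managed (that's the error I hit earlier), so treat them as illustrative. The last statement is a purely hypothetical example of the covering indexes we added to our own tables; the table and column names are made up.

```sql
-- Illustrative only: these fail on hosted Supabase because the cron schema
-- is managed. An index on end_time would let the cleanup DELETE avoid a
-- full scan of cron.job_run_details.
CREATE INDEX IF NOT EXISTS job_run_details_end_time_idx
  ON cron.job_run_details (end_time);

-- And one for looking up recent runs of a specific job.
CREATE INDEX IF NOT EXISTS job_run_details_jobid_start_time_idx
  ON cron.job_run_details (jobid, start_time DESC);

-- Hypothetical covering index on one of our own tables (made-up names):
-- INCLUDE carries extra columns so the query can be answered by an
-- index-only scan instead of touching the heap.
CREATE INDEX IF NOT EXISTS orders_status_created_at_idx
  ON public.orders (status, created_at)
  INCLUDE (customer_id, total);
```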
Overall, thanks for your support and hopefully this helps someone in the future.