Hi Supabase team, we’re experiencing a production incident and need urgent help.
Yesterday and again today, our customers were unable to access the app.
All Project Status components (Database, Auth, PostgREST, Realtime, Storage, Edge Functions) showed as UNHEALTHY.
Yesterday:
- Restarting the database resolved the issue.
Today:
- Restarting the database did NOT help.
- The issue was resolved only after we upgraded the compute size.
The pattern (a restart helped once, then only a compute upgrade helped) suggests we’re hitting a hard resource or connection limit that cascades into failures across all services.
Questions:
1. Is this expected behavior when the database exhausts connections or CPU?
2. Why would all services become unhealthy instead of partial degradation?
3. How can we detect this earlier or prevent it (alerts, limits, pooler settings)?
4. Are there logs or metrics we should inspect to confirm root cause?
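To make questions 3 and 4 concrete, this is roughly the kind of check we are thinking of adding. It is only a sketch: the function name `connection_pressure` and the 70% / 90% thresholds are our own assumptions, not Supabase defaults, and we have not confirmed that connection exhaustion is the actual root cause. The two SQL strings are standard Postgres queries we would run ourselves to collect the inputs.

```python
# Sketch only: names and thresholds below are our assumptions, not Supabase settings.

# Standard Postgres queries we would run to collect the inputs.
# Filtering on backend_type (Postgres 10+) leaves out background workers.
ACTIVITY_SQL = (
    "SELECT state, count(*) FROM pg_stat_activity "
    "WHERE backend_type = 'client backend' GROUP BY state;"
)
MAX_CONN_SQL = "SHOW max_connections;"


def connection_pressure(rows, max_connections, warn_at=0.7, critical_at=0.9):
    """Classify connection usage.

    rows: list of (state, count) tuples, as returned by ACTIVITY_SQL.
    max_connections: integer value returned by MAX_CONN_SQL.
    """
    total = sum(count for _state, count in rows)
    ratio = total / max_connections
    if ratio >= critical_at:
        level = "critical"
    elif ratio >= warn_at:
        level = "warning"
    else:
        level = "ok"
    return {"total": total, "ratio": ratio, "level": level}


if __name__ == "__main__":
    # Made-up numbers purely for illustration, not from our incident.
    sample = [("active", 12), ("idle", 40), ("idle in transaction", 3)]
    print(connection_pressure(sample, max_connections=60))
```

If a check like this is reasonable, we would run it on a schedule and alert on "warning" so we can react before everything goes unhealthy. Please tell us if there is a better built-in metric to watch instead.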
This is a live production app with customers affected.