python realtime silent timeout issue
I have a long-running, asynchronous Python service (using `asyncio`) that runs inside a Docker container. Its primary job is to listen for database changes.
Implementation:
My implementation follows this pattern:
1. On service startup, I create a single, global `AsyncClient` instance using `await acreate_client(...)`.
2. I then immediately subscribe to `INSERT` and `UPDATE` events on a specific table using the Realtime client (a minimal sketch of this setup is shown after this list).
3. The callback functions (`my_async_..._handler`) are `async` and use `asyncio.create_task` to process events without blocking the listener.
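For context, the setup looks roughly like this. The channel name, the `orders` table, and the handler names are placeholders, and the exact `on_postgres_changes` signature may vary slightly between supabase-py / realtime-py versions:

```python
import asyncio

from supabase import AsyncClient, acreate_client


async def process_change(payload: dict) -> None:
    # Placeholder for the real event-processing logic.
    print("change received:", payload)


def handle_change(payload: dict) -> None:
    # Hand the event off to a task so the Realtime listener is never blocked.
    asyncio.create_task(process_change(payload))


async def start_listener(url: str, key: str) -> AsyncClient:
    client: AsyncClient = await acreate_client(url, key)
    channel = client.channel("db-changes")  # channel name is illustrative
    channel.on_postgres_changes(
        event="INSERT", schema="public", table="orders", callback=handle_change
    )
    channel.on_postgres_changes(
        event="UPDATE", schema="public", table="orders", callback=handle_change
    )
    await channel.subscribe()
    return client
```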
Problem:
The Realtime connection works perfectly for a while, but eventually (often after several hours or a day) it seems to silently time out or disconnect. The service doesn't crash and no exceptions are raised, but it simply stops receiving any new database change events.
Question:
What is the recommended, robust pattern for handling these silent timeouts with `supabase-py`'s async Realtime client? Is there a built-in auto-reconnect mechanism that I'm not using correctly? If not, what is the idiomatic way to implement heartbeat/health-check and resubscribe logic to keep the connection alive?
Quick update for anyone else who might be running into this issue: it's a bug in the `realtime-py` library.
The issue stems from an architectural race condition in the `realtime-py` library, which is perfectly described in this GitHub issue: "Bug: `channel.subscribe()` returns before subscription is confirmed, creating race conditions".
In summary: the `await channel.subscribe()` call returns immediately, before the server has actually confirmed the subscription. This, combined with a related bug where the internal timeout timer wasn't being cancelled on a successful reply, caused our service to receive a `TIMED_OUT` event about 10 seconds after every seemingly successful connection, triggering a constant reconnect loop.
The final, stable solution involved two parts:
1. Implementing an "intelligent supervisor" pattern that waits for the actual subscription status delivered to the `on_subscribe` callback, rather than trusting the return of the `await subscribe()` call (a rough sketch is below).
2. Applying a small "monkey patch" at startup to fix the library's internal timer bug directly.
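For anyone who wants a concrete starting point, here is roughly the shape of the supervisor from part 1. This is a sketch under some assumptions: the status-callback form of `subscribe()` and the `RealtimeSubscribeStates` enum are what my installed `realtime-py` exposes, and the table name, handler, and timeout values are placeholders.

```python
import asyncio

from realtime import RealtimeSubscribeStates


async def supervise_subscription(client, table: str, handler) -> None:
    """Keep resubscribing until the server explicitly confirms SUBSCRIBED,
    instead of trusting the return of `await channel.subscribe()`."""
    while True:
        confirmed = asyncio.Event()

        def on_subscribe(status, err):
            # Only an explicit SUBSCRIBED reply counts as success.
            if status == RealtimeSubscribeStates.SUBSCRIBED:
                confirmed.set()
            elif err is not None:
                print(f"subscription error: {err}")

        channel = client.channel(f"{table}-changes")
        channel.on_postgres_changes(
            event="*", schema="public", table=table, callback=handler
        )
        await channel.subscribe(on_subscribe)

        try:
            # Wait for the confirmed status, not merely for subscribe() to return.
            await asyncio.wait_for(confirmed.wait(), timeout=15)
            return
        except asyncio.TimeoutError:
            # No confirmation arrived in time: tear the channel down and retry.
            await channel.unsubscribe()
            await asyncio.sleep(5)
```

A fuller version would also re-enter this loop when a later status callback reports `TIMED_OUT` or `CLOSED`; the key idea is to treat the callback status, not the `subscribe()` return value, as the source of truth.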
Posting this here in case it helps anyone else who runs into similar unexpected timeout issues with the Python client. The connection is now perfectly stable.