python realtime silent timeout issue

I have a long-running, asynchronous Python service (using asyncio) that runs inside a Docker container. Its primary job is to listen for database changes.

Implementation: My implementation follows this pattern:

1. On service startup, I create a single, global AsyncClient instance using await acreate_client(...).
2. I then immediately subscribe to INSERT and UPDATE events on a specific table using the Realtime client:
# In my main async function:
supabase_channel = supabase.realtime.channel("my_table_changes")
await (
    supabase_channel
    .on_postgres_changes(event="INSERT", ..., callback=my_async_insert_handler)
    .on_postgres_changes(event="UPDATE", ..., callback=my_async_update_handler)
    .subscribe()
)
# The service then waits indefinitely on an asyncio.Event for a shutdown signal.

3. The callback functions (my_async_..._handler) are async and use asyncio.create_task to process events without blocking the listener.

Problem: The Realtime connection works fine for a while, but eventually (often after several hours or a day) it silently times out or disconnects. The service doesn't crash and no exceptions are raised; it simply stops receiving new database change events.

Question: What is the recommended, robust pattern for handling these silent timeouts with supabase-py's async Realtime client? Is there a built-in auto-reconnect mechanism that I'm not using correctly, or what is the idiomatic way to implement heartbeat/health-check and resubscribe logic to keep the connection alive?
1 Reply
motivated (OP) · 2w ago
Quick update for anyone else who might be running into this: it's a bug in the realtime-py library.

The issue stems from an architectural race condition in realtime-py, which is described in this GitHub issue: "Bug: channel.subscribe() returns before subscription is confirmed, creating race conditions".

In summary: the await channel.subscribe() call returns immediately, before the server has actually confirmed the subscription. Combined with a related bug where the internal timeout timer wasn't cancelled on a successful reply, this caused our service to receive a TIMED_OUT event about 10 seconds after every seemingly successful connection, triggering a constant reconnect loop.

The final, stable solution involved two parts:

1. Implementing an "intelligent supervisor" pattern that waits for the actual subscription status reported to the on_subscribe callback, rather than trusting the await subscribe() call.
2. Applying a small monkey patch at startup to fix the library's internal timer bug directly.

Posting this here in case it helps anyone else who runs into similar unexpected timeout issues with the Python client. The connection is now perfectly stable.
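A library-agnostic sketch of the supervisor idea from point 1: trust only the status delivered to the subscribe callback, and retry with backoff until it confirms. The FakeChannel below stands in for the realtime-py channel (its behavior, and the "SUBSCRIBED"/"TIMED_OUT" status strings, are assumptions made for the sake of a runnable example):

```python
import asyncio

class FakeChannel:
    """Stand-in for a realtime-py channel; times out on the first attempt."""
    attempts = 0

    async def subscribe(self, on_subscribe) -> None:
        # Mimics the buggy library call: returns before confirmation,
        # then reports the real outcome asynchronously via the callback.
        FakeChannel.attempts += 1
        status = "TIMED_OUT" if FakeChannel.attempts == 1 else "SUBSCRIBED"
        asyncio.get_running_loop().call_soon(on_subscribe, status)

async def supervise(channel, max_retries: int = 5) -> str:
    """Resubscribe until the callback (not the awaited call) confirms."""
    for attempt in range(max_retries):
        confirmed = asyncio.get_running_loop().create_future()
        await channel.subscribe(confirmed.set_result)
        status = await asyncio.wait_for(confirmed, timeout=10)
        if status == "SUBSCRIBED":
            return status
        # Back off before retrying so a flapping server isn't hammered.
        await asyncio.sleep(0.01 * 2 ** attempt)
    raise RuntimeError("could not establish a confirmed subscription")

final_status = asyncio.run(supervise(FakeChannel()))
print(final_status)
```

The key design choice is that the awaited subscribe() return value is ignored entirely; the future resolved by the callback is the single source of truth for subscription state.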
