I’m reaching out regarding intermittent stability issues when using Supabase Postgres as the metadata database (metastore) for Apache Airflow running in Docker on GitHub Actions.
Environment / setup
Platform: GitHub Actions runner (Linux) running Docker Compose
Application: Apache Airflow (LocalExecutor) with separate scheduler + webserver containers
Supabase Postgres connection: using the Supabase pooler endpoint on port 5432 (SSL enabled)
Connection string pattern:
postgresql+psycopg2://<user>:<password>@<host>:5432/<database>?sslmode=require&connect_timeout=15&keepalives=1&keepalives_idle=30&keepalives_interval=10&keepalives_count=5&application_name=airflow_ci
We also attempted to reduce runtime DNS dependence by resolving the hostname once on the host and passing hostaddr=<resolved_ipv4> to libpq in the connection string.
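For illustration, the following is a minimal Python sketch of how a connection string like the one above could be assembled with a pre-resolved address. It is not our exact CI code: the function names (resolve_ipv4, build_metastore_url) and the example values are hypothetical. It relies on libpq accepting both host and hostaddr, in which case hostaddr is used for the TCP connection (skipping the DNS lookup) and host is still used for TLS purposes.

```python
import socket
from typing import Optional
from urllib.parse import quote, urlencode


def resolve_ipv4(hostname: str, port: int = 5432) -> str:
    """Resolve the hostname once and return its first IPv4 address."""
    infos = socket.getaddrinfo(
        hostname, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET, sockaddr is (ip, port).
    return infos[0][4][0]


def build_metastore_url(
    user: str,
    password: str,
    host: str,
    database: str,
    hostaddr: Optional[str] = None,
    port: int = 5432,
) -> str:
    """Build a SQLAlchemy URL carrying the libpq options listed above."""
    params = {
        "sslmode": "require",
        "connect_timeout": 15,
        "keepalives": 1,
        "keepalives_idle": 30,
        "keepalives_interval": 10,
        "keepalives_count": 5,
        "application_name": "airflow_ci",
    }
    if hostaddr:
        # libpq connects to hostaddr directly; host is kept in the URL.
        params["hostaddr"] = hostaddr
    # Percent-encode credentials so characters such as '@' or ':' are safe.
    return (
        f"postgresql+psycopg2://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}?{urlencode(params)}"
    )
```

In a setup like ours, resolve_ipv4 would run once on the runner host before Docker Compose starts, and the resulting URL would be passed to the Airflow containers as the metadata database connection setting.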
What we’re seeing
When Airflow runs notebook tasks in smaller concurrency groups (approximately 5–6 tasks at a time), the full CI run completes successfully.
When we run a larger group (e.g., ~14 concurrent notebooks/tasks), the run becomes unstable and fails intermittently with DNS resolution errors.
The failures appear as dropped connections and hostname resolution errors within the CI job. Separately, we also observe hostname resolution failures for external Snowflake/Azure endpoints, but the overall issue appears to correlate with using Supabase as the Airflow metastore in CI at higher concurrency.
Why we suspect metastore contention / connection instability
The behavior is strongly dependent on concurrency (stable at ~5–6, unstable at ~14).
Reducing scheduler parallelism and task concurrency significantly improves stability, but increases total runtime.
The same GitHub Actions workflow is materially more stable when we are not using Supabase as the Airflow metastore.
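To make the concurrency reduction concrete, here is a hedged Python sketch of the kind of Airflow settings involved, expressed as environment variables for the containers. The helper name airflow_concurrency_env and the default pool values are hypothetical, and the option names assume Airflow 2.3 or later, where the SQLAlchemy pool settings live in the [database] section; older versions use different names.

```python
from typing import Dict


def airflow_concurrency_env(
    max_tasks: int, pool_size: int = 5, max_overflow: int = 5
) -> Dict[str, str]:
    """Return Airflow env vars that cap task concurrency and DB pool size.

    Assumes Airflow 2.3+ option names; values are illustrative only.
    """
    if max_tasks < 1:
        raise ValueError("max_tasks must be at least 1")
    return {
        # Upper bound on task instances running at once across the installation.
        "AIRFLOW__CORE__PARALLELISM": str(max_tasks),
        # Upper bound on concurrently running tasks within a single DAG.
        "AIRFLOW__CORE__MAX_ACTIVE_TASKS_PER_DAG": str(max_tasks),
        # SQLAlchemy connection pool limits for the metadata database.
        "AIRFLOW__DATABASE__SQL_ALCHEMY_POOL_SIZE": str(pool_size),
        "AIRFLOW__DATABASE__SQL_ALCHEMY_MAX_OVERFLOW": str(max_overflow),
        # Check pooled connections before use so stale ones are replaced.
        "AIRFLOW__DATABASE__SQL_ALCHEMY_POOL_PRE_PING": "True",
    }
```

For example, airflow_concurrency_env(6) corresponds roughly to the stable 5–6 task configuration described above, while airflow_concurrency_env(14) corresponds to the unstable one.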