Refactoring ELT Pipeline with Effect and Addressing Back Pressure Issues
Streaming question... I'm currently refactoring part of a ELT pipeline with Effect. I'm reading from a high-throughput gRPC stream, doing some minor batching & aggregation & computation within the stream as well as tracking a cursor (to restart the stream after a failure with "exactly once" guarantees). That was all working fine already until I ran into some back pressure issues. I've now added a queue. Trying to verify whether my configuration makes sense.
My approach so far:
- Ingestion (Fiber #1):
1. Read from gRPC stream, filter & compute messages, track "local" cursor in a Ref for recoverable restarts
2. Handle ingestion stream errors here and restarts the gRPC stream on recoverable failures
3. Write processed messages (containing cursors) into queue (destined for storage)
- Storage (Fiber #2):
1. Read from shared queue
2. Batch write into storage and persist the cursor (transactional)
If Fiber #2 fails, restart the whole program from the last persisted cursor
So far I'm satisfied but wondering if I'm missing any idiomatic patterns or anything
My approach so far:
- Ingestion (Fiber #1):
1. Read from gRPC stream, filter & compute messages, track "local" cursor in a Ref for recoverable restarts
2. Handle ingestion stream errors here and restarts the gRPC stream on recoverable failures
3. Write processed messages (containing cursors) into queue (destined for storage)
- Storage (Fiber #2):
1. Read from shared queue
2. Batch write into storage and persist the cursor (transactional)
If Fiber #2 fails, restart the whole program from the last persisted cursor
So far I'm satisfied but wondering if I'm missing any idiomatic patterns or anything
