Hi everyone, I'm seeing a similar issue as @mau . I'm sending events via a worker binding, but I don't see them landing in R2 (the sink). The configuration is pretty close to the default one, the schema is pretty simple, and the sink is configured to use Parquet format with zstd compression. I've seen a couple of errors returned by the send() call ("Unhandled error in RPC"), but that doesn't explain why all (millions of) events are missing. The pipeline dashboard shows 0B for Data In (Metrics tab), which doesn't seem right. I've used legacy pipelines and data was flowing through (although with some issues of its own).
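For context, the sending path looks roughly like this. This is a minimal sketch, assuming a Workers pipeline binding named PIPELINE whose send() accepts an array of JSON-serializable events; the Env shape and field names are placeholders, not taken from this thread.

```ts
// Hypothetical binding shape: a pipeline binding exposing send().
export interface Env {
  PIPELINE: { send(events: object[]): Promise<void> };
}

export default {
  async fetch(_request: Request, env: Env): Promise<Response> {
    // Illustrative event matching a simple schema (field names are made up).
    const event = {
      user_id: "u_123",
      event_type: "page_view",
      ts: new Date().toISOString(),
    };
    try {
      // send() takes a batch (array) of events.
      await env.PIPELINE.send([event]);
      return new Response("ok", { status: 200 });
    } catch (err) {
      // This is where errors like "Unhandled error in RPC" would surface.
      console.error("pipeline send failed:", err);
      return new Response("send failed", { status: 500 });
    }
  },
};
```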
Sorry for the issues. If you post your pipeline id I can look into it.
Hey Micah, I've sent the pipeline id in the DM
Thanks, this is actually a different issue. We're investigating and will get back to you.
Hey @aleksatr we've deployed a fix for this bug and your pipeline should now be working
Hey @Micah | Data Platform , thanks, I see that events are flowing through! I assume I should retry sending the event when I get "Unhandled error in RPC"? Also, can I ask for an increase in the ingest rate for the corresponding stream?
Yes, if you get an error like "unhandled error in RPC" (or any other 5xx error) you should retry. Events are not guaranteed to be durably committed until you receive a 200 back. We're working on improving the stability of the ingestion endpoint under load as well.
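A minimal sketch of that retry advice, assuming the same send() binding shape as above; the helper name, backoff values, and attempt cap are illustrative, not part of any official API.

```ts
// Retry a pipeline send with exponential backoff: a failed send() (e.g.
// "Unhandled error in RPC" or another 5xx) means the events are not yet
// durably committed, so we try again.
async function sendWithRetry(
  pipeline: { send(events: object[]): Promise<void> },
  events: object[],
  maxAttempts = 5,
): Promise<void> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      // Only a successful return means the batch was accepted.
      await pipeline.send(events);
      return;
    } catch (err) {
      lastError = err;
      // Exponential backoff with a little jitter before the next attempt.
      const delayMs = Math.min(1000 * 2 ** (attempt - 1), 10_000);
      await new Promise((resolve) =>
        setTimeout(resolve, delayMs + Math.random() * 100),
      );
    }
  }
  // Surface the failure so the caller can decide what to do (e.g. queue it).
  throw lastError;
}
```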
What total throughput would you be targeting for your stream?
Hi @Micah | Data Platform how high can we go? We're currently hitting the limit of 5MB/s and Q4 is just starting to ramp up. Also, we're expecting to grow significantly in the near future, so can we push this limit close to 100 MB/s?
We can bump you to 50 MB/s now. Can you send me the stream id?
Thanks! I've sent the stream id in DM
Hey @aleksatr, we've upped the limit for that stream to 50 MB/s
Thanks @Micah Wylde , it seems that something went wrong with this stream/pipeline around 6 AM UTC: only a few files have been written to the R2 bucket since then, and it completely stopped producing output after 8 AM UTC. Can you check whether there are any indications of what is going on internally?

Hey @aleksatr, the pipeline is back up and running. Sorry again for the issues. We've identified a bug related to how we write multipart uploads to R2 that the pipeline hit. We've recovered the pipeline in the meantime and are working on the actual fix, and we'll keep you updated.
thanks for looking into it
hey @Micah Wylde , I'm looking at the "Events Dropped" chart in the "Metrics" tab of the pipeline UI. Is there some way to observe why events are dropped? The tooltip says "Messages that failed deserialization", but is there a way to see some logs that would indicate why some messages fail? I'm sending events from my worker to both the CF Pipeline and a custom-built pipeline, and I'm observing differences in event counts coming out of the two pipelines. I suspect that, at least in part, this difference can be explained by the dropped events.

We've deployed the fix for the issue (if you're curious, it's this: https://github.com/ArroyoSystems/arroyo/pull/954)
Not currently, but we know this is a huge need. Common reasons why messages fail to deserialize:
* required fields that are missing in the data
* data that can't be deserialized into the desired type (like timestamps with invalid formats)
* numeric values that are too big for their type (like a number >2B in an int32)
We'll be rolling out two features in the near future to help with this: more useful logging to tell you why events are failing to deserialize, and a dead-letter queue so you can inspect and potentially reprocess failing events
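To make those failure modes concrete, here is an illustrative sketch against a hypothetical schema (a required string user_id, an int32 count, and a timestamp ts); none of these field names or the guard function come from the thread.

```ts
// Events of the kinds listed above that would show up under "Events Dropped".

// Dropped: required "user_id" is absent.
const missingRequiredField = {
  count: 1,
  ts: "2024-10-01T12:00:00Z",
};

// Dropped: timestamp not in a format the schema's timestamp type accepts.
const badTimestampFormat = {
  user_id: "u_123",
  count: 1,
  ts: "01/10/2024 12:00",
};

// Dropped: value exceeds int32 max (2,147,483,647).
const int32Overflow = {
  user_id: "u_123",
  count: 3_000_000_000,
  ts: "2024-10-01T12:00:00Z",
};

// A cheap client-side guard before send() can catch these early
// (the ISO-timestamp regex is a rough check; the real rule depends on the schema).
function looksValid(e: { user_id?: unknown; count?: unknown; ts?: unknown }): boolean {
  return (
    typeof e.user_id === "string" &&
    typeof e.count === "number" &&
    Number.isInteger(e.count) &&
    Math.abs(e.count) <= 2_147_483_647 &&
    typeof e.ts === "string" &&
    /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}/.test(e.ts)
  );
}

// All three examples fail the guard:
console.log(
  [missingRequiredField, badTimestampFormat, int32Overflow].map(looksValid),
); // [false, false, false]
```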
Hi @Micah Wylde
My pipeline isn't working again, could you help me troubleshoot it?