Attempt failed due to internal workflows error

Hey guys, I'm seeing a crazy number of random "Attempt failed due to internal workflows error" failures across various Workflow steps. How can I get more information about what's causing them? I've trawled the logs but can't see anything abnormal.
16 Replies
avenceslau•2mo ago
Hey, can you provide some more info:
- Is this something recent, or has it been happening for a while?
- Does it happen on every kind of step, or is there some pattern (for example, steps that run for a similar amount of time)?
Also, if you can, please run the /link command so that I can see your accountId.
BlockMinisterOP•2mo ago
/link
This has been happening today, since I've been ramping up usage. It seems to happen on steps that run for varying amounts of time, but generally on ones that run longer than a minute or so...
https://dash.cloudflare.com/c7d39e916329aa3713ce11211af021cd/workers/workflows/analysis-workflow/instance/ae9b5d95-f14c-497d-86a1-856bf452fcad
The step with 5 retries has a couple of these issues.
This is an example of another type of step with the same issue...
https://dash.cloudflare.com/c7d39e916329aa3713ce11211af021cd/workers/workflows/analysis-workflow/instance/f31a079f-cdf7-4b6d-b710-946bb055ad15
And another here...
https://dash.cloudflare.com/c7d39e916329aa3713ce11211af021cd/workers/workflows/analysis-workflow/instance/1b889d0f-9355-4136-8e9e-ccc843ea244e
Thanks for looking into this... I've checked the logs as far as I can, and there is no pattern I can see.
avenceslau•2mo ago
Okay thank you, I will investigate. If I need anything else I will let you know
.r20•2mo ago
Yeah I got this issue as well
Ayan Patra•2mo ago
Hi, I was getting the same error yesterday:
WorkflowInternalError: Attempt failed due to internal workflows error
It is not happening today. Is there anything we can do as a precaution, such as disabling retries, to get around the error?
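On the retry question, a minimal sketch of a per-step retry policy, assuming the `retries` and `timeout` step options described in the Workflows documentation. The `StepConfig` type is a local mirror written for this example rather than imported from the runtime, and `buildStepConfig` is a hypothetical helper name. Nothing in this thread confirms how internal errors count against the retry limit, so treat the values as placeholders.

```typescript
// Local mirror of the step options shape (assumption: matches the
// `retries` / `timeout` fields accepted as the second argument of step.do).
type Backoff = "constant" | "linear" | "exponential";

interface StepConfig {
  retries: { limit: number; delay: string | number; backoff?: Backoff };
  timeout?: string | number;
}

// Hypothetical helper: builds a bounded retry policy with exponential backoff.
function buildStepConfig(limit: number, timeout: string = "10 minutes"): StepConfig {
  if (!Number.isInteger(limit) || limit < 0) {
    throw new RangeError("limit must be a non-negative integer");
  }
  return {
    retries: { limit, delay: "10 seconds", backoff: "exponential" },
    timeout,
  };
}

// Intended use inside a Workflow's run() method (not executed here):
//   await step.do("analyse", buildStepConfig(5), async () => { /* ... */ });
const config = buildStepConfig(5);
console.log(JSON.stringify(config));
```

Keeping the policy in one helper makes it easier to tune the limit per step while an intermittent failure is being investigated.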
Ayan Patra•2mo ago
I'm still getting it now.
avenceslau•2mo ago
We have shipped a fix that will decrease the number of "Attempt failed due to internal workflows error" failures.
BlockMinisterOP•2mo ago
Definitely seem to be seeing fewer of those errors now, so thank you.
Oscar🏖•2mo ago
We're still seeing quite a few of these today, a slight decrease from a couple days ago, but still quite a lot
Olga Silva•2mo ago
Hey, thank you for reporting. Can you provide more information on this?
- Does this happen in only one specific step of your workflow?
- Can you share the instance ID of one instance where this is still happening?
Oscar🏖•4w ago
Hi @Olga Silva, apologies for the late reply. Answering your questions:
- We're not seeing any correlation to specific steps.
- This is an example of an instance ID where it happened: ea73b90b-3796-4224-ba4d-3be9dcd4fe8e
- After observing this for a few days, we've noticed that errors spike when we deploy new versions of our Worker, even when these new versions don't modify any code having to do with the Workflows.
Hi @Olga Silva, sorry for insisting, but any updates on this? It's happening very consistently in production and we're starting to consider moving away from Workflows, as about 90% of our AI offering relies on them.
Olga Silva•4w ago
Hey, sorry for the late reply. Are you only seeing this when you're deploying your Worker? Because each time you re-deploy your Worker, your associated Workflow will be redeployed and your running workflow instances at that time will encounter these errors
Oscar🏖•4w ago
We're seeing this consistently, even when not deploying. We do see the spike on deploys, which makes sense with your explanation even if it's not ideal. We're now trying to isolate the error further; I'll keep you posted.
As a question here, how does it make sense for existing Workflows to fail on every redeploy? For instance, if I have a Workflow waiting or sleeping for 365 days, does it mean that any deploy during that year will break it?
Olga Silva•4w ago
Do you mind sharing the dash link of a workflow instance where this is happening, please?
@Oscar🏖 They won't fail on every redeploy. It depends on what your workflow is doing. If it is waiting for an event or sleeping, you won't get these internal errors, especially if you don't change the Workflow code.
To be clearer: if your workflow is in a Waiting state, new deploys won't affect it.
Oscar🏖•4w ago
Ah sounds reasonable then 🙂
https://dash.cloudflare.com/34c0d8412965b33f782ad043d255542c/workers/workflows/PipelineV2Runner-production/instance/2e13abbb-6e66-4f53-9e56-84d5e61cc1c9
We're still trying to understand what's going on here, because the "Attempt failed due to internal workflows error" seems to be Cloudflare's fault, but the bad handling of it, which causes the entire Workflow to fail with "The execution of the Workflow instance was terminated, as a step threw an NonRetryableError and it was not handled", seems to be ours.
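For the "it was not handled" part, one possible shape of user-side handling is to catch the step failure and continue with a fallback value. This is only a sketch: `StepLike` is a minimal stand-in for the step object (assuming `step.do(name, callback)` returns a promise that rejects once the step has failed), and `doOrFallback` and `failingStep` are hypothetical names used for illustration.

```typescript
// Minimal stand-in for the part of the step API used here (assumption:
// step.do(name, callback) rejects once the step has ultimately failed).
interface StepLike {
  do<T>(name: string, callback: () => Promise<T>): Promise<T>;
}

// Hypothetical helper: run a step and return a fallback value instead of
// letting the error propagate and terminate the whole instance.
async function doOrFallback<T>(
  step: StepLike,
  name: string,
  callback: () => Promise<T>,
  fallback: T,
): Promise<T> {
  try {
    return await step.do(name, callback);
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    console.warn(`step "${name}" failed: ${message}`);
    return fallback;
  }
}

// Fake step that always fails, used only to exercise the helper locally.
const failingStep: StepLike = {
  do: async () => {
    throw new Error("Attempt failed due to internal workflows error");
  },
};
```

Inside a Workflow's `run()` method the call would look like `await doOrFallback(step, "analyse", async () => analyse(), null)`, where `analyse` is a placeholder for the real step body. Whether a fallback is acceptable depends on the step; for steps whose result is required, rethrowing after logging may be the better choice.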
Olga Silva•4w ago
@Oscar🏖 I sent you a DM so we can help you further!