Workflows stuck QUEUED

We are seeing a handful of workflows occasionally get stuck in QUEUED. They never leave this state, and we have to restart them manually. This wasn't an issue until about a week ago; now it happens sporadically when we start workflows. Any idea what might be up?
avenceslau
avenceslau•2w ago
We are investigating.
ajay1495
ajay1495OP•2w ago
Thank you
Caio
Caio•2w ago
Hey 👋. We're looking into why your instances are not being picked up internally. Can you run the /link command so that we can take a closer look? It would also help if you could provide an instance ID for one of the faulty instances. Sorry for the inconvenience.
orangesbracelet
orangesbracelet•2w ago
We are also facing an issue where retrying doesn't work: even though hours have passed since the first run, it seems stuck. It was working two or three days ago.
orangesbracelet
orangesbracelet•2w ago
and it is happening for every workflow
avenceslau
avenceslau•2w ago
Hi, can you please link your account with /link?
orangesbracelet
orangesbracelet•2w ago
Okay, but can you access it? Isn't it private?
avenceslau
avenceslau•2w ago
You need to do /link here on discord 😅 I can’t
orangesbracelet
orangesbracelet•2w ago
@avenceslau is there any update?
avenceslau
avenceslau•2w ago
Yup I will have a quick look
orangesbracelet
orangesbracelet•2w ago
if it is needed I can provide other workflow links
avenceslau
avenceslau•2w ago
Could you send me a link to an instance where that has happened?
orangesbracelet
orangesbracelet•2w ago
but I already shared with you that instance
avenceslau
avenceslau•2w ago
I think you deleted it. At least I don't see it
orangesbracelet
orangesbracelet•2w ago
I shared it again just now, can you see it? I copied the URL of the instance and used it with /link.
avenceslau
avenceslau•2w ago
That does not work the way you think. Just paste it here
avenceslau
avenceslau•2w ago
Thanks will report back in a bit
orangesbracelet
orangesbracelet•2w ago
@avenceslau, is there any update? Is there anything we should do, @avenceslau?
avenceslau
avenceslau•2w ago
You have to give us some time to investigate. And please don't ping us. Hey, can you tell me roughly how many instances you have running on this workflow?
orangesbracelet
orangesbracelet•2w ago
At most about 30 per day. The problem still occurs: today only one instance ran, and even so it is still stuck retrying: https://dash.cloudflare.com/d717a4f9813d81c0515ede7c76004bd1/workers/workflows/MeetingSummary/instance/GLFRuniao_de_Elegibilidade_do_Visto_L1_41hsu1
avenceslau
avenceslau•2w ago
Hey, I just DM'ed you, can you please check?
ajay1495
ajay1495OP•7d ago
@avenceslau | Workflows @Caio we are continuing to see this issue across our workflows
ajay1495
ajay1495OP•7d ago
It's impacting our customers. We would appreciate any update you can provide; this is obviously very concerning for us, and we would really like to stay on Cloudflare Workflows.
avenceslau
avenceslau•7d ago
Sorry about that, we are taking a look at what’s wrong.
ajay1495
ajay1495OP•7d ago
Thank you. About half our customers were affected
avenceslau
avenceslau•7d ago
Which workflow is this instance from?
ajay1495
ajay1495OP•7d ago
https://dash.cloudflare.com/470d4729e23e8936fd2a8f6569770873/workers/workflows/poll-database-workflow/instance/f79fb9d5-bcdf-4663-b3cd-aafced457b97 is another instance of this. But what's really weird is that as soon as I load the page, it's like it "notices" it was stuck and then resumes. I assume you're able to pull the workflow from that URL, but let me know if not.
ajay1495
ajay1495OP•7d ago
According to our logs, it was stuck in QUEUED for about a day. But then as soon as I loaded the status page there, it seemed to "wake up".
avenceslau
avenceslau•7d ago
Was this one "stuck" for a day? What about this one?
ajay1495
ajay1495OP•7d ago
Yes they were both stuck
ajay1495
ajay1495OP•7d ago
From our internal dashboard: it was stuck in that state for about 1 day (since yesterday at about 11 AM EST). For context: they are polling jobs (they poll, wait 1 minute, then recursively invoke). When they start, we log an event in our DB, and when they restart, we also log an event in our DB. That's how we're able to detect when they get stuck: we see a RESTART event with an instance ID, but we don't see a START event with that instance ID.
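The detection rule described in the message above (a RESTART event with no matching START event for the same instance ID) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Event` record, the `find_stuck_instances` helper, and the `grace` threshold are hypothetical names that do not appear in the thread, and it is assumed that RESTART is logged when the new instance is requested and START is logged once that instance actually begins running.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    """Hypothetical row from the event log described in the thread."""
    kind: str          # "RESTART" or "START" (assumed event names)
    instance_id: str
    at: datetime


def find_stuck_instances(events, now, grace=timedelta(minutes=5)):
    """Return IDs that have a RESTART event but no START event.

    `grace` is an assumed tolerance so that instances which were only
    just requested are not reported as stuck. Results are ordered from
    the oldest RESTART to the newest.
    """
    started = {e.instance_id for e in events if e.kind == "START"}
    oldest_restart = {}
    for e in events:
        if e.kind != "RESTART" or e.instance_id in started:
            continue
        previous = oldest_restart.get(e.instance_id)
        if previous is None or e.at < previous:
            oldest_restart[e.instance_id] = e.at
    return sorted(
        (iid for iid, at in oldest_restart.items() if now - at > grace),
        key=oldest_restart.get,
    )


# Illustrative data only; the IDs and timestamps are made up.
now = datetime(2025, 1, 1, 12, 0)
events = [
    Event("RESTART", "a", now - timedelta(hours=13)),    # never started
    Event("RESTART", "b", now - timedelta(minutes=30)),
    Event("START", "b", now - timedelta(minutes=29)),    # started normally
    Event("RESTART", "c", now - timedelta(minutes=1)),   # still within grace
]
print(find_stuck_instances(events, now))  # ['a']
```

A check like this only reports what the event log shows; it says nothing about why an instance stayed in QUEUED.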
avenceslau
avenceslau•7d ago
Has this happened with any instances from today? (We rolled out a mitigation for this kind of error.)
ajay1495
ajay1495OP•7d ago
I see one instance of this at 3 AM EST today (13 hours ago).
ajay1495
ajay1495OP•7d ago
https://dash.cloudflare.com/470d4729e23e8936fd2a8f6569770873/workers/workflows/poll-database-workflow/instance/449c661d-b99d-4aa9-8b00-5b491f8feed8 (which, as I noted before, started only after I navigated to that page, having been stuck in QUEUED for 13 hours)
avenceslau
avenceslau•7d ago
We rolled out this change 5h ago; instances created after that should not be affected.
ajay1495
ajay1495OP•7d ago
Okay, as in we shouldn't see any more workflow instances stuck in that QUEUED state?
avenceslau
avenceslau•7d ago
New ones, yes. Instances that were created before today might still be affected.
ajay1495
ajay1495OP•7d ago
Okay thank you. Will keep an eye on it and let you know if we see it again!
avenceslau
avenceslau•7d ago
Yup do let me know
ajay1495
ajay1495OP•7d ago
We are expecting a ton of traffic on Tuesday this coming week, so hopefully this will be ironed out by then. Thanks again @avenceslau | Workflows, nice to know you guys got our back!
ajay1495
ajay1495OP•5d ago
They were stuck in QUEUED for about 18 minutes each. As soon as we noticed them and navigated to the status pages, they started. You can check the background time and wall time to confirm (~18 minutes)
avenceslau
avenceslau•5d ago
Just to keep you posted, we are investigating.
andrew
andrew•4d ago
great, thank you. (ajay and i work together)
