Status: stuck
Any idea why this happens?
Error: Crawl job failed or was stopped. Status: stuck
7 Replies
I also run into error 408 from time to time
I also just started seeing "stuck" as the status for my crawl job. The URL I'm trying to crawl is: https://nl.edu/undergraduate-college/programs/
Code is:
    crawl_url = "https://nl.edu/undergraduate-college/programs"
    params = {
        "crawlerOptions": {
            "limit": 500,
            "maxDepth": 5,
            "ignoreSitemap": True,
            "ignoreRobots": True,
        },
        "pageOptions": {
            "onlyMainContent": True,
            "parsePDF": True,
            "removeTags": ["script", "style", "nav", "header", "footer",
                           ".advertisement", ".sidebar", ".nav", ".menu",
                           "#comments", "img", "svg", "iframe", "video",
                           "audio"]
        },
    }
    urls = []
    job_id = app.crawl_url(crawl_url, params=params, wait_until_done=False)
Yep, I'm seeing it happen in roughly 10-15% of runs.
Hey all! Apologies. We're about to push a fix that should resolve this issue. There was a bug in our queuing system that was caused by a rapid uptick in usage. The fix should be online within the next hour or so.
Thanks @Caleb! Although I'm still seeing this error quite frequently
The 408s, or status "stuck"?
Status "stuck". I've added up to 4-5 manual retries each time this happens.
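For anyone else retrying by hand when a job reports "stuck", here is a rough sketch of how that could be automated. It is only an illustration, not an official client feature: `crawl_with_retries`, `start_job` and `get_status` are hypothetical names, the two callables are assumed to wrap whatever your SDK version uses to start a crawl and read its status, and the status strings other than "stuck" ("completed", "failed") are assumptions.

```python
import time
from typing import Callable

# Statuses that cause the current job to be abandoned and a new one started.
# "stuck" comes from this thread; "failed" is an assumed status string.
RETRYABLE = {"stuck", "failed"}


def crawl_with_retries(
    start_job: Callable[[], str],        # hypothetical: starts a crawl, returns a job id
    get_status: Callable[[str], str],    # hypothetical: returns the job's status string
    max_retries: int = 5,
    max_polls: int = 300,
    poll_interval: float = 2.0,
    backoff: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Start a crawl, poll it, and restart it if it gets stuck.

    Returns the id of the job that completed; raises RuntimeError
    if no attempt completes within max_retries.
    """
    for attempt in range(1, max_retries + 1):
        job_id = start_job()
        for _ in range(max_polls):
            status = get_status(job_id)
            if status == "completed":  # assumed name of the success status
                return job_id
            if status in RETRYABLE:
                break  # give up on this job and start a fresh one
            sleep(poll_interval)  # still running, check again shortly
        if attempt < max_retries:
            sleep(backoff * attempt)  # wait a bit longer before each new attempt
    raise RuntimeError(f"crawl did not complete after {max_retries} attempts")
```

With the snippet earlier in the thread, `start_job` could be a small function that calls `app.crawl_url(crawl_url, params=params, wait_until_done=False)` and returns the job id, and `get_status` one that looks up the status for that id. The growing wait between attempts is there so retries don't pile more load onto a queue that is already struggling.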