Can't retrieve results from a crawl job

client.get_crawl_status(crawl_job_id) has been stuck for 25 minutes. I also tried to download the results from the website UI, but that seems stuck as well, even though the job is marked as completed. JOB_ID = 019acae3-c1ea-712d-a07f-8f0bdd3e127f
Gaurav Chadha · 2w ago
Hi @Romain, I checked the JOB ID details from the backend. The Python SDK auto-paginates through all results by default. Your crawl has ~3000+ documents, so it makes ~30+ sequential HTTP requests to fetch everything. That is why client.get_crawl_status() appears stuck: it is slowly fetching every page. As a temporary fix, you can try setting auto_paginate=False.
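A rough sketch of that workaround, assuming get_crawl_status() accepts auto_paginate as a keyword argument and that the response exposes the documents in .data plus a .next pointer for manual paging (these details are assumptions based on this thread, not confirmed SDK behaviour; next_url below is a hypothetical parameter name):

```python
# Sketch only. `client` is the already-initialized SDK client from the original post.
# Assumptions: get_crawl_status() takes auto_paginate; the response has `.data`
# (documents on this page) and `.next` (cursor/URL for the following page).
import time

JOB_ID = "019acae3-c1ea-712d-a07f-8f0bdd3e127f"

# Fetch only the first page instead of letting the SDK make ~30+ blocking requests.
status = client.get_crawl_status(JOB_ID, auto_paginate=False)
documents = list(status.data)

# Hypothetical manual pagination loop: follow the `next` pointer page by page,
# so progress is visible and a single slow request can't stall everything.
while getattr(status, "next", None):
    status = client.get_crawl_status(JOB_ID, auto_paginate=False, next_url=status.next)
    documents.extend(status.data)
    time.sleep(0.5)  # brief pause between pages to avoid hammering the API

print(f"Fetched {len(documents)} documents")
```

Fetching page by page like this also makes it obvious whether the call is actually progressing or genuinely hung.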
Romain (OP) · 2w ago
Thanks for the answer. It has been running for 100 minutes now and there are still no results, and I really need all of them. I will try auto_paginate=False on Monday, after the weekend; I hope the job results will still be available by then. Also, I have noticed that I should have set ignore_query_parameters=True (it defaults to False), because most of the URLs scraped are irrelevant and I lost my 3k credits because of this...
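For future runs, a rough sketch of setting that option when the crawl is started (the start_crawl method name here is hypothetical; only the ignore_query_parameters name comes from this thread):

```python
# Sketch only: `start_crawl` is a hypothetical method name for whichever call
# kicks off the crawl. The point is passing ignore_query_parameters=True so
# URLs that differ only by query string (e.g. ?utm_source=..., ?page=2) are
# treated as the same page and don't consume credits on near-duplicates.
job = client.start_crawl(
    "https://example.com",
    ignore_query_parameters=True,  # default is False per this thread
)
print(job.id)
```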
Gaurav Chadha · 2w ago
Oh, the job has already failed. Could you please reach out via the in-app chat? I can re-add the credits used for this failed job.
