RequestQueue doesn't delete requests after visiting and saving data
Hi, I'm working with Crawlee and Playwright. I've noticed that requests aren't being removed from the queue even though the links have already been visited and scraped. Am I missing a configuration option or something?
xenial-blackOP•3y ago
The queue looks like this even after all of these requests have already been visited:
[screenshot: request queue contents]
xenial-blackOP•3y ago
My default router (I have one other router for the DETAILS requests, but it does not enqueue links):
[screenshot: default router handler code]
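(Since the screenshot didn't come through: here is a minimal sketch of what a default router handler like this usually looks like in Crawlee. The selector and the DETAILS label are assumptions for illustration, not the OP's actual code.)
```ts
import { createPlaywrightRouter } from 'crawlee';

export const router = createPlaywrightRouter();

// Default handler: runs for every request that has no label.
router.addDefaultHandler(async ({ request, enqueueLinks, log }) => {
    log.info(`Enqueueing links from ${request.url}`);
    // enqueueLinks deduplicates automatically: a URL that is already
    // in the queue (or was already handled) is not added again.
    await enqueueLinks({
        selector: 'a.item-link', // hypothetical selector
        label: 'DETAILS',
    });
});
```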
xenial-blackOP•3y ago
It also seems like it dumps the requests back into the queue after scraping...
[screenshot: log output]
xenial-blackOP•3y ago
Follow-up: my mistake. In one case there was no <p> element, so the selector found nothing, which is why that request was failing.
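(For anyone who hits the same thing, a minimal sketch of guarding against a missing element, assuming a DETAILS handler roughly like the one above; the dataset fields are placeholders.)
```ts
router.addHandler('DETAILS', async ({ page, request, log, pushData }) => {
    // Guard against pages that have no <p> at all, so one odd page
    // doesn't fail the whole request.
    const paragraph = page.locator('p').first();
    const description = (await paragraph.count()) > 0
        ? await paragraph.textContent()
        : null;

    if (description === null) {
        log.warning(`No <p> element found on ${request.url}`);
    }

    await pushData({ url: request.url, description });
});
```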
correct-apricot•3y ago
It's a feature: requests are stored per run so that each unique URL is processed only once. That way you can enqueue all of a site's sublinks without going into an endless scraping loop. Handled requests aren't deleted from the queue; they're just marked as handled.
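(A small sketch of that deduplication, using the RequestQueue API directly; the URLs are placeholders.)
```ts
import { RequestQueue } from 'crawlee';

const queue = await RequestQueue.open();

// Adding the same URL twice: the second call is a no-op.
await queue.addRequest({ url: 'https://example.com/page' });
const info = await queue.addRequest({ url: 'https://example.com/page' });
console.log(info.wasAlreadyPresent); // true

// To visit the same URL again on purpose, give it a distinct uniqueKey.
await queue.addRequest({
    url: 'https://example.com/page',
    uniqueKey: 'https://example.com/page#second-pass',
});
```
Between runs, local storage is purged on start by default (the purgeOnStart configuration option), so handled requests from a previous run won't block a fresh crawl.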
xenial-blackOP•3y ago
Thanks! It took me a while, but I read the documentation and realized that was the case.