Crawlee & Apify•8mo ago

crawler.run only scrapes the first URL

Hi, my problem is that crawler.run(['https://keepa.com/#!product/4-B07GS6ZB7T', 'https://keepa.com/#!product/4-B0BZSWWK48']) only scrapes the first URL. I think this is because Crawlee treats them as the same URL. If I replace the "#" with a "?", it works. Is there any way to make it work with URLs like these?


3 Replies


fascinating-indigo•8mo ago

Hi @FoudreTower. The #! fragment is used for client-side navigation only, and Crawlee drops the fragment when it computes a request's uniqueKey, so the crawler sees these URLs as duplicates. When you change the # to a ?, that part is no longer a fragment, so Crawlee takes the whole URL into account when deduplicating. One way around this is to set a uniqueKey on each request when enqueuing.
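A minimal sketch of the uniqueKey approach, assuming the `crawler` instance from the question already exists. The helper name `withFragmentKeys` and the `RequestInput` type are made up for illustration; the only Crawlee-specific part is that each request is passed as an object with `url` and `uniqueKey` instead of a plain string.

```typescript
// Shape of the request objects passed to the crawler (illustrative type,
// covering only the two fields used here).
interface RequestInput {
    url: string;
    uniqueKey: string;
}

// Hypothetical helper: uses the full URL, fragment included, as the uniqueKey,
// so URLs that differ only after "#" are not treated as duplicates.
function withFragmentKeys(urls: string[]): RequestInput[] {
    return urls.map((url) => ({ url, uniqueKey: url }));
}

const requests = withFragmentKeys([
    'https://keepa.com/#!product/4-B07GS6ZB7T',
    'https://keepa.com/#!product/4-B0BZSWWK48',
]);

// With the existing crawler instance from the question:
// await crawler.run(requests);
```

Crawlee request options also include a `keepUrlFragment` flag that keeps the fragment when the uniqueKey is computed, which may be an alternative to setting the key by hand.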

provincial-silverOP•8mo ago

Thanks @Lukas Celnar, it works with uniqueKey.
