Crawl Taobao, 1688

Hi everyone! My name is Giang and I am a fresher developer. I am building a web app for ordering from Taobao, 1688, Tmall, and similar sites, but I have a big problem when crawling Taobao: I can only crawl 10 to 15 products before the anti-scraping protection blocks me. I think that if I use a proxy, the problem is that I must be logged in to see product items. I have tried using cookies, but I think that logging in from many IPs makes it easy for my account to get blocked. If anyone has experience scraping Tmall/Taobao and could offer some advice or help, that would be hugely helpful. Thanks!
1 Reply
afraid-scarlet, 2y ago
Dealing with blocks can be quite challenging: every website and project is different, and there is no universal solution that works flawlessly. Check out this link for some tips on bypassing protection: https://docs.apify.com/academy/anti-scraping Also, if you scrape while logged in, try a low concurrency (maybe 10-20) with the maxConcurrency option: https://crawlee.dev/api/cheerio-crawler/interface/CheerioCrawlerOptions#maxConcurrency or use the maxRequestsPerMinute option (experiment with the value, maybe 80-100): https://crawlee.dev/api/cheerio-crawler/interface/CheerioCrawlerOptions#maxRequestsPerMinute
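
To make the two options above concrete, here is a minimal sketch in TypeScript. The values are simply picked from the ranges suggested in the reply, not tested against Taobao. The `ThrottleOptions` interface and the `averageGapMs` helper are hypothetical names for illustration and are not part of Crawlee; only the two field names come from the linked CheerioCrawlerOptions docs.

```typescript
// Throttling values taken from the ranges suggested in the reply; tune per site.
// The field names match the linked CheerioCrawlerOptions, so (assuming Crawlee
// is installed) the object can be spread into the crawler constructor:
//   new CheerioCrawler({ ...throttle, requestHandler })
interface ThrottleOptions {
    maxConcurrency: number;       // upper bound on requests in flight at once
    maxRequestsPerMinute: number; // upper bound on requests started per minute
}

const throttle: ThrottleOptions = {
    maxConcurrency: 10,
    maxRequestsPerMinute: 80,
};

// Hypothetical helper (not part of Crawlee): the average gap between request
// starts implied by a per-minute budget, handy for sanity-checking a value.
function averageGapMs(options: ThrottleOptions): number {
    return 60000 / options.maxRequestsPerMinute;
}

console.log(averageGapMs(throttle)); // 750
```

At 80 requests per minute the crawler averages one request every 750 ms. If blocks still appear after 10 to 15 products, lowering both values further is probably the first thing to try.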
