Only cookie banner text in scrape results
I'm running into problems with cookie banners while scraping and crawling various websites. For many sites, the result contains only cookie notices or consent text instead of the actual page content, which makes the scrape largely unusable.
Following previous advice (as discussed in this Discord thread https://discord.com/channels/1226707384710332458/1226707384710332465/1251760606504288388), I have excluded only the most apparent tags associated with cookie prompts. Unfortunately, this approach has not been entirely effective, as cookie prompts still obscure much of the main content. Interestingly, when I test the scraping on Playground, I achieve excellent results on the same websites.
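To make the tag-exclusion idea concrete, here is a minimal sketch of that kind of filtering done locally on raw HTML. It uses only Python's standard library. The marker list, the class name `ConsentStripper`, and the function `extract_text_without_consent` are illustrative assumptions, not part of any scraping service's API, and the depth tracking assumes reasonably well-formed markup (unclosed tags inside a banner would throw it off).

```python
from html.parser import HTMLParser

# Assumed substrings that commonly appear in the id/class of consent banners.
COOKIE_MARKERS = ("cookie", "consent", "gdpr")

# Void elements have no closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ConsentStripper(HTMLParser):
    """Collects visible text while skipping consent-like subtrees."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.skip_depth = 0  # > 0 while inside a subtree being dropped
        self.chunks = []

    @staticmethod
    def _looks_like_consent(attrs):
        for name, value in attrs:
            if name in ("id", "class") and value:
                lowered = value.lower()
                if any(marker in lowered for marker in COOKIE_MARKERS):
                    return True
        return False

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth += 1
        elif tag in ("script", "style") or self._looks_like_consent(attrs):
            self.skip_depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            text = data.strip()
            if text:
                self.chunks.append(text)


def extract_text_without_consent(html):
    """Return the page text with consent-like elements removed."""
    parser = ConsentStripper()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

A filter like this only helps when the banner and the real content are both present in the HTML. If the page returns nothing but the consent dialog until it is accepted, excluding tags leaves an empty result, which may be part of what I am seeing.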
If there’s a more recent solution or method for effectively handling these cookie prompts, I’d be eager to learn more about it.
I've attached an example of a website along with the corresponding scraping results to demonstrate the issue. Any guidance or alternative strategies to bypass these cookie prompts and achieve consistent content extraction would be greatly appreciated.
Thank you very much for your help!
https://theoceanpackage.com/
The Ocean Package
The reusable shipping packaging of tomorrow
Experience the future with the most affordable reusable shipping packaging. Learn more about sustainable packaging!