xenial-black
What’s the fastest?
Hey all! I'm building a web scraper for leads. I need it to click a button, scrape some basic HTML, and unclick the button, all as fast as possible.
My question is: what is the fastest web scraping library? I'm currently using Selenium and Beautiful Soup. Should I switch to JS instead? Any help would be great, thanks!
5 Replies
absent-sapphire•2y ago
Hey @mikepowers, Playwright for Node.js is good, though there are other ways to scrape much faster. It depends on your requirements.
continuing-cyan•2y ago
The speed mainly depends on whether you need a browser or not. If you don't, scraping is much faster. If you do, it doesn't matter much which tool you use. See https://docs.apify.com/academy/api-scraping
API scraping | Academy | Apify Documentation
Learn all about how the professionals scrape various types of APIs with various configurations, parameters, and requirements.
xenial-blackOP•2y ago
@anon_ @Lukas Krivka I need to open a browser, click a button to get information, scrape that information, and close that button. That's for around 100 to 1k buttons, with only about 20 buttons per page. Right now the 1k runs take about 30 minutes, and I'd like something a bit faster if I can get it.
If you really need to use a browser, you could try Playwright with async in Python. Or switch to Node, where you will probably get a speed boost. But using a browser is always a slow solution.
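For the async route, here is a rough sketch of what that could look like in Python. It is only an illustration: the selectors `button.show-details` and `.details`, and the helper names `run_bounded` and `scrape_with_playwright`, are made up for this example and would need to match the real site. The idea is to process several pages at once while capping how many are open at a time.

```python
import asyncio


async def run_bounded(urls, worker, max_concurrency=5):
    """Run worker(url) for every URL, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(url):
        async with semaphore:
            return await worker(url)

    # gather returns results in the same order as the input URLs.
    return await asyncio.gather(*(guarded(u) for u in urls))


async def scrape_with_playwright(urls):
    """Hypothetical worker: click each button on a page and read its panel."""
    # Imported here so run_bounded stays usable without Playwright installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)

        async def worker(url):
            page = await browser.new_page()
            try:
                await page.goto(url)
                buttons = page.locator("button.show-details")  # assumed selector
                results = []
                for i in range(await buttons.count()):
                    await buttons.nth(i).click()  # open the panel
                    panel = page.locator(".details").first  # assumed selector
                    results.append(await panel.inner_text())
                    await buttons.nth(i).click()  # close it again
                return results
            finally:
                await page.close()

        return await run_bounded(urls, worker)
```

You would call it with something like `asyncio.run(scrape_with_playwright(list_of_page_urls))`. The concurrency cap is a guess; too many pages at once can get you rate limited or exhaust memory.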
But from your description I'm not sure you need a browser. Have you investigated what happens when the button is clicked? Maybe the site sends a request to the server at that point and you can emulate that request. Maybe the data is being loaded earlier?
It's likely that you're using a browser when you could get by with a faster, cheaper solution.
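To make the "emulate that request" idea concrete, here is a minimal sketch using only the Python standard library. Everything site-specific is an assumption: the endpoint `https://example.com/api/details`, the `id` query parameter, the headers, and a JSON response are placeholders for whatever shows up in the browser's Network tab when the button is clicked.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint: replace with the URL seen in the Network tab.
API_URL = "https://example.com/api/details"


def build_request(item_id):
    """Build the GET request the button click would trigger (assumed shape)."""
    query = urllib.parse.urlencode({"id": item_id})
    return urllib.request.Request(
        f"{API_URL}?{query}",
        headers={
            "Accept": "application/json",
            # Some sites check this header on XHR calls; copy what the browser sends.
            "X-Requested-With": "XMLHttpRequest",
        },
    )


def parse_detail(body):
    """Decode a JSON response body into a dict (assumes the API returns JSON)."""
    return json.loads(body.decode("utf-8"))


def fetch_detail(item_id, timeout=10.0):
    """Send the request and return the parsed data, with no browser involved."""
    with urllib.request.urlopen(build_request(item_id), timeout=timeout) as resp:
        return parse_detail(resp.read())
```

If the site really does load the data this way, each button becomes one small HTTP call instead of a click, a wait, and a second click in a full browser.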
absent-sapphire•2y ago
@mikepowers, I agree with @Mantisus: check the requests being made and replicate them. That would be much faster than any browser.