How can I use the Playwright Crawler and BeautifulSoup Crawler in the same Actor?

I want Playwright to fill in and submit a website search page that relies on dynamic JavaScript. Once the results are shown, I want to use the BeautifulSoup crawler to open each product page and parse the information, because opening each product page with Playwright takes a very long time. However, I cannot seem to run both crawlers at the same time.
8 Replies
rare-sapphire•8mo ago
GitHub
Running different requests with different crawlers? · apify crawlee...
I'm trying to solve a situation where I want to make the initial request with a plain crawler (because it's an API or something), but continue with subsequent requests to detail pages with ...
rare-sapphire•8mo ago
The GitHub discussion linked above contains the answer.
deep-jade•8mo ago
I want to build my own Actor with Playwright and BeautifulSoup, and this is exactly the solution I am looking for. First I want to send an HTTP request, get the HTML, and parse the data with BeautifulSoup; then I want to open the links obtained from that parsing using Playwright. Correct me if I am wrong: first use Python with BeautifulSoup to get the results, then use those results with Playwright. So do we have to create and build two different Actors for this?
Mantisus•8mo ago
Hi @Abdul. The discussion above concerns the use of crawlee-python. In an Actor, you can combine an HTTP client + BeautifulSoup with Playwright, either within a single Actor or by using a combination of two Actors.
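For readers looking for a starting point, here is a minimal sketch of the single-process variant in plain Python, not taken from the thread: Playwright is used only to submit the JavaScript-driven search form and collect product links, and the product pages are then fetched with an HTTP client and parsed with BeautifulSoup. The choice of `httpx` as the HTTP client, the example URL, the CSS selectors (`input[name='q']`, `a.product-link`, `h1`), and all function names are assumptions for illustration and need to be adapted to the target site.

```python
import asyncio
from urllib.parse import urljoin


def absolutize(base_url: str, hrefs: list) -> list:
    """Resolve hrefs against the results page URL, dropping empties and duplicates."""
    seen = {}
    for href in hrefs:
        if href:
            seen.setdefault(urljoin(base_url, href), None)
    return list(seen)


async def collect_product_links(search_url: str, query: str) -> list:
    """Phase 1: use the browser only for the JavaScript-driven search form."""
    from playwright.async_api import async_playwright  # third-party

    async with async_playwright() as pw:
        browser = await pw.chromium.launch(headless=True)
        try:
            page = await browser.new_page()
            await page.goto(search_url)
            await page.fill("input[name='q']", query)       # hypothetical selector
            await page.press("input[name='q']", "Enter")
            await page.wait_for_selector("a.product-link")  # hypothetical selector
            hrefs = await page.eval_on_selector_all(
                "a.product-link", "els => els.map(e => e.getAttribute('href'))"
            )
            return absolutize(page.url, hrefs)
        finally:
            await browser.close()


def parse_product(html: str) -> dict:
    """Extract fields from one product page; only the title is shown here."""
    from bs4 import BeautifulSoup  # third-party

    soup = BeautifulSoup(html, "html.parser")
    title = soup.select_one("h1")  # hypothetical selector
    return {"title": title.get_text(strip=True) if title else None}


async def scrape_products(urls: list, concurrency: int = 10) -> list:
    """Phase 2: fetch product pages over plain HTTP, a few at a time."""
    import httpx  # third-party; assumed HTTP client

    limit = asyncio.Semaphore(concurrency)
    async with httpx.AsyncClient(follow_redirects=True, timeout=30) as client:

        async def fetch(url: str) -> dict:
            async with limit:
                response = await client.get(url)
                response.raise_for_status()
                return {"url": url, **parse_product(response.text)}

        return await asyncio.gather(*(fetch(u) for u in urls))


async def main() -> None:
    # Placeholder URL and query; replace with the real search page.
    links = await collect_product_links("https://example.com/search", "laptop")
    for item in await scrape_products(links):
        print(item)
```

Running `asyncio.run(main())` executes both phases in order. Because the browser is closed before the product pages are requested, the slow part (Playwright) is limited to a single page load, which is the point of splitting the work this way. This only helps if the product pages themselves are readable without JavaScript.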
deep-jade•8mo ago
Thanks for clarifying. Do you have anything that would help me get started on an Actor with an HTTP client + BeautifulSoup + Playwright?
Mantisus•8mo ago
No, I don't have any code samples like that, since I don't usually use Playwright or browser automation. But writing such an Actor is not much different from writing an ordinary scraper with that combination of tools. Refer to the official documentation to see how Playwright is instantiated in an Actor: https://docs.apify.com/sdk/python/docs/guides/playwright
Adding an HTTP client + BeautifulSoup integration on top of that should not be a problem.
Using Playwright | SDK for Python | Apify Documentation
Playwright is a tool for web automation and testing that can also be used for web scraping.
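As a rough illustration of what "adding on top" could look like, the sketch below wraps a two-phase scraper in the Apify SDK for Python. It is not code from the thread or from the linked guide. The input field names (`searchUrl`, `query`, `maxProducts`), the helper `read_settings`, and the two callables passed to `run_actor` are hypothetical; the callables stand in for a Playwright-based link collector and an HTTP + BeautifulSoup page scraper that you would write yourself.

```python
from typing import Awaitable, Callable, Optional

# Hypothetical input fields; a real Actor defines these in its own input schema.
DEFAULTS = {"searchUrl": "https://example.com/search", "query": "", "maxProducts": 50}


def read_settings(actor_input: Optional[dict]) -> dict:
    """Merge Actor input over the defaults, ignoring unknown or null fields."""
    settings = dict(DEFAULTS)
    for key, value in (actor_input or {}).items():
        if key in DEFAULTS and value is not None:
            settings[key] = value
    settings["maxProducts"] = max(0, int(settings["maxProducts"]))
    return settings


async def run_actor(
    collect_links: Callable[[str, str], Awaitable[list]],
    scrape_pages: Callable[[list], Awaitable[list]],
) -> None:
    """Run both phases inside one Actor and store the results in its dataset."""
    from apify import Actor  # third-party: Apify SDK for Python

    async with Actor:
        settings = read_settings(await Actor.get_input())

        # Phase 1 (browser): submit the search and gather product URLs.
        links = await collect_links(settings["searchUrl"], settings["query"])
        links = links[: settings["maxProducts"]]
        Actor.log.info(f"Found {len(links)} product links")

        # Phase 2 (plain HTTP): fetch and parse each product page.
        items = await scrape_pages(links)
        await Actor.push_data(items)
```

Passing the two phases in as functions keeps the Actor wiring separate from the scraping logic, so each phase can be tested locally without the Apify platform. Splitting the work into two Actors instead would mean the first Actor pushes the links to a dataset and the second reads them as input.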
