Page.goto never resolves in headful mode (under Xvfb) using the `apify/actor-node-puppeteer-chrome` Docker image

We are able to successfully launch the Chromium browser, but when navigating to certain pages, `page.goto` never resolves: no page load event fires (neither `load` nor any of the other events). We do not see this behavior when we run the same script (using Chromium 116, Puppeteer 21, and the latest version of Crawlee) outside of a Docker container. We also don't see it on the Apify platform using Actors, but we currently can't use the service due to our security requirements. Happy to share more detail, but we would appreciate any ideas on where to look. Thanks!
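One way to narrow down a navigation that never settles is to race `page.goto` against a hard deadline and record which requests failed in the meantime. The following is only a sketch under assumptions: `gotoWithDiagnostics` is a hypothetical helper name, the 30-second default and the `domcontentloaded` wait condition are arbitrary choices, and `page` is assumed to be a Puppeteer `Page` (only `goto`, `on`, and `off` are used).

```javascript
// Hypothetical helper (not part of Puppeteer or Crawlee): navigate with a hard
// deadline and collect the requests that failed while the navigation was pending.
async function gotoWithDiagnostics(page, url, timeoutMs = 30000) {
  const failedRequests = [];

  // Record every failed request so a stalled resource can be identified later.
  const onRequestFailed = (request) => {
    const failure = request.failure();
    failedRequests.push({
      url: request.url(),
      error: failure ? failure.errorText : 'unknown',
    });
  };
  page.on('requestfailed', onRequestFailed);

  // Independent deadline, in case goto itself never settles.
  let timer;
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`goto exceeded ${timeoutMs} ms`)),
      timeoutMs,
    );
  });

  try {
    const response = await Promise.race([
      // Wait only for DOMContentLoaded instead of the full load event.
      page.goto(url, { waitUntil: 'domcontentloaded', timeout: timeoutMs }),
      deadline,
    ]);
    return { ok: true, status: response ? response.status() : null, failedRequests };
  } catch (error) {
    return { ok: false, error: error.message, failedRequests };
  } finally {
    clearTimeout(timer);
    page.off('requestfailed', onRequestFailed);
  }
}
```

Calling `await gotoWithDiagnostics(page, url)` inside the container and again outside of it, then comparing the `failedRequests` lists, may show whether one specific resource is what stalls the load.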
2 Replies
Pepa J (2y ago)
Hi, @cookiemonster, it could be a lot of things. First, using proxies will slow down the scraping a bit. There might also be some restriction or intentional waiting when mostly datacenter proxy IPs are used, so do you have proxyConfiguration set? Are you running the Actor with a sufficiently large value in the ACTOR_MEMORY_MBYTES env variable?
helpful-purple (OP, 2y ago)
@Pepa J Thanks for the quick response. We are not using a proxyConfiguration as of now. We are not running into this issue when using an Actor executing on Apify's services, but we do run into it when running a built apify/actor-node-puppeteer-chrome Docker image locally on M2 Macs with Rosetta emulation, and also in Lambda. Our current hypothesis is that there is a memory leak that causes the container to run out of resources. So our next step is probably to try running the container cleanly on an EC2 instance to isolate the hardware variables.
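A rough way to check the memory hypothesis is to log the Node.js process's memory use next to the configured limit while the crawl runs. This is only a sketch: `memorySnapshot` is a hypothetical helper, the 10-second interval is arbitrary, and it assumes the limit is supplied through the ACTOR_MEMORY_MBYTES variable mentioned above. It measures the Node.js process only, so Chromium's own memory would still need to be watched separately, for example with `docker stats`.

```javascript
// Hypothetical helper: summarize this Node.js process's memory use against
// the limit configured in ACTOR_MEMORY_MBYTES (if any).
function memorySnapshot(env = process.env) {
  const toMb = (bytes) => Math.round(bytes / (1024 * 1024));
  const usage = process.memoryUsage();
  const limitMb = Number.parseInt(env.ACTOR_MEMORY_MBYTES ?? '', 10);
  return {
    rssMb: toMb(usage.rss),           // resident set size of the Node.js process
    heapUsedMb: toMb(usage.heapUsed), // V8 heap currently in use
    limitMb: Number.isNaN(limitMb) ? null : limitMb, // null when the variable is unset
  };
}

// Log a snapshot every 10 seconds; unref() keeps this timer from holding
// the process open once the crawl has finished.
const memoryLogTimer = setInterval(
  () => console.log(JSON.stringify(memorySnapshot())),
  10000,
);
memoryLogTimer.unref();
```

If `rssMb` climbs steadily across navigations, that would support a leak on the Node.js side; a flat line here with the container still running out of memory would point toward the browser processes instead.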
