How to add headers to `addRequests`
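One way to approach this, as a minimal sketch: Crawlee request objects accept a `headers` field next to `url`, so the headers can be attached to each request before the list is handed to `addRequests`. The `RequestInput` interface and the `withHeaders` helper below are hypothetical names used only for illustration, and the URLs and header values are placeholders.

```typescript
// Hypothetical shape mirroring the `url` and `headers` fields of a Crawlee request.
interface RequestInput {
    url: string;
    headers?: Record<string, string>;
}

// Attach the same headers to every URL; per-URL overrides win on key clashes.
function withHeaders(
    urls: string[],
    headers: Record<string, string>,
    overrides: Record<string, Record<string, string>> = {},
): RequestInput[] {
    return urls.map((url) => ({
        url,
        headers: { ...headers, ...(overrides[url] ?? {}) },
    }));
}

// Placeholder URLs and header values.
const requests = withHeaders(
    ['https://example.com/a', 'https://example.com/b'],
    { 'accept-language': 'en-US' },
    { 'https://example.com/b': { referer: 'https://example.com/a' } },
);

// With a real crawler instance (assumed, not created here) the array
// would then be passed along as:
// await crawler.addRequests(requests);
```

Building plain objects first keeps the header logic separate from the crawler, so it can be checked without launching a browser.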
File download causes: waiting until "load" error
`<a href='page.html'>link</a>` everything works fine but if it's `<a href='image.png'>link</a>` I get this error:
```
ERROR PlaywrightCrawler: Request failed and reached maximum retries. page.goto: net::ERR_ABORTED at https://mysite.com/?attachment_id=24365
=========================== logs ===========================...
```
isTaskReadyFunction failing randomly
adding other libraries
scraping at scale
How to store array of objects in the same json file?
How to launch playwrightcrawler inside basiccrawler?
chromium crashes on Docker on Mac M1
The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested 0.0s...
Playwright Crawler fails on undefined page
`createPlaywrightRouter()` function to create a router and pass it to the `requestHandler` of the `PlaywrightCrawler`. All seems well, and according to TypeScript, I should be able to access a `page` object in the handler. (I'm only using the `addDefaultHandler`.) However, when I run the actor on the Apify platform it fails with the following exception:
```
2023-02-09T15:16:45.925Z INFO PlaywrightCrawler: Start of default handler ...
```
Running default Playwright example on Docker (arm64)
`actor-node-playwright-chrome:16` image, but it fails. I'm on an M1/arm64 machine but have tried to force amd64 with the same result....
How to manually pass datasets, sessions, cookies, proxies between Requests?
Handle a 401 in errorHandler by detecting login form and gracefully continuing if present
`requestHandler`, logging in, and then continuing with the crawl.
Recently we were asked to support "logging in" to a simple password protection screen on a Netlify site....
'undefined' in DataSet it is keeping me from exporting data
Persist Puppeteer tab with page.goto
`page.goto` in a Puppeteer Crawler handler, a new browser is opened / previous one is closed, which prevents me from preserving the session. How can I make sure that the same tab is being used when I do `page.goto`?
How to increase max memory?
Stop `keepAlive` crawler after all requests are finished
`keepAlive: true` (call it A) and the second running normally (call it B), which adds requests to crawler A. After crawler B finishes I'd like to keep crawler A running until it finishes all the requests and then stop the script. I tried the `teardown` method but it stops the crawler without finishing the queue....
Need help bypassing CF 403 Blocked
Can I use modules inside the `evaluate`?
`evaluateAll` function?
For example:
```ts
const Foo = {...
```
Scrape Monthly Listeners data from Spotify page

Trying to extend Dockerfile, can't install using apt-get getting permission denied
`apt-get install`
2) using sudo says there is no sudo (so I should already be root?)...