crawlee-js
Easiest way to scrape a JSON URL?
General way to scrape blogs, articles and content?
How to disable storage directory creation?
persistStorage configuration option to false (it was mentioned in some discussion on GitHub), but it has no effect. I also tried setting defaultKeyValueStoreId, defaultRequestQueueId and defaultDatasetId for each crawler. I thought I would get a separate directory for each crawler, but Crawlee creates the storage/key_value_stores/default and storage/request_queues/default directories.
NodeJS version: 18.13.0
Crawlee version: 3.1.4...
File format issue
where to hook 'Puppeteer request interceptor'
querying data-set on filesystem - like SQL
./storage/datasets/
products
categories
...
failedRequestHandler, error argument, detailed error message lost
Using requestsFromUrl is throwing an Error

Firefox, PlaywrightCrawler, SSL_ERROR_BAD_CERT_DOMAIN error
example of manually adding requests to requestQueue
scraping different website structures
PlaywrightCrawler.requestHandler: Error: mouse.move: Target page, context or browser has been closed
mouse.move: Target page, context or browser has been closed
Here is the sequence of calls:
...
From page to End page pagination
Recover endless pagination items by clicking on showMore button
Which EC2 instance type is best suited for crawling?
Can't input in Google
Cannot use import statement outside a module
How can I get my data to be scraped faster?
Crawlee not working(?) on a page with shadow DOM