Crawlee & Apify

This is the official developer community of Apify and Crawlee.

crawlee-js

apify-platform

crawlee-python

💻hire-freelancers

🚀actor-promotion

💫feature-request

💻devs-and-apify

🗣general-chat

🎁giveaways

programming-memes

🌐apify-announcements

🕷crawlee-announcements

👥community

I'm looking forward to an API that allows me to get all comments from a public Instagram post; there's one already, but it has a max of 50 comments per post.

Hi there, any advice on scraping Twitter (X) trends for a specific location? No Actor in the marketplace seems able to collect trends. Thanks for any advice.

const jsdom = require("jsdom");
const { JSDOM } = jsdom;
// Create a new JSDOM instance
const cookieJar = new jsdom.CookieJar(undefined, {...

DM me

Hi! I am interested in a YouTube scraper tool. What I am looking for is to input a keyword or apply filters like subscribers, views, etc. ...

I want to pass Cloudflare with the jsdom module in Node.js. Not BYPASS!!!! Who can help me?...

Not really a feature request for Apify itself, but for this Discord: a general chat for any type of subject / unarchive #🗨apify-chat

Hi, I'm a real newbie here. I'd like to know how I can scrape Instagram followers and hashtags for a list of accounts to keep track of competitors. Would you know how I could do that? Thanks very much!

Hi, I've tried using the Instagram Reel Scraper but can't see the downloaded reels (e.g. mp4 files). I think the scraper doesn't download the data. Please could someone point me towards an Apify tool that does this? Many thanks!

Help

I want to pass Cloudflare using only Node.js. Who can help me?...

Hi, is there a way to favorite Actors?

Adding GPU computing to the platform would be useful

Pydantic data validations at the user level would also be great to have

I am a media buyer and I recently learned about the possibility of extracting data from Facebook ad posts. In my research I came across the Facebook Ads Scraper on Apify. However, I am struggling to understand how to use it effectively. Specifically, I am interested in retrieving the interests used in the active ads published by a Facebook page. Could you please provide guidance on how to achieve this? Thanks in advance...

Hey everyone, I would love it if we had a way to shut down crawlers from inside the request handler. I went through the docs today, and the only way to do it right now is via the crawler itself, either using crawler.teardown() or crawler.requestQueue.drop() (not sure about that one). The main use case is saving on proxy costs, stopping crawlers from redundantly scraping data, or other arbitrary conditions. I have found a workaround: setting a shutdown flag in a state (or even a variable) and checking for it inside the handlers; if it's true, I just do a return; (to empty out the queue). While this works, it adds a lot of noise in the logs (and in the code), because we need to log that we are skipping requests due to the flag for debugging purposes. I wish this were handled a little more gracefully in the scraper itself instead of every request handler checking for it...
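The flag-based workaround described above can be sketched in plain JavaScript. This is a hedged stand-in, not real Crawlee code: the `state` object plays the role of a shared state (like Crawlee's `useState()`), and the loop stands in for the crawler draining its request queue; the URLs are made up.

```javascript
// Sketch of the shutdown-flag workaround: handlers check a shared
// flag and return early, effectively draining the queue without work.
const state = { shutdown: false };

const queue = ["page1", "page2", "page3", "page4"];
const handled = [];

function requestHandler(request) {
  if (state.shutdown) {
    // Skipped silently; in practice you'd log this for debugging,
    // which is exactly the noise the poster is complaining about.
    return;
  }
  handled.push(request);
  // Arbitrary stop condition, e.g. enough data collected.
  if (handled.length >= 2) state.shutdown = true;
}

for (const request of queue) requestHandler(request);
console.log(handled); // the remaining requests were drained without work
```

A first-class API (e.g. something like the existing `crawler.teardown()` being callable from handler context) would remove the need for every handler to repeat this check.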

Hi everyone. Currently on Apify, users of an Actor can communicate with the Actor's developer via issues. However, what if the Actor's developer wants to make an announcement about a feature update or changes? Currently there isn't any straightforward way to do this, and it would be a nice-to-have feature from a developer's point of view. Thanks...

Currently I'm running the Google Crawler; I'm at 60,000 requests and noticed that there is a search term in the list I want to skip. Currently (as far as I can tell) there is no way to stop the run, edit settings, and resurrect the run. I also can't stop the run, edit settings, and start a new run with a setting like "Don't crawl pages already crawled in run #x". That leaves me with only two options: stop the run and start again (costly), or let it run with the unwanted term (costly as well). Adding an option to save all crawled URLs of an Actor in a central place, plus a setting "don't run those URLs again", would be a huge improvement in cases like this. Also, in cases where the Actor can't crawl a whole country at once (e.g. per city), you unavoidably crawl duplicate URLs (overlap between cities) in each crawl, which is costly in both $ and time; the function above would be a great improvement for those cases too...
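The requested "don't run those URLs again" setting amounts to keeping a central record of finished URLs and filtering each new run's requests against it. A minimal sketch, where a plain `Set` stands in for whatever persisted store the platform would actually use, and the example URLs are invented:

```javascript
// Central record of URLs finished in previous runs (e.g. run #x).
// In a real setup this would be loaded from a persisted store.
const crawledBefore = new Set([
  "https://example.com/a",
  "https://example.com/b",
]);

// Keep only URLs that no previous run has already crawled.
function filterNewRequests(urls) {
  return urls.filter((url) => !crawledBefore.has(url));
}

const nextRun = filterNewRequests([
  "https://example.com/a", // overlaps with the previous run, skipped
  "https://example.com/c", // unseen, kept
]);
console.log(nextRun);
```

The same filter handles the per-city overlap case: cities share URLs, but each URL is only ever crawled once across all runs.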

Having the ability to test new settings, e.g. jQuery settings set in the "Page function", without the need for a new run; testing would be based on the HTML of the previous run. If I make some jQuery changes, I need a full run to test them (I'm now well over 100 runs to test a bit of code), which takes a lot of time. Live testing would be great. Think JSFiddle, based on the source of the last run...

Hi, can you add a specific date? Like, I want to scrape in a group where I search, and after the search I want to scrape what comes up.

Would be great if the 'Page function' would have versioning.

Adding the ability, globally, to name "runs". E.g. if I run Google or Yelp, I might do a run for restaurants and a run for hotels. It would help administratively to see what runs I performed and to find the correct run again.