Crawlee & Apify


This is the official developer community of Apify and Crawlee.



Expand clickable elements setting - Website Content Crawler

Hi there, I'm trying to scrape this website, https://www.msci.com/research-and-insights/. There's a "load more" button that I want to click so that the crawler extracts all the content. I have tried this setting in different ways, but without success. The CSS selector for that element would be #research-items-load-more a. I tried setting values like ["#research-items-load-more a"=\"true\"] or just ['#research-items-load-more a'], but the run eventually fails. Would appreciate quick help here....

IG Apify actor & Google Sheet integration

Hi! I’m looking to build an integration between Instagram Profile Scraper and Google Sheets. I’d like to make sure the results are appended to the Google Sheet every time the scraper runs (on a daily basis). How can I easily do that? Thanks!...

Can't create a new actor version and the previous ones disappeared

Here's what happened, in order:
- I shared my actor with my secondary account with permissions to run, build, write and read;
- On my secondary account, I tried creating a new version of my actor;
- All the previous ones disappeared;...

Push to GitHub repo from Apify IDE?

Is it possible to publish/sync to a GitHub repo from the Apify web IDE? So far, I have worked primarily in the IDE and not with the SDK locally. I want to take an initial version backup to GitHub. I do not yet have a repo, but could make an empty one....

Can't subscribe to Creator Plan

I'm trying to subscribe to Creator Plan but when I click on 'Subscribe' on https://apify.com/pricing/creator-plan, it redirects to console but nothing happens. I guess this is a bug?

Issue accessing all datasets scraped on Python API.

I'm using the Python API to scrape data. I have already scraped 5+ datasets, which I'm trying to download. I want to first get a list of all these datasets, and then use their IDs to download them with an async query. The issue is that I can't get a sensible list of these 5+ datasets when I query with the dataset_collection_client.list() function. I keep getting output of this form: <apify_shared.models.ListPage object at 0xXXXXcf40>...
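For context on the output above: in the Apify Python client, `list()` on a collection client returns a `ListPage` wrapper rather than a plain list, and the records themselves sit on its `items` attribute. A minimal sketch, assuming the `apify-client` package is installed; `dataset_ids` and `main` are hypothetical helper names, and `MY_APIFY_TOKEN` is a placeholder, not a real token:

```python
def dataset_ids(list_page):
    """Return the ID of every dataset on one ListPage-like object."""
    # ListPage keeps the actual records in a plain list on `.items`.
    return [dataset["id"] for dataset in list_page.items]


def main():
    # Imported here so the helper above works without the package installed.
    from apify_client import ApifyClient

    client = ApifyClient("MY_APIFY_TOKEN")  # placeholder token (assumption)
    page = client.datasets().list()
    print(f"{page.count} of {page.total} datasets on this page")
    for dataset_id in dataset_ids(page):
        rows = client.dataset(dataset_id).list_items().items
        print(dataset_id, len(rows))
```

Calling `main()` with a valid token should print one line per dataset. The async client (`ApifyClientAsync`) follows the same shape with the calls awaited, so the `.items` access should carry over.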

data from google play

Hey everyone, I'm not too tech-savvy and would like to ask for help setting up an actor to scrape the data from google play. I've got a bunch of developers' website URLs, and I'm trying to scrape data about their apps from Google Play using this actor epctex/google-play-scraper. Got a couple of hurdles to tackle: 1- Any way to convert these developers' website URLs into their Google Play developer URLs? ...

Actor runs do not store output in dataset on Apify

I have an actor that works perfectly locally, storing output in JSON files in the default dataset, but when I run it on the Apify platform, the run shows no results even though the run was successful and results were found. I am attempting to store the results with: ```ts await Dataset.pushData(result)...

Build duration shows incorrectly

Build duration for all actors is showing an incorrect value in the Apify dashboard. Based on time elapsed, looks like it's using a start time of 0 or null and thus is defaulting to epoch 0 (1st Jan 1970).
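To illustrate the suspected cause with arithmetic: if a missing start time is treated as 0, it resolves to the Unix epoch (1970-01-01 UTC), and any duration computed from it spans decades. A small sketch, assuming timestamps in seconds; the function name and the sample finish time are made up for illustration and say nothing about how the dashboard is actually implemented:

```python
from datetime import datetime, timezone


def build_duration_seconds(started_at, finished_at):
    """Duration in seconds, reproducing the suspected bug when start is missing."""
    # A missing start time falls back to 0, i.e. the Unix epoch.
    start = started_at or 0
    return finished_at - start


# Hypothetical finish time: 2024-01-01 00:00:00 UTC.
finished = datetime(2024, 1, 1, tzinfo=timezone.utc).timestamp()

normal = build_duration_seconds(finished - 90, finished)  # start known
broken = build_duration_seconds(None, finished)           # start missing

print(normal)                       # duration of a 90-second build
print(broken / (365.25 * 86400))    # the same build, measured in years since 1970
```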

Instagram and Facebook Hashtag scraper

I attempted to scrape Instagram and Facebook for the hashtag #stopthesteal for a research project and got no results, which I find strange, as I know there have been past campaigns on the same social media platforms using that hashtag. Would appreciate any insights into solving this problem. Thanks in advance!

structure for broad shallow scrape

Hi, I have a question about how to structure my solution to this problem so that it takes best advantage of Apify infrastructure and services. Once per day, I need to scrape a dataset from 500 distinct domains, but only 1-2 pages from each. The selectors for the items are different for almost all sites. The two extremes are 500 separate actors and one actor with a hashmap of domains to selectors for that domain. I want to be able to track when a domain has broken. What’s the best way to structure th...
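The "one actor with a hashmap" extreme mentioned above could be sketched roughly as follows. This is only an illustration under stated assumptions: the domains, the selectors, and the names `SELECTORS`, `DomainResult`, `scrape_all` and `broken_domains` are all hypothetical, and `extract` stands in for whatever fetching and parsing code is actually used. A domain is flagged as broken when extraction raises or returns no items.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical per-domain configuration: one CSS selector per site.
SELECTORS = {
    "example.com": "div.item",
    "example.org": "li.product",
}


@dataclass
class DomainResult:
    domain: str
    item_count: int
    error: Optional[str] = None

    @property
    def broken(self) -> bool:
        # A domain counts as broken if it raised or returned nothing.
        return self.error is not None or self.item_count == 0


def scrape_all(selectors, extract):
    """Run `extract(domain, selector)` for every domain and record the outcome."""
    results = []
    for domain, selector in selectors.items():
        try:
            items = extract(domain, selector)
            results.append(DomainResult(domain, len(items)))
        except Exception as exc:  # keep going so one bad site does not stop the run
            results.append(DomainResult(domain, 0, str(exc)))
    return results


def broken_domains(results):
    """Names of the domains whose scrape failed or came back empty."""
    return [r.domain for r in results if r.broken]
```

Storing the per-domain results from each daily run would make it possible to see when a given domain started failing, without needing 500 separate actors.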

LinkedIn Message

Hey guys! Quick question: is it possible to use Apify to send LinkedIn messages? I see such a function in PhantomBuster and want to do the same with Apify! Thanks for the help!...

Google maps scraper

I'm hoping that someone can help me figure out what I'm doing wrong. I'm trying to scrape the directories (i.e. the shops, restaurants etc.) of shopping centres, but I'm not getting the anticipated results. I've configured the scraper:...

Can't set actor environment variables

When setting an environment variable from a forked build, I received the following errors:
Error: Environment variable could not be saved (Concurrent update has been detected (object ID: N3tfo0ZvLIcrOSSAA))
Error: Environment variable could not be saved (Concurrent update has been detected (object ID: N3tfo0ZvLIcrOSSAA))
...

Default Settings for Tweet Flash

I’m currently using tweet-flash and was wondering what the default parameters are for scraping. Does the scraper have a default geographic or language focus? Thank you for your assistance. Happy Holidays!...

Historical API results for an actor?

Hello. I was wondering if there is any way to get historical results, and to see the input we used to perform a run of a specific actor. I want to see things like the entire JSON object used as input for the run, and the list of objects returned from the run. I know of a way to get basic info about historical actor runs from this endpoint, https://api.apify.com/v2/actor-runs, but I'm not sure how to get more specific data, apart from going to the website and clicking on past runs. Thanks....
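One possible approach with the Apify Python client, sketched under assumptions: each run in the list carries the IDs of its default key-value store and default dataset, and the run input is stored in that key-value store under the default `INPUT` key. `run_history` is a hypothetical helper name, and `client` is expected to be an `ApifyClient` instance.

```python
def run_history(client, actor_id, limit=10):
    """Collect the input and output items of an Actor's most recent runs."""
    history = []
    runs = client.actor(actor_id).runs().list(limit=limit, desc=True)
    for run in runs.items:
        # Assumption: the input lives under the default "INPUT" key of the
        # run's default key-value store.
        store = client.key_value_store(run["defaultKeyValueStoreId"])
        record = store.get_record("INPUT")
        items = client.dataset(run["defaultDatasetId"]).list_items().items
        history.append({
            "run_id": run["id"],
            "status": run["status"],
            "input": record["value"] if record else None,
            "items": items,
        })
    return history
```

Calling it would look like `run_history(ApifyClient("MY_APIFY_TOKEN"), "username/actor-name")`, where both strings are placeholders. Large datasets may need paging through `list_items` with `offset` and `limit` rather than a single call.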

instagram scraping

Hi, I am trying to scrape Instagram based on hashtag search. Does anyone have any experience scraping Instagram for similar purposes? What would be some dos/recommendations? #instagram...

Efficiency / Cost management of multiple smaller actors vs. one large actor

I'm using the clockworks TikTok scraper and had a dilemma over whether to use 8 actors with 4 GB RAM each, spreading the pages to scrape evenly across the actors, or 1 large 32 GB actor for all the pages. From trying out the scrape, it seems far cheaper (though not much faster) to spread the jobs across 8 actors (we're looking at $10-12 vs. $3-4). I was advised by the team to try one large actor since it should scale okay, but my findings seem to be different. Am I doing something wrong here, or is it just that this scrape works better when I batch the jobs....
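As background for comparing the two setups: Apify measures platform usage in compute units, where one compute unit corresponds to 1 GB of memory used for one hour. A rough sketch of that arithmetic, assuming the actor is billed by platform usage rather than per result; the function name and the run durations are hypothetical and are not taken from the post:

```python
def compute_units(memory_gb, hours, runs=1):
    """Compute units = memory (GB) x duration (hours), summed over parallel runs."""
    return memory_gb * hours * runs


# Hypothetical durations, for illustration only.
split = compute_units(memory_gb=4, hours=1.0, runs=8)   # 8 runs x 4 GB each
single = compute_units(memory_gb=32, hours=1.0)         # 1 run x 32 GB

print(split, single)  # equal memory-hours give equal compute units
```

Under this model both setups reserve the same 32 GB in total, so a cost gap would come from a difference in total run time (or from charges other than compute units), not from the split itself.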

Cannot read private member #targetManager

Hi guys, I am using Crawlee's launchPuppeteer method, and when I try to use the methods browser.waitForTarget or browser.pages, I get the error: Cannot read private member #targetManager. I want to have access to other tabs in order to authenticate on the website. How can I solve this?

woocommerce

How do I connect Apify to WooCommerce products? #apify-platform