Crawlee & Apify

This is the official developer community of Apify and Crawlee.

metropolitan-bronze · 11/9/2023

I am not able to run this particular Actor with Actor ID: tHbtPTjFCTukBZvcH

The following message is shown. Please help me with this.
noble-gold · 11/8/2023

Getting past Google's sign-in page

I'm writing a script for scraping part of Google Maps. I've got Chromium to open, go to Google Maps, and click the cookie consent button, but I'm not sure how to get past the Google sign-in. What's the secret for this? Thanks....
ratty-blush · 11/8/2023

Failure to run

I've encountered the following issue. Any comments on why that is?...
xenial-black · 11/7/2023

Telegram scraper and adder

Why is the Telegram scraper and adder showing the error "FLOOD ERROR: sleeping for 42130seconds"? If the whole free trial period is going to be wasted on errors, how would I review and purchase the premium? Please help me.
quickest-silver · 11/7/2023

Error: Error: Cannot find module '/home/myuser/dist/main.js'

I am getting the error: Error:
Cannot find module '/home/myuser/dist/main.js'
when trying to run a successful build of my Actor on Apify, but locally it works without errors. ...
xenial-black · 11/5/2023

Facebook Post Checker

I tried to use Facebook Post Checker to retrieve information from https://www.facebook.com/profile.php?id=100078638665862 (a Facebook public page), World Military _ ანალიტიკა, from 24.02.2022 to 30.08.2023. However, when I entered the name and set the max and min dates, no results were returned. Could the problem be the name of the Facebook public page?
equal-aqua · 11/4/2023

Actor Source code (contact scraper)

Hi guys, does anyone happen to know if it's possible to get hold of the source code for a popular contact details scraper? (Actor: https://apify.com/vdrmota/contact-info-scraper) It appears to have a GitHub link to the source code, but the link doesn't appear to be valid anymore. ...
fair-rose · 11/3/2023

Getting the metadata of a run, specifically Usage and Duration, with the js apify client or api?

I want to do this so that I can more easily choose the best way for me to run an actor (compare runs with different memory sizes for example) or even compare different actors that solve the same issues. I've been looking at the apify js client documentation & at the apify api documentation and didn't find anything, but maybe I just missed it?...
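
A minimal sketch of one way to compare runs, written in Python against the HTTP API rather than the JS client. It assumes the run object returned by `GET /v2/actor-runs/{runId}` carries `stats.durationMillis`, `stats.computeUnits`, `usageTotalUsd` and `options.memoryMbytes`; those field names are assumptions here and should be checked against the API reference. `fetch_run` and `summarize_run` are hypothetical helper names.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def fetch_run(run_id: str, token: str) -> dict:
    """Fetch one run object from the Apify API (performs a network call)."""
    request = urllib.request.Request(
        f"{API_BASE}/actor-runs/{run_id}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        # The API wraps the run object in a top-level "data" key (assumed).
        return json.load(response)["data"]


def summarize_run(run: dict) -> dict:
    """Pick out duration and usage fields; anything missing becomes None."""
    stats = run.get("stats") or {}
    options = run.get("options") or {}
    duration_ms = stats.get("durationMillis")
    return {
        "duration_secs": None if duration_ms is None else duration_ms / 1000,
        "compute_units": stats.get("computeUnits"),
        "usage_total_usd": run.get("usageTotalUsd"),
        "memory_mbytes": options.get("memoryMbytes"),
    }
```

Calling `summarize_run(fetch_run(run_id, token))` for several runs with different memory settings would give rows that can be compared side by side.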
extended-yellow · 11/2/2023

Scrape TikTok bio AND location of user

I need to input motorcycling as a hashtag, then scrape all users from that hashtag, but I NEED their bio AND location. Right now I have to use 2 different scrapers to get the hashtag profiles THEN the location of the user. Please help!...
extended-yellow · 11/2/2023

TikTok

I need to input motorcycling as a hashtag, then scrape all users from that hashtag, but I NEED their bio AND location. Right now I have to use 2 different scrapers to get the hashtag profiles THEN the location of the user. Please help!...
national-gold · 11/2/2023

Apify TikTok Scraper - Help

Hello, I used the TikTok comment scraper and got about 500 comments from a video with 77000 comments. Which comments are being scraped? In which order are the comments listed? Can I choose myself which comments should be scraped?...
optimistic-gold · 11/1/2023

Multiple users with Tweet Flash

Hello, I'm trying to use Tweet Flash to collect tweets from multiple profiles. However, I keep running into an issue where it only returns the first person's tweets. Sometimes it is able to pull multiple profiles, depending on the parameters, but other times it just doesn't work. For example, with max tweets set to 250 it wasn't working, but with 100 it was. However, with max tweets at 100 and a specific time frame, it wasn't working. Has anyone run into this before? I tried both on the Apify website and throu...
equal-aqua · 10/31/2023

Best Plan for Google Maps API

Hello, I am working on a project where I am looking for all the self-storage facilities in the US that are listed with Google. After some testing using 4 GB of RAM per job, each town takes about 7 min to run. At this rate, it would take months of continuous running to cover the US. Should I just get the Starter package and run it with 4-6 times the RAM? Or should I figure out a way to run jobs in parallel, 2, 3, or 4 at a time?
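
A rough way to reason about the parallel option is to estimate wall-clock time as a function of how many runs execute at once. The sketch below assumes each town still takes the same time regardless of how many runs are active, which may not hold in practice; the town count is a placeholder the reader must supply, and `estimated_days` is a hypothetical helper name.

```python
import math


def estimated_days(num_towns: int, minutes_per_town: float, parallel_runs: int) -> float:
    """Wall-clock days to cover all towns when runs execute in parallel batches.

    Assumes every town takes `minutes_per_town` no matter how many runs
    are active at the same time.
    """
    if parallel_runs < 1:
        raise ValueError("parallel_runs must be at least 1")
    # Each batch processes up to `parallel_runs` towns simultaneously.
    batches = math.ceil(num_towns / parallel_runs)
    return batches * minutes_per_town / (24 * 60)
```

Under that assumption, going from 1 to 4 parallel runs cuts the estimate to roughly a quarter, so the comparison with "4-6 times the RAM" comes down to whether extra memory speeds up a single town by a similar factor.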
xenial-black · 10/29/2023

Help me!

I am trying to send a key event to a wxPython dialog to type into the focused field, but I can't. Please help me...
like-gold · 10/28/2023

Google Search Result Scraper Output Order/Sorting

Guys, can anyone help me with the Google Search Result Scraper Actor? For example, I input these keywords to scrape: Keyword 1 Keyword 2...
variable-lime · 10/27/2023

SharePoint pages crawlable?

Hello, is it possible to crawl SharePoint pages that lie behind an auth layer? Is this generally possible, and does anyone have experience with it? Thank you.
sunny-green · 10/25/2023

Scrapy reactor bug

Hello, I wrote a script locally in VS Code and ran it with apify run, and it worked fine. Then I pushed it and tried to run it on the Apify platform, and it returns this error. I installed the reactor as the template asked:
install_reactor('twisted.internet.asyncioreactor.AsyncioSelectorReactor')
and defined it in the settings as well:
settings = get_project_settings()
settings['TWISTED_REACTOR'] = 'twisted.internet.asyncioreactor.AsyncioSelectorReactor'
...
rare-sapphire · 10/24/2023

Help regarding request_queue

I am building a website scraper for my users. I want to support scraping up to x child URLs, starting from the startUrl. In some cases, I am seeing duplicate links being scraped. And in some cases, the number of URLs identified runs into the thousands. I want to control the enqueuing of URLs into the request_queue, to avoid unnecessary costs and duplicate scraping of URLs. Here is my enqueue function: ```javascript const enqueued = await enqueueLinks({...
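
The two controls asked about here, deduplication and a cap on the total number of URLs, can be sketched independently of the crawler. The example below is plain Python for illustration, not the poster's JavaScript and not Crawlee's own implementation; `normalize_url` and `select_links` are hypothetical helpers. In Crawlee itself, the `limit` option of `enqueueLinks` and the crawler's `maxRequestsPerCrawl` option are worth checking in the documentation for the same purpose.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Return a canonical form so trivially different links compare equal."""
    parts = urlsplit(url)
    # Treat "/a" and "/a/" as the same page; an empty path becomes "/".
    path = parts.path.rstrip("/") or "/"
    # Lower-case scheme and host, keep the query string, drop the fragment.
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def select_links(candidates, seen, max_total):
    """Pick unseen links without letting `seen` grow beyond `max_total`.

    `seen` is a set of normalized URLs already enqueued; it is updated in place.
    """
    selected = []
    for url in candidates:
        key = normalize_url(url)
        if key in seen:
            continue  # duplicate of something already enqueued
        if len(seen) >= max_total:
            break  # budget exhausted, stop enqueuing
        seen.add(key)
        selected.append(url)
    return selected
```

Only the URLs returned by `select_links` would then be passed on to the request queue, with the same `seen` set shared across all pages of one crawl.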
dependent-tan · 10/23/2023

Apify CLI hangs

Hi, I'm using Debian 12 with GNOME, and on every apify run the client hangs: it does its job and gets to the end, but the process itself doesn't finish. Any ideas on this or how to fix it? Thanks!
adverse-sapphire · 10/22/2023

Private web scraping?

Hello Friends! I'm new to Apify and pretty excited about what I've learned so far. One use case I'm not sure of: Can Apify be used to scrape a website that's not on the public internet? Specifically, I want to scrape knowledge bases inside corporations (with their permission). Is there, for example, some sort of proxy that could be put in place inside the private network that connects with Apify and then scrapes at Apify's direction? Or something similar? (I've cross-posted this in the Crawlee help forum; hopef...