Crawlee & Apify


This is the official developer community of Apify and Crawlee.


fascinating-indigo · 9/26/2023

Get tweets by date

Hi. Is it possible to get the most recent tweets of a Twitter user, preferably since a given datetime (e.g. since yesterday)?
flat-fuchsia · 9/25/2023

HELP! I need to crawl a website with live comments in a sidebar

At first I thought the site used something like an API, but on closer inspection it uses WebSockets (socket.io). How should I approach this problem?
fascinating-indigo · 9/23/2023

Crawlee or Apify tooling?

I'd like to get started with Apify, but down the line I may end up using my own infrastructure. How hard is it to switch from Apify's tooling (CLI etc.) to Crawlee? I understand Apify is built on top of Crawlee, so I imagine it's possible, but how much friction/lock-in is there...

Google Drive integration question

Hi, I need to upload a CSV file to Google Drive. I see there is a new integration tab on Apify now, so I have some questions. Sadly, there are a lot of settings but no documentation. Is it possible to upload a file from the key-value store via this integration? I have managed to convert the CSV file to JSON, push it to the dataset, and then push it to Drive via the integration, but there are some issues. 1. I do not know how to preserve the filename. 2. Converting CSV to JSON changes the data a little....
equal-aqua · 9/21/2023

Webhooks - Website Content Crawler

Hey, I'm using the Website Content Crawler actor to crawl a website. I want to show crawl status updates on the client side, such as which pages it has found. I've looked into integrations/webhooks for this, but I can only get ACTOR.RUN.CREATED and ACTOR.RUN.SUCCEEDED. Is there any other way I can get the status updates? Thanks
fair-rose · 9/21/2023

ERR_CERT_AUTHORITY_INVALID Issue with Apify Proxy

Hello, I'm using an Apify Proxy in combination with Playwright and ran into an ERR_CERT_AUTHORITY_INVALID error when trying to visit certain sites. Playwright code (in node): ```javascript const proxyRouter = ProxyRouter({...
stormy-gold · 9/21/2023

Error when building due to better-sqlite3

My program runs locally, but when I run apify push I get the following error. Any help would be greatly appreciated!
generous-apricot · 9/20/2023

apify/actor-node-puppeteer-chrome:18 builder problem

Hi. A slightly noobish question: how do I add WORKDIR to the Dockerfile to make it work? I want to use volumes in docker-compose, but can't achieve that. Here is my Dockerfile: ```docker Specify the base Docker image. You can read more about the available images at https://crawlee.dev/docs/guides/docker-images...
absent-sapphire · 9/20/2023

Google Maps Email Extractor Actor gets frozen

Hi everyone, I'm currently using the actor 📩📍 Google Maps Email Extractor, and the detail scraper that it launches stays blocked at 19 results... the logs froze too. I've already tried to reboot and restart the actor from the beginning, but the actor froze again (on the 35th result this time, not the same one). Any ideas? Thx...
sensitive-blue · 9/18/2023

Easy Twitter Search Scraper duplicates output

Hi, I am new to Apify and tried Twitter Search Scraper. When I analyzed the output, I found multiple duplicates for one keyword. Is there something wrong?
afraid-scarlet · 9/18/2023

Is there a way to add a header to the ad-hoc webhook?

One might suspect something like this should be possible: `await Actor.addWebhook({ eventTypes: ['ACTOR.RUN.SUCCEEDED'], requestUrl: process.env.RUN_SUCCEEDED_WEBHOOK_URL,...
afraid-scarlet · 9/18/2023

Get the dataset from the apify-js API client based on the actorRunId?

Is it possible? I can only find actorClient.lastRun(). This seems like an extremely obvious use case. I did find it here https://docs.apify.com/api/v2#/reference/actor-runs/run-collection/get-dataset but that seems to be for URLs only.
flat-fuchsia · 9/17/2023

I can't deploy code as written in the Apify docs

I want to learn how to deploy local code to the Apify platform, so I followed the tutorial from here https://docs.apify.com/academy/deploying-your-code. Here is my published actor https://apify.com/encouraged_keyboard/firstpush...
ratty-blush · 9/16/2023

Why only 200 results from Google Search scrapers...?

I tried 2 different actors for scraping Google search results (Google Search Results Scraper and Fast Google Search Scraper). Both of them stopped at 200 results, declaring "success", even though I specified a much higher limit. What am I missing?
flat-fuchsia · 9/16/2023

Is there a way to filter/sort questions by number of followers in the Quora Scrapper?

Is there a way to filter/sort questions by number of followers in the Quora Scrapper?
like-gold · 9/15/2023

How to set the memory using apify-client?

How can I set the memory of an actor when using the apify-client JS client? I didn't see any documentation on this. Is it possible?
frail-apricot · 9/14/2023

How to parse posts in a group with commenters and reactors?

I need to analyze all posts from a single member of one group and identify the individuals who commented on and liked their posts. Group Parser gives me just the top comments and only the number of likes.
rare-sapphire · 9/14/2023

Webhook security

Hello, I am interested in setting a webhook to receive events when a particular actor run has succeeded. However, I want to be able to secure my endpoint to ensure that only Apify can invoke it. I found an earlier post from June suggesting that this isn't possible but the Apify documentation suggests otherwise: https://discord.com/channels/801163717915574323/1115873908046966864 ...