Crawlee & Apify

This is the official developer community of Apify and Crawlee.

Using LangChain ApifyWrapper

Hello! Is there a way to retrieve other elements from the webpage, such as the author and published date? When I add them in the metadata section, it fails to get them....

Effective TripAdvisor scraping

I have been given the task of gathering information about restaurants in a specific area. I am currently copying every link by hand for the places I want to pull information from with the "Tripadvisor Scraper" by Maximillian Copelli. Does anyone here have experience gathering emails, phone numbers, addresses, etc. from TA and can help me streamline this process? Thanks in advance! - Mats

Cannot run this public Actor with Creator plan

After upgrading to the Creator plan, I could not call the Actor apify/instagram-reel-scraper. I got this error: "You cannot run this public Actor. Your current plan does not support running public Actors." Can anyone tell me why and how to fix it?...

Can I get Google reviews only for a list of preselected businesses?

I'm utilizing Apify to extract negative reviews about cleanliness at specific hotels. Can I input my pre-established hotel list to retrieve reviews exclusively for those establishments?

Loading files along with HTML-scraped content via LangChain's ApifyDatasetLoader

The ApifyDatasetLoader for LangChain loads the records, which include the text, metadata, and fileUrl fields. All of the examples show loading content via the text or metadata fields — but what about fileUrl? Assuming the run has records for PDF, XLSX, and/or other files, is there an example of how to load those files alongside the scraped HTML content?

Apify Actor Not Loading (Day 2)

Hello, this is the second day my Actors are not loading for me. It just takes me to the "just a moment" page and never gets past it. This is for the clockworks Actor for the TikTok video scraper. Any help would be greatly appreciated as my work depends on it 😩 Note: since I cannot get the Actor to load, I cannot submit an "issue"...

We have disabled the system which sent notifications about finished runs

I've got some scrapers failing, but I learned about it only because my production had no data. Why? Because I relied on emails from Apify, and they didn't come. Why? Because you've disabled them without telling me 😀 C'mon. I'm fine with disabling an email feature you want to revamp, but then please email me about the change, because I don't spend my life on the notifications tab to learn about the change from a little notification box 🤦‍♂️ I don't consider the notifications perfect, but they are the only global way to get notified that actors failed to finish. The alternative is to set up an alert, but then I have to manually click through all ten of my actors to set it up (prone to errors), and, more importantly, there is no way to set up an alert for a failed actor. In the metric drop-down, there is no "exit status is 0" option or equivalent, as far as I can see....

Most Recent Tweets Scraper

Hello, over the weekend, web.harvester/twitter-scraper was deprecated, which is very disappointing since it did exactly what I needed: take in a list of handles and return all tweets (including images) from those handles for the past X days. I am trying to find a reliable alternative that can accomplish the same task. I was looking at quacker/twitter-scraper, but it cannot pull the most recent tweets, and both of the actors it suggests for that are broken or deprecated....

How do I get a run's input after run is complete?

I am using the Google Maps Scraper actor and providing a search query and location. I'm then using a webhook to let my next.js app know when the run is finished so it can grab the results. I'd like to also grab the original search query and location. How would I do this?
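A minimal sketch of one way to approach this, assuming the webhook uses the default payload (where `resource` holds the run object) and that the run's input is stored as the `INPUT` record in the run's default key-value store. The helper name `run_input_url` is hypothetical, and the actual request would still need your API token.

```python
from urllib.parse import quote

API_BASE = "https://api.apify.com/v2"


def run_input_url(webhook_payload: dict) -> str:
    """Build the URL of the INPUT record for the run described in a webhook payload.

    Assumption: the payload follows the default webhook template, so the run
    object (including defaultKeyValueStoreId) sits under the "resource" key.
    """
    store_id = webhook_payload["resource"]["defaultKeyValueStoreId"]
    # Assumption: the run input is saved under the key "INPUT" in that store.
    return f"{API_BASE}/key-value-stores/{quote(store_id, safe='')}/records/INPUT"


# Placeholder IDs for illustration only.
payload = {"resource": {"id": "RUN_ID", "defaultKeyValueStoreId": "STORE_ID"}}
print(run_input_url(payload))
```

Fetching that URL with an authenticated GET request should return the JSON input the run was started with, which would include the search query and location.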

Facebook Ads Details API

Hi, I am struggling with these details. How do I fetch them? I was not able to find anything about this on the Facebook Ads Library API page. Can I have the URL parameter for this? Thank you...

Actors for a specific GPT

Greetings, I want to create an OpenAI GPT to support my business marketing and SEO goals. I would like it to be able to read the content of my website and crawl internal and external links. I would also like it to answer questions about lead generation and marketing funnel building. Which actors should I use for this purpose? Thanks in advance

Passing the output of an actor to another and running it with the Python API

Greetings, I started reading the docs of a scraper tool I want to use from your site. I found how to use the Python API and get the results of actors, but the next most important thing for me is to take the output of one actor and give it as input to another task. Is that possible, and if yes, how? I can't find it in the docs or in the help sections here. Thanks
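A rough sketch of the chaining pattern, not an official recipe: run the first actor, turn its dataset items into an input object, and run the second actor with it. The names `chain`, `fake_run_actor`, the actor IDs, and the `startUrls` input field are all hypothetical; the real input shape depends on the second actor's input schema.

```python
from typing import Any, Callable, Dict, List

Items = List[Dict[str, Any]]
# Hypothetical signature: run an actor with an input and return its dataset items.
RunActor = Callable[[str, Dict[str, Any]], Items]


def chain(
    run_actor: RunActor,
    first_id: str,
    first_input: Dict[str, Any],
    second_id: str,
    build_input: Callable[[Items], Dict[str, Any]],
) -> Items:
    """Run the first actor, map its items to an input, then run the second actor."""
    first_items = run_actor(first_id, first_input)
    second_input = build_input(first_items)
    return run_actor(second_id, second_input)


# With the apify-client package, run_actor could look roughly like this
# (an assumption, shown as a comment so this block runs without the package):
#   run = client.actor(actor_id).call(run_input=actor_input)
#   return list(client.dataset(run["defaultDatasetId"]).iterate_items())


def fake_run_actor(actor_id: str, actor_input: Dict[str, Any]) -> Items:
    """Stand-in for real actor runs so the flow can be tried offline."""
    if actor_id == "example/search":
        return [{"url": "https://example.com/a"}, {"url": "https://example.com/b"}]
    return [{"scraped": entry["url"]} for entry in actor_input["startUrls"]]


result = chain(
    fake_run_actor,
    "example/search",
    {"query": "pizza"},
    "example/details",
    lambda items: {"startUrls": [{"url": item["url"]} for item in items]},
)
print(result)
```

Keeping the mapping step (`build_input`) separate makes it easy to adapt when the first actor's output fields do not match the second actor's input fields.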

Run Queue

Hi all, is there a way to put actor/task runs in a queue when there is no memory left? My use case is a schedule that may contain many tasks, so when the schedule triggers, it can run out of memory and some tasks fail.

Limit dataset items

Is there any way to limit the number of items in a dataset? Besides the obvious, something like a limit param.

Apify CLI question

Hello guys 🙂 I am trying to deploy my actor to Apify and I am getting the following error: "Error: File name cannot exceed 100 characters". May I ask what it means? 🫣🤔 Do I have a file whose name contains too many characters? I looked at my files and I don't see any with a name longer than 100 characters 🤔 Thank you for the answer 🫶🤗...

Daily WebScraper Help

My goal is to scrape ftsearch.auction daily for new listings. The scraper will have to set the auction end date to 1 day later than the current date, then scrape the available listings. Can someone help me figure out how to do this?
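The date part of this can be sketched on its own. This is only an illustration of computing "one day after today"; the helper name `target_end_date` is hypothetical, and how the date is entered on the site (and in which format) is not something the post specifies.

```python
from datetime import date, timedelta
from typing import Optional


def target_end_date(today: Optional[date] = None) -> date:
    """Return the auction end date to select: one day after the given date."""
    if today is None:
        today = date.today()  # default to the day the scraper runs
    return today + timedelta(days=1)


# Fixed date used only to make the example reproducible.
print(target_end_date(date(2024, 1, 31)).isoformat())  # 2024-02-01
```

A daily schedule could call this at the start of each run and pass the formatted result to whatever step fills in the end-date filter.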

Proxies

Am I able to utilise my own proxy when scraping instead of Apify's provided proxies? It appears one site I am attempting to scrape requires a captcha when accessing via Apify's proxies.

Subscribing to changelog using RSS?

I found https://apify.com/change-log and I'd like to subscribe using RSS. Is that possible? Reverse-engineering, I see the page makes requests to https://cms.apify.com/api/change-log-items?pagination[limit]=-1&populate=deep, but I'm not sure what cms.apify.com is and whether it's able to give me a good old RSS or Atom feed....

Suggestion: Notify me only when my actor fails

There are these two settings (see attachment) which I realized I didn't understand correctly at first. I thought I could uncheck reports about all my actor runs and get only failed actor runs by leaving the other checkbox checked. But it seems that "Actor Issues" is something completely different, probably related to the actor marketplace. To get notified about failed actors, I either have to set up monitoring on all of them manually (e.g. alert every time there are 0 items in the dataset), or I have to check the top checkbox and get notified about all runs, even the successful ones (which is noisy if I have a daily schedule). So I suggest there could be a checkbox which notifies me only when the actors fail, while not sending anything when the actors are successful....

Not getting right content when crawling Levis website

Using Apify Website Content Crawler, I crawled https://www.levi.com/US/en_US/clothing/men/jeans/straight/501-original-fit-mens-jeans/p/005010115. Instead of getting the product details, I am getting only the following content, which does not make any sense. Can someone please help me understand what has gone wrong? The input params JSON file is attached. Returned content: Installments by 4 interest-free payments due every 2 weeks when you select Afterpay at checkout Select Afterpay as your payment method at checkout Available on orders $35 - $1,000. All you need to apply is your debit or credit card. Complete your checkout No long forms and you'll receive an instant approval decision. Pay over 4 equal payments Enjoy your purchase right away! Pay every two weeks with zero interest and no fees when you pay on time. Afterpay not available on orders with Gift Cards. You must be over 18, a resident of the U.S. and meet additional eligibility criteria to qualify. Estimated payment amounts shown on product pages exclude taxes and shipping charges, which are added at checkout. Late Fees apply. Click here Terms & Conditions. ©2020 Afterpay...