Crawlee & Apify · 12mo ago
national-gold

Website Content Crawler vs Web Scraper

I noticed Apify offers a Website Content Crawler and 3 types of scrapers (Web Scraper, Cheerio, Playwright). What's the difference between the Website Content Crawler and these older scrapers? They both seem to crawl and scrape.
sensitive-blue · 12mo ago
Hey. You can find all the info you need here: https://docs.apify.com/academy/apify-scrapers
national-gold (OP) · 12mo ago
This seems to compare the 3 scrapers, but it doesn't compare Apify's "Web Scraper" to the newer "Website Content Crawler".
sensitive-blue · 12mo ago
What specifically are you asking about? You can refer to the README section for both actors.

Website Content Crawler: This actor is designed to extract data for feeding, fine-tuning, or training large language models (LLMs) like GPT-4, ChatGPT, or LLaMA.

Web Scraper: The Web Scraper is a versatile and easy-to-use actor for crawling web pages and extracting structured data using just a few lines of JavaScript. It loads web pages in a Chromium browser to render dynamic content. You can configure and run it manually via the user interface or programmatically using the API. The extracted data is stored in a dataset, which can be exported in formats such as JSON, XML, or CSV.

Choose the one that best fits your purpose. If you don't plan to feed the extracted data into AI or LLM workflows, Web Scraper is the way to go.
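To make the difference concrete, here is a rough sketch of what you would hand to each actor. Treat it as an illustration, not the official input schema: the field names (`startUrls`, `pageFunction`, `maxCrawlPages`, `saveMarkdown`) reflect my understanding of the two READMEs, and the helpers `buildWebScraperInput` and `buildContentCrawlerInput` are hypothetical names made up for this example. The point is that Web Scraper expects you to write the extraction logic yourself, while Website Content Crawler only takes crawl settings and returns cleaned page text.

```javascript
// Hypothetical sketch. Field names are assumptions based on each actor's
// README; verify them against the current input schema before use.

// Web Scraper: you write the extraction logic as a page function.
// It runs inside the browser for every loaded page and returns one record.
async function pageFunction(context) {
    // `context.request.url` is the URL of the page being processed.
    // `document` is the page DOM, available because the code runs in Chromium.
    const heading = document.querySelector('h1');
    return {
        url: context.request.url,
        title: document.title,
        heading: heading ? heading.textContent.trim() : null,
    };
}

// Web Scraper input: start URLs plus the page function passed as source text.
function buildWebScraperInput(startUrl) {
    return {
        startUrls: [{ url: startUrl }],
        pageFunction: pageFunction.toString(),
    };
}

// Website Content Crawler input: no extraction code, only crawl settings.
// The actor decides what the main content is and outputs text or Markdown.
function buildContentCrawlerInput(startUrl) {
    return {
        startUrls: [{ url: startUrl }],
        maxCrawlPages: 10, // assumed name for the page limit
        saveMarkdown: true, // assumed flag for Markdown output
    };
}
```

Either object could then be pasted into the actor's input in the UI or sent through the API. If you need specific fields (prices, titles, links), the first shape is the one you want; if you just need readable page text for an LLM, the second is less work.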
